Re: [Xenomai-core] Pending patches

2009-01-27 Thread Philippe Gerum
Jan Kiszka wrote:
> Philippe Gerum wrote:
>> We should indeed postpone this just in case the upper layer indexes the extra
>> state on the minor value. We can also simplify a few things doing so.
>>
>> --- ksrc/nucleus/pipe.c  (revision 4565)
>> +++ ksrc/nucleus/pipe.c  (working copy)
>> @@ -77,11 +77,9 @@
>>
>>  static inline void xnpipe_minor_free(int minor)
>>  {
>> -if (minor < 0 || minor >= XNPIPE_NDEVS)
>> -return;
>> -
>> -__clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>> -  1UL << (minor % BITS_PER_LONG));
>> +/* May be called with nklock free. */
>> +clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>> +1UL << (minor % BITS_PER_LONG));
> 
> Bad news: This doesn't fly as is. All modifying operations on
> xnpipe_bitmap must be atomic and xnpipe_bitmap has to be
> xnarch_atomic_t. But then find_first_zero_bit breaks. Is there some
> version for atomic arrays? I guess we have to open-code this, at least
> down to word-level...
>

You mean the op on the whole bitmap, right? Yes, we have a pb.

> Jan
> 


-- 
Philippe.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-27 Thread Gilles Chanteperdrix
Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> We should indeed postpone this just in case the upper layer indexes the 
>>> extra
>>> state on the minor value. We can also simplify a few things doing so.
>>>
>>> --- ksrc/nucleus/pipe.c (revision 4565)
>>> +++ ksrc/nucleus/pipe.c (working copy)
>>> @@ -77,11 +77,9 @@
>>>
>>>  static inline void xnpipe_minor_free(int minor)
>>>  {
>>> -   if (minor < 0 || minor >= XNPIPE_NDEVS)
>>> -   return;
>>> -
>>> -   __clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>>> - 1UL << (minor % BITS_PER_LONG));
>>> +   /* May be called with nklock free. */
>>> +   clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>>> +   1UL << (minor % BITS_PER_LONG));
>> Bad news: This doesn't fly as is. All modifying operations on
>> xnpipe_bitmap must be atomic and xnpipe_bitmap has to be
>> xnarch_atomic_t. But then find_first_zero_bit breaks. Is there some
>> version for atomic arrays? I guess we have to open-code this, at least
>> down to word-level...
>>
> 
> #define clrbits(flags,mask)  xnarch_atomic_clear_mask(&(flags),mask)
> 
> which is either translated to atomic_clear_mask() on x86 which works on a long
> value, or to any Xenomai-local implementation that work the same way for other
> archs. IOW, clrbits() shall be an atomic op by essence in our implementation.
> 
> The thing is that Linux atomic_set/clear_mask() on x86_64 is broken
> (s,andl,andq), so we may have to provide ours.

well, it may be a feature. Maybe they only need atomic ints.

-- 
 Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-27 Thread Philippe Gerum
Jan Kiszka wrote:
> Philippe Gerum wrote:
>> We should indeed postpone this just in case the upper layer indexes the extra
>> state on the minor value. We can also simplify a few things doing so.
>>
>> --- ksrc/nucleus/pipe.c  (revision 4565)
>> +++ ksrc/nucleus/pipe.c  (working copy)
>> @@ -77,11 +77,9 @@
>>
>>  static inline void xnpipe_minor_free(int minor)
>>  {
>> -if (minor < 0 || minor >= XNPIPE_NDEVS)
>> -return;
>> -
>> -__clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>> -  1UL << (minor % BITS_PER_LONG));
>> +/* May be called with nklock free. */
>> +clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>> +1UL << (minor % BITS_PER_LONG));
> 
> Bad news: This doesn't fly as is. All modifying operations on
> xnpipe_bitmap must be atomic and xnpipe_bitmap has to be
> xnarch_atomic_t. But then find_first_zero_bit breaks. Is there some
> version for atomic arrays? I guess we have to open-code this, at least
> down to word-level...
> 

#define clrbits(flags,mask)  xnarch_atomic_clear_mask(&(flags),mask)

which is either translated to atomic_clear_mask() on x86 which works on a long
value, or to any Xenomai-local implementation that work the same way for other
archs. IOW, clrbits() shall be an atomic op by essence in our implementation.

The thing is that Linux atomic_set/clear_mask() on x86_64 is broken
(s,andl,andq), so we may have to provide ours.

> Jan
> 


-- 
Philippe.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-27 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> We should indeed postpone this just in case the upper layer indexes the 
>>> extra
>>> state on the minor value. We can also simplify a few things doing so.
>>>
>>> --- ksrc/nucleus/pipe.c (revision 4565)
>>> +++ ksrc/nucleus/pipe.c (working copy)
>>> @@ -77,11 +77,9 @@
>>>
>>>  static inline void xnpipe_minor_free(int minor)
>>>  {
>>> -   if (minor < 0 || minor >= XNPIPE_NDEVS)
>>> -   return;
>>> -
>>> -   __clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>>> - 1UL << (minor % BITS_PER_LONG));
>>> +   /* May be called with nklock free. */
>>> +   clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>>> +   1UL << (minor % BITS_PER_LONG));
>> Bad news: This doesn't fly as is. All modifying operations on
>> xnpipe_bitmap must be atomic and xnpipe_bitmap has to be
>> xnarch_atomic_t. But then find_first_zero_bit breaks. Is there some
>> version for atomic arrays? I guess we have to open-code this, at least
>> down to word-level...
> 
> Ok, xnpipe_bitmap can remain ulong but
> a) xnpipe_minor_alloc must use setbits and
> b) for some reasons clrbits in xnpipe_minor_free does not build on my 64
> bit host. Maybe compiler issue. Investigating...

Yes, clrbits is atomic_clear_mask which does an andl, so, does not work
for 64 bits arguments.

-- 
 Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-27 Thread Jan Kiszka
Jan Kiszka wrote:
> Philippe Gerum wrote:
>> We should indeed postpone this just in case the upper layer indexes the extra
>> state on the minor value. We can also simplify a few things doing so.
>>
>> --- ksrc/nucleus/pipe.c  (revision 4565)
>> +++ ksrc/nucleus/pipe.c  (working copy)
>> @@ -77,11 +77,9 @@
>>
>>  static inline void xnpipe_minor_free(int minor)
>>  {
>> -if (minor < 0 || minor >= XNPIPE_NDEVS)
>> -return;
>> -
>> -__clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>> -  1UL << (minor % BITS_PER_LONG));
>> +/* May be called with nklock free. */
>> +clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>> +1UL << (minor % BITS_PER_LONG));
> 
> Bad news: This doesn't fly as is. All modifying operations on
> xnpipe_bitmap must be atomic and xnpipe_bitmap has to be
> xnarch_atomic_t. But then find_first_zero_bit breaks. Is there some
> version for atomic arrays? I guess we have to open-code this, at least
> down to word-level...

Ok, xnpipe_bitmap can remain ulong but
a) xnpipe_minor_alloc must use setbits and
b) for some reasons clrbits in xnpipe_minor_free does not build on my 64
bit host. Maybe compiler issue. Investigating...

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-27 Thread Jan Kiszka
Philippe Gerum wrote:
> We should indeed postpone this just in case the upper layer indexes the extra
> state on the minor value. We can also simplify a few things doing so.
> 
> --- ksrc/nucleus/pipe.c   (revision 4565)
> +++ ksrc/nucleus/pipe.c   (working copy)
> @@ -77,11 +77,9 @@
> 
>  static inline void xnpipe_minor_free(int minor)
>  {
> - if (minor < 0 || minor >= XNPIPE_NDEVS)
> - return;
> -
> - __clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
> -   1UL << (minor % BITS_PER_LONG));
> + /* May be called with nklock free. */
> + clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
> + 1UL << (minor % BITS_PER_LONG));

Bad news: This doesn't fly as is. All modifying operations on
xnpipe_bitmap must be atomic and xnpipe_bitmap has to be
xnarch_atomic_t. But then find_first_zero_bit breaks. Is there some
version for atomic arrays? I guess we have to open-code this, at least
down to word-level...

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-20 Thread Philippe Gerum
Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Philippe Gerum wrote:
 Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
 Philippe Gerum wrote:
> Jan Kiszka wrote:
>> commit 728fc8970e2032b3280971788f1223f3ad82d80d
>> Author: Jan Kiszka 
>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>
>> xnpipe: Fix racy callback handlers
>> 
>> Invocation of input, output and alloc handler must take place 
>> under
>> nklock to properly synchronize with xnpipe_disconnect. Change all
>> callers to comply with this policy.
>>
> That one is under investigation. I agree on the bug report (thanks 
> btw), but I
> disagree on the fix. Basically, we can't run all hooks under nklock. 
> For
> instance, the alloc_handler may issue kmalloc() calls when issued 
> from the Linux
> write endpoint.
 You mean it /could/? Because no in-tree user (ie. native) calls
 rt-unsafe services from its alloc_handler.

>>> When you export a public interface, it is better not to make it 
>>> incompatible
>>> unless there is no other way to fix a situation. Doing so is last 
>>> resort for me.
>> OTH, there is nothing documented yet about those callback handlers or
>> xnpipe_connect. So we could only break _assumptions_ about this
>> interface.
> Actually, we would break existing implementations, but I understand your 
> point.
>
>  But, of course, I would be happy if we could continue to keep
>> the critical section length short. I just don't see how to achieve this
>> without significant restrictions on the callback handlers and their use
>> cases.
>>
> That is because the semantics of those callouts is not that smart. If we 
> have to
> break the API to solve the locking issue, I want to get the semantics 
> fixed in
> the same move (which may help a lot in solving the locking issue as 
> well), so we
> don't end up with two subsequent breakage.
>
 As suspected, the callback management in the message pipe code had very 
 serious
 issues, including the xnpipe_disconnect / xnpipe_release race. At least:

 - the latency would simply skyrocket when the input queue (i.e. userland to
 kernel) is flushed from xnpipe_cleanup_user_conn, because the latter holds 
 the
 nklock with interrupts off when doing so. I've tested this case on v2.4.x, 
 the
 results on a latency test when the pipe test code is running in parallel 
 are not
 pretty (i.e. > 220 us latency spot).

 - message buffers waiting in the output queue while the latter is flushed 
 would
 NOT be returned to the memory pool when an input handler is present, which 
 is
 the case with the native API that provides one. All the confusion went 
 from the
 output notification and free buffer logic that were combined into the 
 output
 handler. This went unnoticed probably because messages sent via the write()
 endpoint are immediately consumed by the receiving side in a RT task that
 preempts, but still, the bug is there.

 The disconnect vs release race can be fixed in a (nearly) lockless way 
 using a
 lingering close sequence, that prevents xnpipe_disconnect() to wipe out the
 extra state (e.g. RT_PIPE struct) until xnpipe_release() does not need it
 anymore. In short, if the userland side is connected, xnpipe_disconnect() 
 will
 defer the actual release of that extra state to xnpipe_release(); 
 otherwise, it
 will discard it immediately. A new callback, namely .release has been 
 added for
 that purpose.
>>> Just had a short glance so far. Looks good - except that there is still
>>> a use (cookie, now xstate, from xnpipe_state) after release
>>> (xnpipe_minor_free). Can't we push the minor_free after calling the
>>> release handler?
>>>
>> We should indeed postpone this just in case the upper layer indexes the extra
>> state on the minor value. We can also simplify a few things doing so.
>>
>> --- ksrc/nucleus/pipe.c  (revision 4565)
>> +++ ksrc/nucleus/pipe.c  (working copy)
>> @@ -77,11 +77,9 @@
>>
>>  static inline void xnpipe_minor_free(int minor)
>>  {
>> -if (minor < 0 || minor >= XNPIPE_NDEVS)
>> -return;
>> -
>> -__clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>> -  1UL << (minor % BITS_PER_LONG));
> 
> xnarch_write_memory_barrier()? Just for the paranoiacs.
>

Yes, we should be that paranoid.

>> +/* May be called with nklock free. */
>> +clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>> +1UL << (minor % BITS_PER_LONG));
>>  }
>>
>>  static inline void xnpipe_enqueue_wait(struct xnpipe_state *state, int mask)
>> @@ -413,9 +411,

Re: [Xenomai-core] Pending patches

2009-01-20 Thread Jan Kiszka
Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Philippe Gerum wrote:
 Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
 Philippe Gerum wrote:
> Jan Kiszka wrote:
>> commit 728fc8970e2032b3280971788f1223f3ad82d80d
>> Author: Jan Kiszka 
>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>
>> xnpipe: Fix racy callback handlers
>> 
>> Invocation of input, output and alloc handler must take place 
>> under
>> nklock to properly synchronize with xnpipe_disconnect. Change all
>> callers to comply with this policy.
>>
> That one is under investigation. I agree on the bug report (thanks 
> btw), but I
> disagree on the fix. Basically, we can't run all hooks under nklock. 
> For
> instance, the alloc_handler may issue kmalloc() calls when issued 
> from the Linux
> write endpoint.
 You mean it /could/? Because no in-tree user (ie. native) calls
 rt-unsafe services from its alloc_handler.

>>> When you export a public interface, it is better not to make it 
>>> incompatible
>>> unless there is no other way to fix a situation. Doing so is last 
>>> resort for me.
>> OTH, there is nothing documented yet about those callback handlers or
>> xnpipe_connect. So we could only break _assumptions_ about this
>> interface.
> Actually, we would break existing implementations, but I understand your 
> point.
>
>  But, of course, I would be happy if we could continue to keep
>> the critical section length short. I just don't see how to achieve this
>> without significant restrictions on the callback handlers and their use
>> cases.
>>
> That is because the semantics of those callouts is not that smart. If we 
> have to
> break the API to solve the locking issue, I want to get the semantics 
> fixed in
> the same move (which may help a lot in solving the locking issue as 
> well), so we
> don't end up with two subsequent breakage.
>
 As suspected, the callback management in the message pipe code had very 
 serious
 issues, including the xnpipe_disconnect / xnpipe_release race. At least:

 - the latency would simply skyrocket when the input queue (i.e. userland to
 kernel) is flushed from xnpipe_cleanup_user_conn, because the latter holds 
 the
 nklock with interrupts off when doing so. I've tested this case on v2.4.x, 
 the
 results on a latency test when the pipe test code is running in parallel 
 are not
 pretty (i.e. > 220 us latency spot).

 - message buffers waiting in the output queue while the latter is flushed 
 would
 NOT be returned to the memory pool when an input handler is present, which 
 is
 the case with the native API that provides one. All the confusion went 
 from the
 output notification and free buffer logic that were combined into the 
 output
 handler. This went unnoticed probably because messages sent via the write()
 endpoint are immediately consumed by the receiving side in a RT task that
 preempts, but still, the bug is there.

 The disconnect vs release race can be fixed in a (nearly) lockless way 
 using a
 lingering close sequence, that prevents xnpipe_disconnect() to wipe out the
 extra state (e.g. RT_PIPE struct) until xnpipe_release() does not need it
 anymore. In short, if the userland side is connected, xnpipe_disconnect() 
 will
 defer the actual release of that extra state to xnpipe_release(); 
 otherwise, it
 will discard it immediately. A new callback, namely .release has been 
 added for
 that purpose.
>>> Just had a short glance so far. Looks good - except that there is still
>>> a use (cookie, now xstate, from xnpipe_state) after release
>>> (xnpipe_minor_free). Can't we push the minor_free after calling the
>>> release handler?
>>>
>> We should indeed postpone this just in case the upper layer indexes the extra
>> state on the minor value. We can also simplify a few things doing so.
>>
>> --- ksrc/nucleus/pipe.c  (revision 4565)
>> +++ ksrc/nucleus/pipe.c  (working copy)
>> @@ -77,11 +77,9 @@
>>
>>  static inline void xnpipe_minor_free(int minor)
>>  {
>> -if (minor < 0 || minor >= XNPIPE_NDEVS)
>> -return;
>> -
>> -__clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>> -  1UL << (minor % BITS_PER_LONG));
> 
> xnarch_write_memory_barrier()? Just for the paranoiacs.

Actually, xnarch_memory_barrier(), reading (eg. from xstate) should
better be finished by then as well.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.or

Re: [Xenomai-core] Pending patches

2009-01-20 Thread Jan Kiszka
Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Philippe Gerum wrote:
 Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Philippe Gerum wrote:
 Jan Kiszka wrote:
> commit 728fc8970e2032b3280971788f1223f3ad82d80d
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
>
> xnpipe: Fix racy callback handlers
> 
> Invocation of input, output and alloc handler must take place 
> under
> nklock to properly synchronize with xnpipe_disconnect. Change all
> callers to comply with this policy.
>
 That one is under investigation. I agree on the bug report (thanks 
 btw), but I
 disagree on the fix. Basically, we can't run all hooks under nklock. 
 For
 instance, the alloc_handler may issue kmalloc() calls when issued from 
 the Linux
 write endpoint.
>>> You mean it /could/? Because no in-tree user (ie. native) calls
>>> rt-unsafe services from its alloc_handler.
>>>
>> When you export a public interface, it is better not to make it 
>> incompatible
>> unless there is no other way to fix a situation. Doing so is last resort 
>> for me.
> OTH, there is nothing documented yet about those callback handlers or
> xnpipe_connect. So we could only break _assumptions_ about this
> interface.
 Actually, we would break existing implementations, but I understand your 
 point.

  But, of course, I would be happy if we could continue to keep
> the critical section length short. I just don't see how to achieve this
> without significant restrictions on the callback handlers and their use
> cases.
>
 That is because the semantics of those callouts is not that smart. If we 
 have to
 break the API to solve the locking issue, I want to get the semantics 
 fixed in
 the same move (which may help a lot in solving the locking issue as well), 
 so we
 don't end up with two subsequent breakage.

>>> As suspected, the callback management in the message pipe code had very 
>>> serious
>>> issues, including the xnpipe_disconnect / xnpipe_release race. At least:
>>>
>>> - the latency would simply skyrocket when the input queue (i.e. userland to
>>> kernel) is flushed from xnpipe_cleanup_user_conn, because the latter holds 
>>> the
>>> nklock with interrupts off when doing so. I've tested this case on v2.4.x, 
>>> the
>>> results on a latency test when the pipe test code is running in parallel 
>>> are not
>>> pretty (i.e. > 220 us latency spot).
>>>
>>> - message buffers waiting in the output queue while the latter is flushed 
>>> would
>>> NOT be returned to the memory pool when an input handler is present, which 
>>> is
>>> the case with the native API that provides one. All the confusion went from 
>>> the
>>> output notification and free buffer logic that were combined into the output
>>> handler. This went unnoticed probably because messages sent via the write()
>>> endpoint are immediately consumed by the receiving side in a RT task that
>>> preempts, but still, the bug is there.
>>>
>>> The disconnect vs release race can be fixed in a (nearly) lockless way 
>>> using a
>>> lingering close sequence, that prevents xnpipe_disconnect() to wipe out the
>>> extra state (e.g. RT_PIPE struct) until xnpipe_release() does not need it
>>> anymore. In short, if the userland side is connected, xnpipe_disconnect() 
>>> will
>>> defer the actual release of that extra state to xnpipe_release(); 
>>> otherwise, it
>>> will discard it immediately. A new callback, namely .release has been added 
>>> for
>>> that purpose.
>> Just had a short glance so far. Looks good - except that there is still
>> a use (cookie, now xstate, from xnpipe_state) after release
>> (xnpipe_minor_free). Can't we push the minor_free after calling the
>> release handler?
>>
> 
> We should indeed postpone this just in case the upper layer indexes the extra
> state on the minor value. We can also simplify a few things doing so.
> 
> --- ksrc/nucleus/pipe.c   (revision 4565)
> +++ ksrc/nucleus/pipe.c   (working copy)
> @@ -77,11 +77,9 @@
> 
>  static inline void xnpipe_minor_free(int minor)
>  {
> - if (minor < 0 || minor >= XNPIPE_NDEVS)
> - return;
> -
> - __clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
> -   1UL << (minor % BITS_PER_LONG));

xnarch_write_memory_barrier()? Just for the paranoiacs.

> + /* May be called with nklock free. */
> + clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
> + 1UL << (minor % BITS_PER_LONG));
>  }
> 
>  static inline void xnpipe_enqueue_wait(struct xnpipe_state *state, int mask)
> @@ -413,9 +411,9 @@
>   __setbits(state->status, XNPIPE_KERN_LCLOSE);
>   xnlock_put_irqrestore(&nklock, s);
>   } else {
> - xnpipe_minor_

Re: [Xenomai-core] Pending patches

2009-01-20 Thread Philippe Gerum
Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
 Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
 commit 728fc8970e2032b3280971788f1223f3ad82d80d
 Author: Jan Kiszka 
 Date:   Thu Jan 15 11:10:24 2009 +0100

 xnpipe: Fix racy callback handlers
 
 Invocation of input, output and alloc handler must take place under
 nklock to properly synchronize with xnpipe_disconnect. Change all
 callers to comply with this policy.

>>> That one is under investigation. I agree on the bug report (thanks 
>>> btw), but I
>>> disagree on the fix. Basically, we can't run all hooks under nklock. For
>>> instance, the alloc_handler may issue kmalloc() calls when issued from 
>>> the Linux
>>> write endpoint.
>> You mean it /could/? Because no in-tree user (ie. native) calls
>> rt-unsafe services from its alloc_handler.
>>
> When you export a public interface, it is better not to make it 
> incompatible
> unless there is no other way to fix a situation. Doing so is last resort 
> for me.
 OTH, there is nothing documented yet about those callback handlers or
 xnpipe_connect. So we could only break _assumptions_ about this
 interface.
>>> Actually, we would break existing implementations, but I understand your 
>>> point.
>>>
>>>  But, of course, I would be happy if we could continue to keep
 the critical section length short. I just don't see how to achieve this
 without significant restrictions on the callback handlers and their use
 cases.

>>> That is because the semantics of those callouts is not that smart. If we 
>>> have to
>>> break the API to solve the locking issue, I want to get the semantics fixed 
>>> in
>>> the same move (which may help a lot in solving the locking issue as well), 
>>> so we
>>> don't end up with two subsequent breakage.
>>>
>> As suspected, the callback management in the message pipe code had very 
>> serious
>> issues, including the xnpipe_disconnect / xnpipe_release race. At least:
>>
>> - the latency would simply skyrocket when the input queue (i.e. userland to
>> kernel) is flushed from xnpipe_cleanup_user_conn, because the latter holds 
>> the
>> nklock with interrupts off when doing so. I've tested this case on v2.4.x, 
>> the
>> results on a latency test when the pipe test code is running in parallel are 
>> not
>> pretty (i.e. > 220 us latency spot).
>>
>> - message buffers waiting in the output queue while the latter is flushed 
>> would
>> NOT be returned to the memory pool when an input handler is present, which is
>> the case with the native API that provides one. All the confusion went from 
>> the
>> output notification and free buffer logic that were combined into the output
>> handler. This went unnoticed probably because messages sent via the write()
>> endpoint are immediately consumed by the receiving side in a RT task that
>> preempts, but still, the bug is there.
>>
>> The disconnect vs release race can be fixed in a (nearly) lockless way using 
>> a
>> lingering close sequence, that prevents xnpipe_disconnect() to wipe out the
>> extra state (e.g. RT_PIPE struct) until xnpipe_release() does not need it
>> anymore. In short, if the userland side is connected, xnpipe_disconnect() 
>> will
>> defer the actual release of that extra state to xnpipe_release(); otherwise, 
>> it
>> will discard it immediately. A new callback, namely .release has been added 
>> for
>> that purpose.
> 
> Just had a short glance so far. Looks good - except that there is still
> a use (cookie, now xstate, from xnpipe_state) after release
> (xnpipe_minor_free). Can't we push the minor_free after calling the
> release handler?
>

We should indeed postpone this just in case the upper layer indexes the extra
state on the minor value. We can also simplify a few things doing so.

--- ksrc/nucleus/pipe.c (revision 4565)
+++ ksrc/nucleus/pipe.c (working copy)
@@ -77,11 +77,9 @@

 static inline void xnpipe_minor_free(int minor)
 {
-   if (minor < 0 || minor >= XNPIPE_NDEVS)
-   return;
-
-   __clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
- 1UL << (minor % BITS_PER_LONG));
+   /* May be called with nklock free. */
+   clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
+   1UL << (minor % BITS_PER_LONG));
 }

 static inline void xnpipe_enqueue_wait(struct xnpipe_state *state, int mask)
@@ -413,9 +411,9 @@
__setbits(state->status, XNPIPE_KERN_LCLOSE);
xnlock_put_irqrestore(&nklock, s);
} else {
-   xnpipe_minor_free(minor);
xnlock_put_irqrestore(&nklock, s);
state->ops.release(state->xstate);
+   xnpipe_minor_free(minor);
}

if (need_sched)
@@ -621,9 +619,9 @@
__clrbi

Re: [Xenomai-core] Pending patches

2009-01-19 Thread Jan Kiszka
Philippe Gerum wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Philippe Gerum wrote:
 Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> commit 728fc8970e2032b3280971788f1223f3ad82d80d
>>> Author: Jan Kiszka 
>>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>>
>>> xnpipe: Fix racy callback handlers
>>> 
>>> Invocation of input, output and alloc handler must take place under
>>> nklock to properly synchronize with xnpipe_disconnect. Change all
>>> callers to comply with this policy.
>>>
>> That one is under investigation. I agree on the bug report (thanks btw), 
>> but I
>> disagree on the fix. Basically, we can't run all hooks under nklock. For
>> instance, the alloc_handler may issue kmalloc() calls when issued from 
>> the Linux
>> write endpoint.
> You mean it /could/? Because no in-tree user (ie. native) calls
> rt-unsafe services from its alloc_handler.
>
 When you export a public interface, it is better not to make it 
 incompatible
 unless there is no other way to fix a situation. Doing so is last resort 
 for me.
>>> OTH, there is nothing documented yet about those callback handlers or
>>> xnpipe_connect. So we could only break _assumptions_ about this
>>> interface.
>> Actually, we would break existing implementations, but I understand your 
>> point.
>>
>>  But, of course, I would be happy if we could continue to keep
>>> the critical section length short. I just don't see how to achieve this
>>> without significant restrictions on the callback handlers and their use
>>> cases.
>>>
>> That is because the semantics of those callouts is not that smart. If we 
>> have to
>> break the API to solve the locking issue, I want to get the semantics fixed 
>> in
>> the same move (which may help a lot in solving the locking issue as well), 
>> so we
>> don't end up with two subsequent breakage.
>>
> 
> As suspected, the callback management in the message pipe code had very 
> serious
> issues, including the xnpipe_disconnect / xnpipe_release race. At least:
> 
> - the latency would simply skyrocket when the input queue (i.e. userland to
> kernel) is flushed from xnpipe_cleanup_user_conn, because the latter holds the
> nklock with interrupts off when doing so. I've tested this case on v2.4.x, the
> results on a latency test when the pipe test code is running in parallel are 
> not
> pretty (i.e. > 220 us latency spot).
> 
> - message buffers waiting in the output queue while the latter is flushed 
> would
> NOT be returned to the memory pool when an input handler is present, which is
> the case with the native API that provides one. All the confusion went from 
> the
> output notification and free buffer logic that were combined into the output
> handler. This went unnoticed probably because messages sent via the write()
> endpoint are immediately consumed by the receiving side in a RT task that
> preempts, but still, the bug is there.
> 
> The disconnect vs release race can be fixed in a (nearly) lockless way using a
> lingering close sequence, that prevents xnpipe_disconnect() to wipe out the
> extra state (e.g. RT_PIPE struct) until xnpipe_release() does not need it
> anymore. In short, if the userland side is connected, xnpipe_disconnect() will
> defer the actual release of that extra state to xnpipe_release(); otherwise, 
> it
> will discard it immediately. A new callback, namely .release has been added 
> for
> that purpose.

Just had a short glance so far. Looks good - except that there is still
a use (cookie, now xstate, from xnpipe_state) after release
(xnpipe_minor_free). Can't we push the minor_free after calling the
release handler?

> 
> The internal xnpipe API had to be fixed with respect to the new set of 
> callbacks
> it accepts, but the overall logic remained the same. The handlers were simply
> made more consistent with what the message pipe machinery expects from them.
> There is no user visible change.
> 
> I have committed the changes that fix all the issues above; the new code
> survived significant stress testing here, which only means that this seems to 
> be
> a reasonable framework to start working with.

Will see that I can quickly integrate this in our environment to broaden
the test base.

> 
> Given the severity of the bugs, I definitely plan to backport those changes to
> v2.4.x at some point (the earlier, the better), since we could not reasonably
> live with such issues in a stable version.
> 

Yep, would be good.

Thanks,
Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-19 Thread Philippe Gerum
Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
 Philippe Gerum wrote:
> Jan Kiszka wrote:
>> commit 728fc8970e2032b3280971788f1223f3ad82d80d
>> Author: Jan Kiszka 
>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>
>> xnpipe: Fix racy callback handlers
>> 
>> Invocation of input, output and alloc handler must take place under
>> nklock to properly synchronize with xnpipe_disconnect. Change all
>> callers to comply with this policy.
>>
> That one is under investigation. I agree on the bug report (thanks btw), 
> but I
> disagree on the fix. Basically, we can't run all hooks under nklock. For
> instance, the alloc_handler may issue kmalloc() calls when issued from 
> the Linux
> write endpoint.
 You mean it /could/? Because no in-tree user (ie. native) calls
 rt-unsafe services from its alloc_handler.

>>> When you export a public interface, it is better not to make it incompatible
>>> unless there is no other way to fix a situation. Doing so is last resort 
>>> for me.
>> OTH, there is nothing documented yet about those callback handlers or
>> xnpipe_connect. So we could only break _assumptions_ about this
>> interface.
> 
> Actually, we would break existing implementations, but I understand your 
> point.
> 
>  But, of course, I would be happy if we could continue to keep
>> the critical section length short. I just don't see how to achieve this
>> without significant restrictions on the callback handlers and their use
>> cases.
>>
> 
> That is because the semantics of those callouts is not that smart. If we have 
> to
> break the API to solve the locking issue, I want to get the semantics fixed in
> the same move (which may help a lot in solving the locking issue as well), so 
> we
> don't end up with two subsequent breakage.
> 

As suspected, the callback management in the message pipe code had very serious
issues, including the xnpipe_disconnect / xnpipe_release race. At least:

- the latency would simply skyrocket when the input queue (i.e. userland to
kernel) is flushed from xnpipe_cleanup_user_conn, because the latter holds the
nklock with interrupts off when doing so. I've tested this case on v2.4.x, the
results on a latency test when the pipe test code is running in parallel are not
pretty (i.e. > 220 us latency spot).

- message buffers waiting in the output queue while the latter is flushed would
NOT be returned to the memory pool when an input handler is present, which is
the case with the native API that provides one. All the confusion went from the
output notification and free buffer logic that were combined into the output
handler. This went unnoticed probably because messages sent via the write()
endpoint are immediately consumed by the receiving side in a RT task that
preempts, but still, the bug is there.

The disconnect vs release race can be fixed in a (nearly) lockless way using a
lingering close sequence, that prevents xnpipe_disconnect() to wipe out the
extra state (e.g. RT_PIPE struct) until xnpipe_release() does not need it
anymore. In short, if the userland side is connected, xnpipe_disconnect() will
defer the actual release of that extra state to xnpipe_release(); otherwise, it
will discard it immediately. A new callback, namely .release has been added for
that purpose.

The internal xnpipe API had to be fixed with respect to the new set of callbacks
it accepts, but the overall logic remained the same. The handlers were simply
made more consistent with what the message pipe machinery expects from them.
There is no user visible change.

I have committed the changes that fix all the issues above; the new code
survived significant stress testing here, which only means that this seems to be
a reasonable framework to start working with.

Given the severity of the bugs, I definitely plan to backport those changes to
v2.4.x at some point (the earlier, the better), since we could not reasonably
live with such issues in a stable version.

-- 
Philippe.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Jan Kiszka
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> commit 028d4766a38b6937d9a2c02a20022e3ee5b67b55
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
>
> POSIX: Fix initialization of SCHED_RR threads
> 
> Passing SCHED_RR as policy to pthread_create has currently not the
> desired effect. The kernel part expects that user space adjusts the
> policy and prio via __pse51_thread_setschedparam after setting up the
> shadow. And this is what the patch does by calling the wrapped
> pthread_setschedparam instead of the real one.
> 
> Signed-off-by: Jan Kiszka 
>
>  src/skins/posix/thread.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> commit 71666ce04ef216d281fe86ee82a5560c2b57c6dd
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
>
> Handle priority changes of SCHED_RR tasks
> 
> If shadowed Linux tasks with SCHED_RR policy change their priority,
> do_setsched_event currenty ignores this. Extend the condition to catch
> this case as well.
> 
> Signed-off-by: Jan Kiszka 
>
>  ksrc/nucleus/shadow.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
 Nack these two ones too. Philippe implemented a SCHED_RR working over
 aperiodic mode. I think the POSIX skin needs fixing, but not that way.
>>> Then please suggest a better fix.
>> I thought I did: simply pass the SCHED_RR option to kernel-space and
>> handle it there, but replace it with SCHED_FIFO for anything in
>> user-space. I plan to do it, but trunk is not my current priority.
> 
> This is also a stable bug (so the final version should also be
> backported). However, I will check your proposal.

Extending the __pse51_thread_create syscall to also take the sched
policy is likely no option to fix 2.4.x -- ABI breakage...

Suggestion: Apply my original fix to stable but go the enhanced
__pse51_thread_create path for trunk (I'm working on the latter ATM).

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
 Hi,

 currently I have the following six patches in my assorted queue
 (git://git.kiszka.org/xenomai.git queue/assorted). All have been posted
 before, I just rebased them since then a few times. Should I repost
 any/all of them (would be no problem), or are some already queued for
 potential merge?

 Jan

 (...)
 commit a631ab2c531d5e381ba8a0a59bf301a0276d9f99
 Author: Jan Kiszka 
 Date:   Thu Jan 15 11:10:24 2009 +0100

 POSIX: Do not auto-shadow main with dlopen enabled
 
 Don't perform auto-shadowing in POSIX skin if we might be loaded via
 dlopen. Otherwise the wrong thread, the undefined dlopen caller, may be
 (re-)shadowed, assigning wrong scheduling settings.
 
 Signed-off-by: Jan Kiszka 

  src/skins/posix/init.c |   43 ---
  1 files changed, 28 insertions(+), 15 deletions(-)

 commit 91ae3da822ca558804bf33be4d164ea4c2667c1b
 Author: Jan Kiszka 
 Date:   Thu Jan 15 11:10:24 2009 +0100

 Replace --without-__tread with --enable-dlopen-skins
 
 In practice, you only want to disable __thread support when Xenomai 
 skin
 libraries should be loadable via dlopen. Therefore rename the related
 configure switch accordingly.
 
 Signed-off-by: Jan Kiszka 

>>> Nack these two ones: one of the gcc bugs I have found on ARM is related
>>> to __thread (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38815). So,
>>> even though I have found a work around for the bug, I am not sure that
>>> it works all the time, so disabling __thread on ARM is safer. Especially
>>> since __thread does not bring any performance improvement over
>>> pthread_(get|set)specific on ARM.
>>>
 commit 028d4766a38b6937d9a2c02a20022e3ee5b67b55
 Author: Jan Kiszka 
 Date:   Thu Jan 15 11:10:24 2009 +0100

 POSIX: Fix initialization of SCHED_RR threads
 
 Passing SCHED_RR as policy to pthread_create has currently not the
 desired effect. The kernel part expects that user space adjusts the
 policy and prio via __pse51_thread_setschedparam after setting up the
 shadow. And this is what the patch does by calling the wrapped
 pthread_setschedparam instead of the real one.
 
 Signed-off-by: Jan Kiszka 

  src/skins/posix/thread.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 commit 71666ce04ef216d281fe86ee82a5560c2b57c6dd
 Author: Jan Kiszka 
 Date:   Thu Jan 15 11:10:24 2009 +0100

 Handle priority changes of SCHED_RR tasks
 
 If shadowed Linux tasks with SCHED_RR policy change their priority,
 do_setsched_event currenty ignores this. Extend the condition to catch
 this case as well.
 
 Signed-off-by: Jan Kiszka 

  ksrc/nucleus/shadow.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

>>> Nack these two ones too. Philippe implemented a SCHED_RR working over
>>> aperiodic mode. I think the POSIX skin needs fixing, but not that way.
>> Then please suggest a better fix.
> 
> I thought I did: simply pass the SCHED_RR option to kernel-space and
> handle it there, but replace it with SCHED_FIFO for anything in
> user-space. I plan to do it, but trunk is not my current priority.

This is also a stable bug (so the final version should also be
backported). However, I will check your proposal.

> 
>>> We should have the thread run with SCHED_FIFO in secondary mode even if
>>> it runs under round-robin in primary mode.
>> As I explained, that can always be a weak policy. No one can prevent the
>> user from setting his Linux thread to SCHED_RR.
>>
>>> And when we have done that,
>>> we do not need the second patch, since a shadow linux task will always
>>> run with SCHED_FIFO.
>>>
>> Only by convention, not by feasible design (unless you want to change
>> the kernel in this respect).
> 
> I think the sane usage of Xenomai is to rely on Xenomai services, not to
> use __real_pthread_setschedparam to set the Linux side priority of a
> real-time thread.

Well, the point was weighing what we loose by extending the check
(nothing IMHO, we can still apply the above policy to the wrapper)
against what we gain (catching also the case where the user
intentionally chooses this policy).

Jan

-- 
Siemens AG, Corporate Technology, CT SE 26
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Philippe Gerum
Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Philippe Gerum wrote:
 Jan Kiszka wrote:
> commit 728fc8970e2032b3280971788f1223f3ad82d80d
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
>
> xnpipe: Fix racy callback handlers
> 
> Invocation of input, output and alloc handler must take place under
> nklock to properly synchronize with xnpipe_disconnect. Change all
> callers to comply with this policy.
>
 That one is under investigation. I agree on the bug report (thanks btw), 
 but I
 disagree on the fix. Basically, we can't run all hooks under nklock. For
 instance, the alloc_handler may issue kmalloc() calls when issued from the 
 Linux
 write endpoint.
>>> You mean it /could/? Because no in-tree user (ie. native) calls
>>> rt-unsafe services from its alloc_handler.
>>>
>> When you export a public interface, it is better not to make it incompatible
>> unless there is no other way to fix a situation. Doing so is last resort for 
>> me.
> 
> OTH, there is nothing documented yet about those callback handlers or
> xnpipe_connect. So we could only break _assumptions_ about this
> interface.

Actually, we would break existing implementations, but I understand your point.

 But, of course, I would be happy if we could continue to keep
> the critical section length short. I just don't see how to achieve this
> without significant restrictions on the callback handlers and their use
> cases.
>

That is because the semantics of those callouts is not that smart. If we have to
break the API to solve the locking issue, I want to get the semantics fixed in
the same move (which may help a lot in solving the locking issue as well), so we
don't end up with two subsequent breakage.

> Jan
> 


-- 
Philippe.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Gilles Chanteperdrix wrote:
 Jan Kiszka wrote:
> Hi,
>
> currently I have the following six patches in my assorted queue
> (git://git.kiszka.org/xenomai.git queue/assorted). All have been posted
> before, I just rebased them since then a few times. Should I repost
> any/all of them (would be no problem), or are some already queued for
> potential merge?
>
> Jan
>
> (...)
> commit a631ab2c531d5e381ba8a0a59bf301a0276d9f99
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
>
> POSIX: Do not auto-shadow main with dlopen enabled
> 
> Don't perform auto-shadowing in POSIX skin if we might be loaded via
> dlopen. Otherwise the wrong thread, the undefined dlopen caller, may 
> be
> (re-)shadowed, assigning wrong scheduling settings.
> 
> Signed-off-by: Jan Kiszka 
>
>  src/skins/posix/init.c |   43 ---
>  1 files changed, 28 insertions(+), 15 deletions(-)
>
> commit 91ae3da822ca558804bf33be4d164ea4c2667c1b
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
>
> Replace --without-__tread with --enable-dlopen-skins
> 
> In practice, you only want to disable __thread support when Xenomai 
> skin
> libraries should be loadable via dlopen. Therefore rename the related
> configure switch accordingly.
> 
> Signed-off-by: Jan Kiszka 
>
 Nack these two ones: one of the gcc bugs I have found on ARM is related
 to __thread (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38815). So,
 even though I have found a work around for the bug, I am not sure that
 it works all the time, so disabling __thread on ARM is safer. Especially
 since __thread does not bring any performance improvement over
 pthread_(get|set)specific on ARM.
>>> Actually, I am not opposed to the first, only to the second.
>> OK, will you or should I apply the do-not-auto-shadow patch?
>>
>> Regarding the second one: If the option should continue to be named
>> --without-__thread (because of ARM), then we need at the bare minimum a
>> documentation of the correlation with dlopen. BTW, any concerns to apply
>> the "Mark libs nodlopen" fix?
> 
> Maybe we could add the dlopen option which would disable shadowing of
> the main thread and unselect __thread ?

Sounds good, will do this.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 26
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Jan Kiszka
Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
 commit 728fc8970e2032b3280971788f1223f3ad82d80d
 Author: Jan Kiszka 
 Date:   Thu Jan 15 11:10:24 2009 +0100

 xnpipe: Fix racy callback handlers
 
 Invocation of input, output and alloc handler must take place under
 nklock to properly synchronize with xnpipe_disconnect. Change all
 callers to comply with this policy.

>>> That one is under investigation. I agree on the bug report (thanks btw), 
>>> but I
>>> disagree on the fix. Basically, we can't run all hooks under nklock. For
>>> instance, the alloc_handler may issue kmalloc() calls when issued from the 
>>> Linux
>>> write endpoint.
>> You mean it /could/? Because no in-tree user (ie. native) calls
>> rt-unsafe services from its alloc_handler.
>>
> 
> When you export a public interface, it is better not to make it incompatible
> unless there is no other way to fix a situation. Doing so is last resort for 
> me.

OTH, there is nothing documented yet about those callback handlers or
xnpipe_connect. So we could only break _assumptions_ about this
interface. But, of course, I would be happy if we could continue to keep
the critical section length short. I just don't see how to achieve this
without significant restrictions on the callback handlers and their use
cases.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 26
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Hi,
>>>
>>> currently I have the following six patches in my assorted queue
>>> (git://git.kiszka.org/xenomai.git queue/assorted). All have been posted
>>> before, I just rebased them since then a few times. Should I repost
>>> any/all of them (would be no problem), or are some already queued for
>>> potential merge?
>>>
>>> Jan
>>>
>>> (...)
>>> commit a631ab2c531d5e381ba8a0a59bf301a0276d9f99
>>> Author: Jan Kiszka 
>>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>>
>>> POSIX: Do not auto-shadow main with dlopen enabled
>>> 
>>> Don't perform auto-shadowing in POSIX skin if we might be loaded via
>>> dlopen. Otherwise the wrong thread, the undefined dlopen caller, may be
>>> (re-)shadowed, assigning wrong scheduling settings.
>>> 
>>> Signed-off-by: Jan Kiszka 
>>>
>>>  src/skins/posix/init.c |   43 ---
>>>  1 files changed, 28 insertions(+), 15 deletions(-)
>>>
>>> commit 91ae3da822ca558804bf33be4d164ea4c2667c1b
>>> Author: Jan Kiszka 
>>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>>
>>> Replace --without-__tread with --enable-dlopen-skins
>>> 
>>> In practice, you only want to disable __thread support when Xenomai skin
>>> libraries should be loadable via dlopen. Therefore rename the related
>>> configure switch accordingly.
>>> 
>>> Signed-off-by: Jan Kiszka 
>>>
>> Nack these two ones: one of the gcc bugs I have found on ARM is related
>> to __thread (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38815). So,
>> even though I have found a work around for the bug, I am not sure that
>> it works all the time, so disabling __thread on ARM is safer. Especially
>> since __thread does not bring any performance improvement over
>> pthread_(get|set)specific on ARM.
>>
>>> commit 028d4766a38b6937d9a2c02a20022e3ee5b67b55
>>> Author: Jan Kiszka 
>>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>>
>>> POSIX: Fix initialization of SCHED_RR threads
>>> 
>>> Passing SCHED_RR as policy to pthread_create has currently not the
>>> desired effect. The kernel part expects that user space adjusts the
>>> policy and prio via __pse51_thread_setschedparam after setting up the
>>> shadow. And this is what the patch does by calling the wrapped
>>> pthread_setschedparam instead of the real one.
>>> 
>>> Signed-off-by: Jan Kiszka 
>>>
>>>  src/skins/posix/thread.c |2 +-
>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> commit 71666ce04ef216d281fe86ee82a5560c2b57c6dd
>>> Author: Jan Kiszka 
>>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>>
>>> Handle priority changes of SCHED_RR tasks
>>> 
>>> If shadowed Linux tasks with SCHED_RR policy change their priority,
>>> do_setsched_event currenty ignores this. Extend the condition to catch
>>> this case as well.
>>> 
>>> Signed-off-by: Jan Kiszka 
>>>
>>>  ksrc/nucleus/shadow.c |2 +-
>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>> Nack these two ones too. Philippe implemented a SCHED_RR working over
>> aperiodic mode. I think the POSIX skin needs fixing, but not that way.
> 
> Then please suggest a better fix.

I thought I did: simply pass the SCHED_RR option to kernel-space and
handle it there, but replace it with SCHED_FIFO for anything in
user-space. I plan to do it, but trunk is not my current priority.

> 
>> We should have the thread run with SCHED_FIFO in secondary mode even if
>> it runs under round-robin in primary mode.
> 
> As I explained, that can always be a weak policy. No one can prevent the
> user from setting his Linux thread to SCHED_RR.
> 
>> And when we have done that,
>> we do not need the second patch, since a shadow linux task will always
>> run with SCHED_FIFO.
>>
> 
> Only by convention, not by feasible design (unless you want to change
> the kernel in this respect).

I think the sane usage of Xenomai is to rely on Xenomai services, not to
use __real_pthread_setschedparam to set the Linux side priority of a
real-time thread.

-- 
 Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
 Hi,

 currently I have the following six patches in my assorted queue
 (git://git.kiszka.org/xenomai.git queue/assorted). All have been posted
 before, I just rebased them since then a few times. Should I repost
 any/all of them (would be no problem), or are some already queued for
 potential merge?

 Jan

 (...)
 commit a631ab2c531d5e381ba8a0a59bf301a0276d9f99
 Author: Jan Kiszka 
 Date:   Thu Jan 15 11:10:24 2009 +0100

 POSIX: Do not auto-shadow main with dlopen enabled
 
 Don't perform auto-shadowing in POSIX skin if we might be loaded via
 dlopen. Otherwise the wrong thread, the undefined dlopen caller, may be
 (re-)shadowed, assigning wrong scheduling settings.
 
 Signed-off-by: Jan Kiszka 

  src/skins/posix/init.c |   43 ---
  1 files changed, 28 insertions(+), 15 deletions(-)

 commit 91ae3da822ca558804bf33be4d164ea4c2667c1b
 Author: Jan Kiszka 
 Date:   Thu Jan 15 11:10:24 2009 +0100

 Replace --without-__tread with --enable-dlopen-skins
 
 In practice, you only want to disable __thread support when Xenomai 
 skin
 libraries should be loadable via dlopen. Therefore rename the related
 configure switch accordingly.
 
 Signed-off-by: Jan Kiszka 

>>> Nack these two ones: one of the gcc bugs I have found on ARM is related
>>> to __thread (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38815). So,
>>> even though I have found a work around for the bug, I am not sure that
>>> it works all the time, so disabling __thread on ARM is safer. Especially
>>> since __thread does not bring any performance improvement over
>>> pthread_(get|set)specific on ARM.
>> Actually, I am not opposed to the first, only to the second.
> 
> OK, will you or should I apply the do-not-auto-shadow patch?
> 
> Regarding the second one: If the option should continue to be named
> --without-__thread (because of ARM), then we need at the bare minimum a
> documentation of the correlation with dlopen. BTW, any concerns to apply
> the "Mark libs nodlopen" fix?

Maybe we could add the dlopen option which would disable shadowing of
the main thread and unselect __thread ?

-- 
 Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Philippe Gerum
Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> commit 728fc8970e2032b3280971788f1223f3ad82d80d
>>> Author: Jan Kiszka 
>>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>>
>>> xnpipe: Fix racy callback handlers
>>> 
>>> Invocation of input, output and alloc handler must take place under
>>> nklock to properly synchronize with xnpipe_disconnect. Change all
>>> callers to comply with this policy.
>>>
>> That one is under investigation. I agree on the bug report (thanks btw), but 
>> I
>> disagree on the fix. Basically, we can't run all hooks under nklock. For
>> instance, the alloc_handler may issue kmalloc() calls when issued from the 
>> Linux
>> write endpoint.
> 
> You mean it /could/? Because no in-tree user (ie. native) calls
> rt-unsafe services from its alloc_handler.
>

When you export a public interface, it is better not to make it incompatible
unless there is no other way to fix a situation. Doing so is last resort for me.

> Jan
> 


-- 
Philippe.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Hi,
>>>
>>> currently I have the following six patches in my assorted queue
>>> (git://git.kiszka.org/xenomai.git queue/assorted). All have been posted
>>> before, I just rebased them since then a few times. Should I repost
>>> any/all of them (would be no problem), or are some already queued for
>>> potential merge?
>>>
>>> Jan
>>>
>>> (...)
>>> commit a631ab2c531d5e381ba8a0a59bf301a0276d9f99
>>> Author: Jan Kiszka 
>>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>>
>>> POSIX: Do not auto-shadow main with dlopen enabled
>>> 
>>> Don't perform auto-shadowing in POSIX skin if we might be loaded via
>>> dlopen. Otherwise the wrong thread, the undefined dlopen caller, may be
>>> (re-)shadowed, assigning wrong scheduling settings.
>>> 
>>> Signed-off-by: Jan Kiszka 
>>>
>>>  src/skins/posix/init.c |   43 ---
>>>  1 files changed, 28 insertions(+), 15 deletions(-)
>>>
>>> commit 91ae3da822ca558804bf33be4d164ea4c2667c1b
>>> Author: Jan Kiszka 
>>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>>
>>> Replace --without-__tread with --enable-dlopen-skins
>>> 
>>> In practice, you only want to disable __thread support when Xenomai skin
>>> libraries should be loadable via dlopen. Therefore rename the related
>>> configure switch accordingly.
>>> 
>>> Signed-off-by: Jan Kiszka 
>>>
>> Nack these two ones: one of the gcc bugs I have found on ARM is related
>> to __thread (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38815). So,
>> even though I have found a work around for the bug, I am not sure that
>> it works all the time, so disabling __thread on ARM is safer. Especially
>> since __thread does not bring any performance improvement over
>> pthread_(get|set)specific on ARM.
> 
> Actually, I am not opposed to the first, only to the second.

OK, will you or should I apply the do-not-auto-shadow patch?

Regarding the second one: If the option should continue to be named
--without-__thread (because of ARM), then we need at the bare minimum a
documentation of the correlation with dlopen. BTW, any concerns to apply
the "Mark libs nodlopen" fix?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 26
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Hi,
>>
>> currently I have the following six patches in my assorted queue
>> (git://git.kiszka.org/xenomai.git queue/assorted). All have been posted
>> before, I just rebased them since then a few times. Should I repost
>> any/all of them (would be no problem), or are some already queued for
>> potential merge?
>>
>> Jan
>>
>> (...)
>> commit a631ab2c531d5e381ba8a0a59bf301a0276d9f99
>> Author: Jan Kiszka 
>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>
>> POSIX: Do not auto-shadow main with dlopen enabled
>> 
>> Don't perform auto-shadowing in POSIX skin if we might be loaded via
>> dlopen. Otherwise the wrong thread, the undefined dlopen caller, may be
>> (re-)shadowed, assigning wrong scheduling settings.
>> 
>> Signed-off-by: Jan Kiszka 
>>
>>  src/skins/posix/init.c |   43 ---
>>  1 files changed, 28 insertions(+), 15 deletions(-)
>>
>> commit 91ae3da822ca558804bf33be4d164ea4c2667c1b
>> Author: Jan Kiszka 
>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>
>> Replace --without-__tread with --enable-dlopen-skins
>> 
>> In practice, you only want to disable __thread support when Xenomai skin
>> libraries should be loadable via dlopen. Therefore rename the related
>> configure switch accordingly.
>> 
>> Signed-off-by: Jan Kiszka 
>>
> 
> Nack these two ones: one of the gcc bugs I have found on ARM is related
> to __thread (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38815). So,
> even though I have found a work around for the bug, I am not sure that
> it works all the time, so disabling __thread on ARM is safer. Especially
> since __thread does not bring any performance improvement over
> pthread_(get|set)specific on ARM.
> 
>> commit 028d4766a38b6937d9a2c02a20022e3ee5b67b55
>> Author: Jan Kiszka 
>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>
>> POSIX: Fix initialization of SCHED_RR threads
>> 
>> Passing SCHED_RR as policy to pthread_create has currently not the
>> desired effect. The kernel part expects that user space adjusts the
>> policy and prio via __pse51_thread_setschedparam after setting up the
>> shadow. And this is what the patch does by calling the wrapped
>> pthread_setschedparam instead of the real one.
>> 
>> Signed-off-by: Jan Kiszka 
>>
>>  src/skins/posix/thread.c |2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> commit 71666ce04ef216d281fe86ee82a5560c2b57c6dd
>> Author: Jan Kiszka 
>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>
>> Handle priority changes of SCHED_RR tasks
>> 
>> If shadowed Linux tasks with SCHED_RR policy change their priority,
>> do_setsched_event currenty ignores this. Extend the condition to catch
>> this case as well.
>> 
>> Signed-off-by: Jan Kiszka 
>>
>>  ksrc/nucleus/shadow.c |2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
> 
> Nack these two ones too. Philippe implemented a SCHED_RR working over
> aperiodic mode. I think the POSIX skin needs fixing, but not that way.

Then please suggest a better fix.

> We should have the thread run with SCHED_FIFO in secondary mode even if
> it runs under round-robin in primary mode.

As I explained, that can always be a weak policy. No one can prevent the
user from setting his Linux thread to SCHED_RR.

> And when we have done that,
> we do not need the second patch, since a shadow linux task will always
> run with SCHED_FIFO.
> 

Only by convention, not by feasible design (unless you want to change
the kernel in this respect).

Jan

-- 
Siemens AG, Corporate Technology, CT SE 26
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Jan Kiszka
Philippe Gerum wrote:
> Jan Kiszka wrote:
>> commit 728fc8970e2032b3280971788f1223f3ad82d80d
>> Author: Jan Kiszka 
>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>
>> xnpipe: Fix racy callback handlers
>> 
>> Invocation of input, output and alloc handler must take place under
>> nklock to properly synchronize with xnpipe_disconnect. Change all
>> callers to comply with this policy.
>>
> 
> That one is under investigation. I agree on the bug report (thanks btw), but I
> disagree on the fix. Basically, we can't run all hooks under nklock. For
> instance, the alloc_handler may issue kmalloc() calls when issued from the 
> Linux
> write endpoint.

You mean it /could/? Because no in-tree user (ie. native) calls
rt-unsafe services from its alloc_handler.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 26
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Gilles Chanteperdrix
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Hi,
>>
>> currently I have the following six patches in my assorted queue
>> (git://git.kiszka.org/xenomai.git queue/assorted). All have been posted
>> before, I just rebased them since then a few times. Should I repost
>> any/all of them (would be no problem), or are some already queued for
>> potential merge?
>>
>> Jan
>>
>> (...)
>> commit a631ab2c531d5e381ba8a0a59bf301a0276d9f99
>> Author: Jan Kiszka 
>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>
>> POSIX: Do not auto-shadow main with dlopen enabled
>> 
>> Don't perform auto-shadowing in POSIX skin if we might be loaded via
>> dlopen. Otherwise the wrong thread, the undefined dlopen caller, may be
>> (re-)shadowed, assigning wrong scheduling settings.
>> 
>> Signed-off-by: Jan Kiszka 
>>
>>  src/skins/posix/init.c |   43 ---
>>  1 files changed, 28 insertions(+), 15 deletions(-)
>>
>> commit 91ae3da822ca558804bf33be4d164ea4c2667c1b
>> Author: Jan Kiszka 
>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>
>> Replace --without-__tread with --enable-dlopen-skins
>> 
>> In practice, you only want to disable __thread support when Xenomai skin
>> libraries should be loadable via dlopen. Therefore rename the related
>> configure switch accordingly.
>> 
>> Signed-off-by: Jan Kiszka 
>>
> 
> Nack these two ones: one of the gcc bugs I have found on ARM is related
> to __thread (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38815). So,
> even though I have found a work around for the bug, I am not sure that
> it works all the time, so disabling __thread on ARM is safer. Especially
> since __thread does not bring any performance improvement over
> pthread_(get|set)specific on ARM.

Actually, I am not opposed to the first, only to the second.

-- 
 Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Philippe Gerum
Jan Kiszka wrote:
> Hi,
> 
> currently I have the following six patches in my assorted queue
> (git://git.kiszka.org/xenomai.git queue/assorted). All have been posted
> before, I just rebased them since then a few times. Should I repost
> any/all of them (would be no problem), or are some already queued for
> potential merge?
> 
> Jan
> 
> 
> 
> commit 728fc8970e2032b3280971788f1223f3ad82d80d
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
> 
> xnpipe: Fix racy callback handlers
> 
> Invocation of input, output and alloc handler must take place under
> nklock to properly synchronize with xnpipe_disconnect. Change all
> callers to comply with this policy.
>

That one is under investigation. I agree on the bug report (thanks btw), but I
disagree on the fix. Basically, we can't run all hooks under nklock. For
instance, the alloc_handler may issue kmalloc() calls when issued from the Linux
write endpoint.

However, I do agree that the current situation is a terrible mess, and that we
direly need a fix. In the same move, the semantics of some handlers (namely the
input one) should be sanitized. IOW, a global rework of that particular area is
required.

> Signed-off-by: Jan Kiszka 
> 
>  ksrc/nucleus/pipe.c |   96 ++
>  1 files changed, 42 insertions(+), 54 deletions(-)
> 
> commit a631ab2c531d5e381ba8a0a59bf301a0276d9f99
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
> 
> POSIX: Do not auto-shadow main with dlopen enabled
> 
> Don't perform auto-shadowing in POSIX skin if we might be loaded via
> dlopen. Otherwise the wrong thread, the undefined dlopen caller, may be
> (re-)shadowed, assigning wrong scheduling settings.
> 

No objection on this one. It's Gilles's call.

> Signed-off-by: Jan Kiszka 
> 
>  src/skins/posix/init.c |   43 ---
>  1 files changed, 28 insertions(+), 15 deletions(-)
> 
> commit 91ae3da822ca558804bf33be4d164ea4c2667c1b
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
> 
> Replace --without-__tread with --enable-dlopen-skins
> 
> In practice, you only want to disable __thread support when Xenomai skin
> libraries should be loadable via dlopen. Therefore rename the related
> configure switch accordingly.
> 
> Signed-off-by: Jan Kiszka 
> 
>  configure.in |   19 +--
>  1 files changed, 13 insertions(+), 6 deletions(-)
> 
> commit 2af3cfcbcb21591089ed33da9e6efb0b5f78a71b
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
> 
> Mark libs nodlopen on initial-exec TLS
> 
> Mark libs with nodlopen if initial-exec __thread variables are used
> because dlopen and this TLS model are in conflict.
> 
> Signed-off-by: Jan Kiszka 
> 
>  configure.in  |3 +++
>  src/skins/native/Makefile.am  |2 +-
>  src/skins/posix/Makefile.am   |2 +-
>  src/skins/psos+/Makefile.am   |2 +-
>  src/skins/rtai/Makefile.am|2 +-
>  src/skins/rtdm/Makefile.am|2 +-
>  src/skins/uitron/Makefile.am  |2 +-
>  src/skins/vrtx/Makefile.am|2 +-
>  src/skins/vxworks/Makefile.am |2 +-
>  9 files changed, 11 insertions(+), 8 deletions(-)
> 
> commit 028d4766a38b6937d9a2c02a20022e3ee5b67b55
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
> 
> POSIX: Fix initialization of SCHED_RR threads
> 
> Passing SCHED_RR as policy to pthread_create has currently not the
> desired effect. The kernel part expects that user space adjusts the
> policy and prio via __pse51_thread_setschedparam after setting up the
> shadow. And this is what the patch does by calling the wrapped
> pthread_setschedparam instead of the real one.
> 
> Signed-off-by: Jan Kiszka 
> 
>  src/skins/posix/thread.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> commit 71666ce04ef216d281fe86ee82a5560c2b57c6dd
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
> 
> Handle priority changes of SCHED_RR tasks
> 
> If shadowed Linux tasks with SCHED_RR policy change their priority,
> do_setsched_event currenty ignores this. Extend the condition to catch
> this case as well.
> 
> Signed-off-by: Jan Kiszka 
> 
>  ksrc/nucleus/shadow.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 


-- 
Philippe.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Pending patches

2009-01-15 Thread Gilles Chanteperdrix
Jan Kiszka wrote:
> Hi,
> 
> currently I have the following six patches in my assorted queue
> (git://git.kiszka.org/xenomai.git queue/assorted). All have been posted
> before, I just rebased them since then a few times. Should I repost
> any/all of them (would be no problem), or are some already queued for
> potential merge?
> 
> Jan
> 
> (...)
> commit a631ab2c531d5e381ba8a0a59bf301a0276d9f99
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
> 
> POSIX: Do not auto-shadow main with dlopen enabled
> 
> Don't perform auto-shadowing in POSIX skin if we might be loaded via
> dlopen. Otherwise the wrong thread, the undefined dlopen caller, may be
> (re-)shadowed, assigning wrong scheduling settings.
> 
> Signed-off-by: Jan Kiszka 
> 
>  src/skins/posix/init.c |   43 ---
>  1 files changed, 28 insertions(+), 15 deletions(-)
> 
> commit 91ae3da822ca558804bf33be4d164ea4c2667c1b
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
> 
> Replace --without-__tread with --enable-dlopen-skins
> 
> In practice, you only want to disable __thread support when Xenomai skin
> libraries should be loadable via dlopen. Therefore rename the related
> configure switch accordingly.
> 
> Signed-off-by: Jan Kiszka 
> 

Nack these two ones: one of the gcc bugs I have found on ARM is related
to __thread (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38815). So,
even though I have found a work around for the bug, I am not sure that
it works all the time, so disabling __thread on ARM is safer. Especially
since __thread does not bring any performance improvement over
pthread_(get|set)specific on ARM.

> commit 028d4766a38b6937d9a2c02a20022e3ee5b67b55
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
> 
> POSIX: Fix initialization of SCHED_RR threads
> 
> Passing SCHED_RR as policy to pthread_create has currently not the
> desired effect. The kernel part expects that user space adjusts the
> policy and prio via __pse51_thread_setschedparam after setting up the
> shadow. And this is what the patch does by calling the wrapped
> pthread_setschedparam instead of the real one.
> 
> Signed-off-by: Jan Kiszka 
> 
>  src/skins/posix/thread.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> commit 71666ce04ef216d281fe86ee82a5560c2b57c6dd
> Author: Jan Kiszka 
> Date:   Thu Jan 15 11:10:24 2009 +0100
> 
> Handle priority changes of SCHED_RR tasks
> 
> If shadowed Linux tasks with SCHED_RR policy change their priority,
> do_setsched_event currenty ignores this. Extend the condition to catch
> this case as well.
> 
> Signed-off-by: Jan Kiszka 
> 
>  ksrc/nucleus/shadow.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 

Nack these two ones too. Philippe implemented a SCHED_RR working over
aperiodic mode. I think the POSIX skin needs fixing, but not that way.
We should have the thread run with SCHED_FIFO in secondary mode even if
it runs under round-robin in primary mode. And when we have done that,
we do not need the second patch, since a shadow linux task will always
run with SCHED_FIFO.

-- 
 Gilles.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core