Re: [Xenomai-core] [PULL] rt-puts fix, mprotect testsuite

2012-04-04 Thread Jan Kiszka
On 2012-04-04 15:23, Gilles Chanteperdrix wrote:
> On 04/04/2012 03:12 PM, Jan Kiszka wrote:
>> On 2012-04-04 15:02, Gilles Chanteperdrix wrote:
>>> On 04/04/2012 02:56 PM, Jan Kiszka wrote:
>>>> The following changes since commit 0ef2410a2c9cf7102dead861241bd2d9957e4433:
>>>>
>>>>   Mask signals in rt_print:printer_loop() (2012-04-02 00:16:41 +0200)
>>>>
>>>> are available in the git repository at:
>>>>   git://git.xenomai.org/xenomai-jki.git for-upstream
>>>>
>>>> Jan Kiszka (3):
>>>>   Append missing newline to rt_[f]puts output
>>>>   testsuite: Add rt-print buffer flushes to native error paths
>>>
>>> As I said, I do not agree with calling rt_print_flush_buffers outside of
>>> xenomai libs.
>>
>> rt_print_flush_buffers is a Xenomai API function that we have exported
>> for quite a while now. The rt_* functions are about explicit control
>> over what is invoked and when, both in the native skin and in what used
>> to be rtdk. Also, you can't avoid this function when you interact with
>> libraries that are unwrapped.
>>
>> That said, I can fix that minor issue in leaks differently if you insist.
> 
> I added rt_print_flush_buffers for xenomai 2.6.0 in order to implement
> systematic wrapping of printf by the posix skin; my intent was not to
> really export it. From my point of view, having to call this flush
> function all over the place reveals a problem in the application. If you
> always use printf or always rt_printf, you do not need to call this
> function.

Still, there are those use cases where you cannot replace the print
functions. So it remains useful.
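
For illustration, here is the error-path pattern the testsuite patches
use - a minimal sketch assuming the native skin and rtdk headers, with
fail() as a hypothetical wrapper:

#include <native/task.h>
#include <rtdk.h>
#include <stdio.h>
#include <stdlib.h>

/* Mixing rt_printf() with unwrapped libc I/O: pending rt_printf()
 * output must be pushed out before writing to stderr, or the messages
 * can arrive out of order or get lost on exit. */
static void fail(const char *msg, int err)
{
	rt_task_set_mode(T_WARNSW, 0, NULL);	/* no SIGDEBUG on the relax */
	rt_print_flush_buffers();		/* drain pending rt_printf() output */
	fprintf(stderr, "FAILURE: %s (error %d)\n", msg, err);
	exit(EXIT_FAILURE);
}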

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [PULL] rt-puts fix, mprotect testsuite

2012-04-04 Thread Jan Kiszka
On 2012-04-04 15:02, Gilles Chanteperdrix wrote:
> On 04/04/2012 02:56 PM, Jan Kiszka wrote:
>> The following changes since commit 0ef2410a2c9cf7102dead861241bd2d9957e4433:
>>
>>   Mask signals in rt_print:printer_loop() (2012-04-02 00:16:41 +0200)
>>
>> are available in the git repository at:
>>   git://git.xenomai.org/xenomai-jki.git for-upstream
>>
>> Jan Kiszka (3):
>>   Append missing newline to rt_[f]puts output
>>   testsuite: Add rt-print buffer flushes to native error paths
> 
> As I said, I do not agree with calling rt_print_flush_buffers outside of
> xenomai libs.

rt_print_flush_buffers is a Xenomai API function that we have exported
for quite a while now. The rt_* functions are about explicit control
over what is invoked and when, both in the native skin and in what used
to be rtdk. Also, you can't avoid this function when you interact with
libraries that are unwrapped.

That said, I can fix that minor issue in leaks differently if you insist.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Xenomai-core] [PULL] rt-puts fix, mprotect testsuite

2012-04-04 Thread Jan Kiszka
The following changes since commit 0ef2410a2c9cf7102dead861241bd2d9957e4433:

  Mask signals in rt_print:printer_loop() (2012-04-02 00:16:41 +0200)

are available in the git repository at:
  git://git.xenomai.org/xenomai-jki.git for-upstream

Jan Kiszka (3):
  Append missing newline to rt_[f]puts output
  testsuite: Add rt-print buffer flushes to native error paths
  Add regression test for mprotect on pinned memory

 src/skins/common/rt_print.c|7 ++-
 src/testsuite/regression/native/check.h|2 +
 src/testsuite/regression/native/leaks.c|1 +
 src/testsuite/regression/posix/Makefile.am |2 +-
 src/testsuite/regression/posix/Makefile.in |   15 -
 src/testsuite/regression/posix/check.h |   10 +++
 src/testsuite/regression/posix/mprotect.c  |   97 
 7 files changed, 129 insertions(+), 5 deletions(-)
 create mode 100644 src/testsuite/regression/posix/mprotect.c



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : Add regression test for mprotect on pinned memory

2012-04-03 Thread Jan Kiszka
On 2012-04-02 16:35, Gilles Chanteperdrix wrote:
> On 04/02/2012 04:09 PM, GIT version control wrote:
>> Module: xenomai-jki
>> Branch: for-upstream
>> Commit: 410e90d085d21dc913f8724efafe6ae75bd3c952
>> URL:
>> http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=410e90d085d21dc913f8724efafe6ae75bd3c952
>>
>> Author: Jan Kiszka 
>> Date:   Fri Mar 30 18:06:27 2012 +0200
>>
>> Add regression test for mprotect on pinned memory
>>
>> This tests both the original issue of mprotect reintroducing COW pages
>> to Xenomai processes as well as the recently fixed zero page corruption.
>>
>> Signed-off-by: Jan Kiszka 
>> +static void check_inner(const char *fn, int line, const char *msg,
>> +int status, int expected)
>> +{
>> +if (status == expected)
>> +return;
>> +
>> +rt_task_set_mode(T_WARNSW, 0, NULL);
>> +rt_print_flush_buffers();
>> (...)
>> +static void check_value_inner(const char *fn, int line, const char *msg,
>> +  int value, int expected)
>> +{
>> +if (value == expected)
>> +return;
>> +
>> +rt_task_set_mode(T_WARNSW, 0, NULL);
>> +rt_print_flush_buffers();
>> (...)
>> +void sigdebug_handler(int sig, siginfo_t *si, void *context)
>> +{
>> +unsigned int reason = si->si_value.sival_int;
>> +
>> +rt_print_flush_buffers();
>> (...)
>> +
>> +rt_task_set_mode(T_WARNSW, 0, NULL);
>> +rt_print_flush_buffers();
> 
> Maybe you could use posix skin's printf instead of putting calls to
> rt_print_flush_buffers all over the place? I did not mean for this call
> to be exported, I only added it for internal use by the posix skin.
> 

Could be done, likely together with a complete switch to posix.

I could also start to use the check_* wrappers that I just discovered.
BTW, the native version lacks that flush unless it's used in
native+posix context. I will write a fix.
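
For comparison, a sketch of what the suggestion amounts to, assuming
printf() is wrapped through the posix skin (posix.wrappers): the
explicit flush calls simply disappear.

#include <stdio.h>

/* With the posix skin's systematic printf wrapping, this plain call
 * already goes through the rt-safe path; the buffered output is
 * drained by the rt_print printer thread, so no explicit
 * rt_print_flush_buffers() is needed here. */
void report_progress(int step)
{
	printf("step %d done\n", step);
}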

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Xenomai-core] [PATCH 2.6] Add regression test for mprotect on pinned memory

2012-04-02 Thread Jan Kiszka
This tests both the original issue of mprotect reintroducing COW pages
to Xenomai processes as well as the recently fixed zero page corruption.

Signed-off-by: Jan Kiszka 
---
 src/testsuite/regression/native/Makefile.am |2 +-
 src/testsuite/regression/native/Makefile.in |   16 +++-
 src/testsuite/regression/native/mprotect.c  |  128 +++
 3 files changed, 142 insertions(+), 4 deletions(-)
 create mode 100644 src/testsuite/regression/native/mprotect.c

diff --git a/src/testsuite/regression/native/Makefile.am b/src/testsuite/regression/native/Makefile.am
index ce53834..afd2f43 100644
--- a/src/testsuite/regression/native/Makefile.am
+++ b/src/testsuite/regression/native/Makefile.am
@@ -2,7 +2,7 @@ tstdir = @XENO_TEST_DIR@/regression/native
 
 noinst_HEADERS = check.h
 
-tst_PROGRAMS = leaks tsc heap sigdebug
+tst_PROGRAMS = leaks tsc heap sigdebug mprotect
 
 CPPFLAGS = $(XENO_USER_CFLAGS) \
-I$(top_srcdir)/include
diff --git a/src/testsuite/regression/native/Makefile.in b/src/testsuite/regression/native/Makefile.in
index 3d5a0ce..b2ae93a 100644
--- a/src/testsuite/regression/native/Makefile.in
+++ b/src/testsuite/regression/native/Makefile.in
@@ -37,7 +37,7 @@ build_triplet = @build@
 host_triplet = @host@
 target_triplet = @target@
 tst_PROGRAMS = leaks$(EXEEXT) tsc$(EXEEXT) heap$(EXEEXT) \
-   sigdebug$(EXEEXT)
+   sigdebug$(EXEEXT) mprotect$(EXEEXT)
 subdir = src/testsuite/regression/native
 DIST_COMMON = $(noinst_HEADERS) $(srcdir)/Makefile.am \
$(srcdir)/Makefile.in
@@ -70,6 +70,12 @@ leaks_LDADD = $(LDADD)
 leaks_DEPENDENCIES = ../../../skins/native/libnative.la \
../../../skins/rtdm/librtdm.la \
../../../skins/common/libxenomai.la
+mprotect_SOURCES = mprotect.c
+mprotect_OBJECTS = mprotect.$(OBJEXT)
+mprotect_LDADD = $(LDADD)
+mprotect_DEPENDENCIES = ../../../skins/native/libnative.la \
+   ../../../skins/rtdm/librtdm.la \
+   ../../../skins/common/libxenomai.la
 sigdebug_SOURCES = sigdebug.c
 sigdebug_OBJECTS = sigdebug.$(OBJEXT)
 sigdebug_LDADD = $(LDADD)
@@ -95,8 +101,8 @@ CCLD = $(CC)
 LINK = $(LIBTOOL) --tag=CC $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
--mode=link $(CCLD) $(AM_CFLAGS) $(CFLAGS) $(AM_LDFLAGS) \
$(LDFLAGS) -o $@
-SOURCES = heap.c leaks.c sigdebug.c tsc.c
-DIST_SOURCES = heap.c leaks.c sigdebug.c tsc.c
+SOURCES = heap.c leaks.c mprotect.c sigdebug.c tsc.c
+DIST_SOURCES = heap.c leaks.c mprotect.c sigdebug.c tsc.c
 HEADERS = $(noinst_HEADERS)
 ETAGS = etags
 CTAGS = ctags
@@ -348,6 +354,9 @@ heap$(EXEEXT): $(heap_OBJECTS) $(heap_DEPENDENCIES)
 leaks$(EXEEXT): $(leaks_OBJECTS) $(leaks_DEPENDENCIES) 
@rm -f leaks$(EXEEXT)
$(LINK) $(leaks_OBJECTS) $(leaks_LDADD) $(LIBS)
+mprotect$(EXEEXT): $(mprotect_OBJECTS) $(mprotect_DEPENDENCIES) 
+   @rm -f mprotect$(EXEEXT)
+   $(LINK) $(mprotect_OBJECTS) $(mprotect_LDADD) $(LIBS)
 sigdebug$(EXEEXT): $(sigdebug_OBJECTS) $(sigdebug_DEPENDENCIES) 
@rm -f sigdebug$(EXEEXT)
$(LINK) $(sigdebug_OBJECTS) $(sigdebug_LDADD) $(LIBS)
@@ -363,6 +372,7 @@ distclean-compile:
 
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/heap.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/leaks.Po@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/mprotect.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sigdebug.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tsc.Po@am__quote@
 
diff --git a/src/testsuite/regression/native/mprotect.c b/src/testsuite/regression/native/mprotect.c
new file mode 100644
index 000..7d66e34
--- /dev/null
+++ b/src/testsuite/regression/native/mprotect.c
@@ -0,0 +1,128 @@
+/*
+ * Test if implicit pinning of memory via mprotect works.
+ *
+ * Copyright (C) Jan Kiszka  
+ *
+ * Released under the terms of GPLv2.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MEMSIZE 0x1
+
+static void check_inner(const char *fn, int line, const char *msg,
+   int status, int expected)
+{
+   if (status == expected)
+   return;
+
+   rt_task_set_mode(T_WARNSW, 0, NULL);
+   rt_print_flush_buffers();
+   fprintf(stderr, "FAILURE %s:%d: %s returned %d instead of %d - %s\n",
+   fn, line, msg, status, expected, strerror(-status));
+   exit(EXIT_FAILURE);
+}
+
+#define check_no_error(msg, status) ({ \
+   int __status = status;  \
+   check_inner(__FUNCTION__, __LINE__, msg,\
+   __status < 0 ? __status : 0, 0);\
+   __status;   \
+})
+
+static void check_value_inner(const char *fn, int line, const char *msg,
+ int value, int expected)
+{
+   if (value == expected)
+   return;
+
+   rt_task_set_mode(T_WARNSW, 0, NULL);

Re: [Xenomai-core] Using xenomai with standard ethernet driver

2012-03-23 Thread Jan Kiszka
On 2012-03-23 12:18, Roberto Bielli wrote:
> Hi Jan,
> 
> I want to send and receive custom raw packets over Ethernet.
> Is it possible with RTnet, or does it send only RTnet packets?
> 
> Thanks for your response.

RTnet is modular, and if you use the stack as sketched, you are free to
define the payload of the sent Ethernet frames on your own.
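
To make that concrete, a sketch under stated assumptions: the rtpacket
module is loaded and the POSIX socket calls below are wrapped by the
skin, so they stay on the RTnet path; ETH_P_CUSTOM and send_raw_frame()
are made up for illustration.

#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>		/* htons() */
#include <netpacket/packet.h>	/* struct sockaddr_ll */
#include <net/ethernet.h>	/* struct ether_header */
#include <linux/if_ether.h>	/* ETH_ALEN, ETH_FRAME_LEN */

#define ETH_P_CUSTOM	0x88b5	/* example EtherType from the local range */

int send_raw_frame(int ifindex, const unsigned char dst[ETH_ALEN],
		   const unsigned char src[ETH_ALEN],
		   const void *payload, size_t len)
{
	unsigned char frame[ETH_FRAME_LEN];
	struct ether_header *eh = (struct ether_header *)frame;
	struct sockaddr_ll sll;
	int fd, ret;

	if (len > ETH_FRAME_LEN - sizeof(*eh))
		return -1;

	fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_CUSTOM));
	if (fd < 0)
		return fd;

	/* Ethernet header with a protocol ID of your own choosing */
	memcpy(eh->ether_dhost, dst, ETH_ALEN);
	memcpy(eh->ether_shost, src, ETH_ALEN);
	eh->ether_type = htons(ETH_P_CUSTOM);
	memcpy(frame + sizeof(*eh), payload, len);

	memset(&sll, 0, sizeof(sll));
	sll.sll_family = AF_PACKET;
	sll.sll_ifindex = ifindex;	/* e.g. the rteth0 interface index */
	sll.sll_halen = ETH_ALEN;
	memcpy(sll.sll_addr, dst, ETH_ALEN);

	ret = sendto(fd, frame, sizeof(*eh) + len, 0,
		     (struct sockaddr *)&sll, sizeof(sll));
	close(fd);
	return ret;
}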

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Using xenomai with standard ethernet driver

2012-03-23 Thread Jan Kiszka
On 2012-03-22 17:37, Roberto Bielli wrote:
> Let me explain the problem better.
> 
> I want to use the Ethernet driver raw, without all the abstraction
> layers (TCP, sockets, etc.), and my Xenomai application must link to
> the raw driver without entering secondary mode.
> 
> Do I have to change the driver interface to Xenomai calls (rtdm_...) or
> is there another option?

Did you check if RTnet already has support for your NIC? Then you are
fine with a stack of rtnet.ko, rtpacket.ko, and your NIC driver's .ko.

If not, there are two options:
 - add an RTnet driver for your hardware
 - implement some home-brewed stack

The second option may only look different at first sight. But,
specifically if your application is not kernel-hosted but a proper
userspace app, RTnet provides quite a few useful building blocks for
your scenario.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Using xenomai with standard ethernet driver

2012-03-22 Thread Jan Kiszka
On 2012-03-21 17:48, Roberto Bielli wrote:
> Hi,
> 
> I want to use Xenomai for sending standard TCP/IP packets.
> Is it possible to rewrite the Ethernet driver for Xenomai without using
> RTnet?
> Or, if I use RTnet, is there a way to send standard TCP packets instead
> of RTnet packets?

Is the TCP/IP communication itself part of your time-critical path, or
could it run with lower priority as well?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Built-in libxenomai dependency

2012-02-07 Thread Jan Kiszka
On 2012-02-07 17:28, Gilles Chanteperdrix wrote:
> This causes a -L/usr/lib to be added on the link-edit command line,
> which makes the link fail by finding /usr/lib/libpthread.so instead
> of the cross-compiler's one.

What does libxenomai.la look like? Is it different from a native build?

Hmm, there must be some valid way to express library dependencies inside
your package.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Built-in libxenomai dependency

2012-02-07 Thread Jan Kiszka
On 2012-02-07 17:19, Gilles Chanteperdrix wrote:
> On 02/01/2012 09:21 PM, Gilles Chanteperdrix wrote:
>> On 02/01/2012 04:50 PM, Jan Kiszka wrote:
>>> On 2012-02-01 16:38, Gilles Chanteperdrix wrote:
>>>> On 02/01/2012 04:25 PM, Jan Kiszka wrote:
>>>>> On 2012-02-01 16:17, Gilles Chanteperdrix wrote:
>>>>>> On 02/01/2012 03:37 PM, Jan Kiszka wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> don't remember anymore: Is there any subtle reason that prevents a
>>>>>>> change like
>>>>>>>
>>>>>>> diff --git a/src/skins/native/Makefile.am b/src/skins/native/Makefile.am
>>>>>>> index 39eaaed..4cc8859 100644
>>>>>>> --- a/src/skins/native/Makefile.am
>>>>>>> +++ b/src/skins/native/Makefile.am
>>>>>>> @@ -22,6 +22,9 @@ libnative_la_SOURCES = \
>>>>>>> wrappers.c \
>>>>>>> wrappers.h
>>>>>>>  
>>>>>>> +libnative_la_LIBADD = \
>>>>>>> +   ../common/libxenomai.la
>>>>>>> +
>>>>>>>  libnative_la_CPPFLAGS = \
>>>>>>> @XENO_USER_CFLAGS@ \
>>>>>>> -I$(top_srcdir)/include
>>>>>>> diff --git a/src/skins/rtdm/Makefile.am b/src/skins/rtdm/Makefile.am
>>>>>>> index 8ad74be..2dc0a90 100644
>>>>>>> --- a/src/skins/rtdm/Makefile.am
>>>>>>> +++ b/src/skins/rtdm/Makefile.am
>>>>>>> @@ -8,6 +8,9 @@ librtdm_la_SOURCES = \
>>>>>>> core.c \
>>>>>>> init.c
>>>>>>>  
>>>>>>> +librtdm_la_LIBADD = \
>>>>>>> +   ../common/libxenomai.la
>>>>>>> +
>>>>>>>  librtdm_la_CPPFLAGS = \
>>>>>>> @XENO_USER_CFLAGS@ \
>>>>>>> -I$(top_srcdir)/include
>>>>>>> diff --git a/src/testsuite/latency/Makefile.am b/src/testsuite/latency/Makefile.am
>>>>>>> index c772c26..6534df5 100644
>>>>>>> --- a/src/testsuite/latency/Makefile.am
>>>>>>> +++ b/src/testsuite/latency/Makefile.am
>>>>>>> @@ -14,5 +14,4 @@ latency_LDFLAGS = @XENO_USER_LDFLAGS@
>>>>>>>  latency_LDADD = \
>>>>>>> ../../skins/native/libnative.la \
>>>>>>> ../../skins/rtdm/librtdm.la \
>>>>>>> -   ../../skins/common/libxenomai.la \
>>>>>>> -lpthread -lm
>>>>>>>
>>>>>>> i.e. that we let the skin libraries depend on libxenomai and then remove
>>>>>>> the explicit dependency from our binaries and the xeno-config output?
>>>>>>> Is there some ordering issue again (we have -l before -lxenomai
>>>>>>> in the ldflags).
>>>>>>>
>>>>>>> If possible, this would allow for things like dlopen("libnative.so").
>>>>>>
>>>>>> It allows xeno-config result to work both with dynamic and static
>>>>>> libraries. Static libraries have no dependency system, so, when linking
>>>>>> a program whith libnative.a for instance, without libtool, you still
>>>>>> have to link it with libxenomai.a.
>>>>>
>>>>> OK, part two could stay, but the dependencies should still be added to
>>>>> the skin libs - if possible.
>>>>>
>>>>>>
>>>>>> How come you cannot dlopen libnative.so? Does dlopening
>>>>>> libxenomai.so beforehand not work?
>>>>>
>>>>> Dependencies of libnative on libxenomai are not resolved when you open
>>>>> the former even if the latter is already loaded. Maybe you can do this
>>>>> by pulling in all required symbols one by one manually, haven't tried
>>>>> yet. But that would at least be unhandy.
>>>>
>>>> What about RTLD_GLOBAL?
>>>>
>>>
>>> Was possibly the reason, need to check back.
>>>
>>> Still, what prevents stating the existing dependency of lib on
>>> libxenomai? The dance above would then be obsolete.
>>
>> The change is merged. I took the chance to check that the static build
>> still built.
>>
> 
> Unfortunately this change also causes a build failure when
> cross-compiling for ARM, so, change reverted, too.
> 

Indeed unfortunate. Any pointers to logs?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [PATCH] Only export required CFLAGS via xeno-config

2012-02-07 Thread Jan Kiszka
On 2012-02-07 17:18, Gilles Chanteperdrix wrote:
> On 02/03/2012 03:50 PM, Jan Kiszka wrote:
>> -Werror-implicit-function-declaration is not compatible with C++, and
>> also decisions about -Wall and -pipe should be left to the application.
>>
>> Signed-off-by: Jan Kiszka 
> 
> Had to revert this patch, it causes a build failure when cross-compiling
> (for ARM, I do not know if it matters).

-pipe? Or a weird compiler bug? -Wsomething can't make a difference.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Xenomai-core] [PATCH forge] Fix build for relative invocations of configure

2012-02-07 Thread Jan Kiszka
This fixes build setups like '../configure'.

Signed-off-by: Jan Kiszka 
---
 configure.in |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/configure.in b/configure.in
index c0a7d17..0bdced8 100644
--- a/configure.in
+++ b/configure.in
@@ -547,7 +547,7 @@ LD_FILE_OPTION=$ac_cv_ld_file_option
 AC_SUBST(LD_FILE_OPTION)
 
 if test x$rtcore_type = xcobalt; then
-   XENO_USER_CFLAGS="-I$srcdir/include/cobalt $XENO_USER_CFLAGS"
+   XENO_USER_CFLAGS="-I`cd $srcdir && pwd`/include/cobalt $XENO_USER_CFLAGS"
if [[ $ac_cv_ld_file_option = yes ]]; then
XENO_POSIX_WRAPPERS="-Wl,@`cd $srcdir && pwd`/lib/cobalt/posix.wrappers"
else
-- 
1.7.3.4



[Xenomai-core] [PATCH] Only export required CFLAGS via xeno-config

2012-02-03 Thread Jan Kiszka
-Werror-implicit-function-declaration is not compatible with C++, and
also decisions about -Wall and -pipe should be left to the application.

Signed-off-by: Jan Kiszka 
---
 configure.in |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/configure.in b/configure.in
index b2563ee..2185925 100644
--- a/configure.in
+++ b/configure.in
@@ -448,13 +448,13 @@ else
 fi
 AC_MSG_RESULT([done])
 
-dnl Common CFLAGS and LDFLAGS
-XENO_USER_CFLAGS="-D_GNU_SOURCE -D_REENTRANT -Wall -Werror-implicit-function-declaration -pipe"
-XENO_USER_LDFLAGS=
+dnl Exported CFLAGS and LDFLAGS, shared with internal flags
+XENO_USER_APP_CFLAGS="-D_GNU_SOURCE -D_REENTRANT"
+XENO_USER_APP_LDFLAGS=
 
-dnl Exported CFLAGS and LDFLAGS, may be enhanced per-arch below
-XENO_USER_APP_CFLAGS=$XENO_USER_CFLAGS
-XENO_USER_APP_LDFLAGS=$XENO_USER_LDFLAGS
+dnl Internal CFLAGS and LDFLAGS, may be enhanced per-arch below
+XENO_USER_CFLAGS="$XENO_USER_CFLAGS -Wall -Werror-implicit-function-declaration -pipe"
+XENO_USER_LDFLAGS="$XENO_USER_APP_LDFLAGS"
 
 case $XENO_TARGET_ARCH in
  x86)
-- 
1.7.3.4



Re: [Xenomai-core] Built-in libxenomai dependency

2012-02-01 Thread Jan Kiszka
On 2012-02-01 16:38, Gilles Chanteperdrix wrote:
> On 02/01/2012 04:25 PM, Jan Kiszka wrote:
>> On 2012-02-01 16:17, Gilles Chanteperdrix wrote:
>>> On 02/01/2012 03:37 PM, Jan Kiszka wrote:
>>>> Hi,
>>>>
>>>> don't remember anymore: Is there any subtle reason that prevents a
>>>> change like
>>>>
>>>> diff --git a/src/skins/native/Makefile.am b/src/skins/native/Makefile.am
>>>> index 39eaaed..4cc8859 100644
>>>> --- a/src/skins/native/Makefile.am
>>>> +++ b/src/skins/native/Makefile.am
>>>> @@ -22,6 +22,9 @@ libnative_la_SOURCES = \
>>>>wrappers.c \
>>>>wrappers.h
>>>>  
>>>> +libnative_la_LIBADD = \
>>>> +  ../common/libxenomai.la
>>>> +
>>>>  libnative_la_CPPFLAGS = \
>>>>@XENO_USER_CFLAGS@ \
>>>>-I$(top_srcdir)/include
>>>> diff --git a/src/skins/rtdm/Makefile.am b/src/skins/rtdm/Makefile.am
>>>> index 8ad74be..2dc0a90 100644
>>>> --- a/src/skins/rtdm/Makefile.am
>>>> +++ b/src/skins/rtdm/Makefile.am
>>>> @@ -8,6 +8,9 @@ librtdm_la_SOURCES = \
>>>>core.c \
>>>>init.c
>>>>  
>>>> +librtdm_la_LIBADD = \
>>>> +  ../common/libxenomai.la
>>>> +
>>>>  librtdm_la_CPPFLAGS = \
>>>>@XENO_USER_CFLAGS@ \
>>>>-I$(top_srcdir)/include
>>>> diff --git a/src/testsuite/latency/Makefile.am b/src/testsuite/latency/Makefile.am
>>>> index c772c26..6534df5 100644
>>>> --- a/src/testsuite/latency/Makefile.am
>>>> +++ b/src/testsuite/latency/Makefile.am
>>>> @@ -14,5 +14,4 @@ latency_LDFLAGS = @XENO_USER_LDFLAGS@
>>>>  latency_LDADD = \
>>>>../../skins/native/libnative.la \
>>>>../../skins/rtdm/librtdm.la \
>>>> -  ../../skins/common/libxenomai.la \
>>>>-lpthread -lm
>>>>
>>>> i.e. that we let the skin libraries depend on libxenomai and then remove
>>>> the explicit dependency from our binaries and the xeno-config output?
>>>> Is there some ordering issue again (we have -l before -lxenomai
>>>> in the ldflags).
>>>>
>>>> If possible, this would allow for things like dlopen("libnative.so").
>>>
>>> It allows xeno-config result to work both with dynamic and static
>>> libraries. Static libraries have no dependency system, so, when linking
>>> a program whith libnative.a for instance, without libtool, you still
>>> have to link it with libxenomai.a.
>>
>> OK, part two could stay, but the dependencies should still be added to
>> the skin libs - if possible.
>>
>>>
>>> How come you cannot dlopen libnative.so? Does dlopening
>>> libxenomai.so beforehand not work?
>>
>> Dependencies of libnative on libxenomai are not resolved when you open
>> the former even if the latter is already loaded. Maybe you can do this
>> by pulling in all required symbols one by one manually, haven't tried
>> yet. But that would at least be unhandy.
> 
> What about RTLD_GLOBAL?
> 

Was possibly the reason, need to check back.

Still, what prevents stating the existing dependency of lib on
libxenomai? The dance above would then be obsolete.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Built-in libxenomai dependency

2012-02-01 Thread Jan Kiszka
On 2012-02-01 16:17, Gilles Chanteperdrix wrote:
> On 02/01/2012 03:37 PM, Jan Kiszka wrote:
>> Hi,
>>
>> don't remember anymore: Is there any subtle reason that prevents a
>> change like
>>
>> diff --git a/src/skins/native/Makefile.am b/src/skins/native/Makefile.am
>> index 39eaaed..4cc8859 100644
>> --- a/src/skins/native/Makefile.am
>> +++ b/src/skins/native/Makefile.am
>> @@ -22,6 +22,9 @@ libnative_la_SOURCES = \
>>  wrappers.c \
>>  wrappers.h
>>  
>> +libnative_la_LIBADD = \
>> +../common/libxenomai.la
>> +
>>  libnative_la_CPPFLAGS = \
>>  @XENO_USER_CFLAGS@ \
>>  -I$(top_srcdir)/include
>> diff --git a/src/skins/rtdm/Makefile.am b/src/skins/rtdm/Makefile.am
>> index 8ad74be..2dc0a90 100644
>> --- a/src/skins/rtdm/Makefile.am
>> +++ b/src/skins/rtdm/Makefile.am
>> @@ -8,6 +8,9 @@ librtdm_la_SOURCES = \
>>  core.c \
>>  init.c
>>  
>> +librtdm_la_LIBADD = \
>> +../common/libxenomai.la
>> +
>>  librtdm_la_CPPFLAGS = \
>>  @XENO_USER_CFLAGS@ \
>>  -I$(top_srcdir)/include
>> diff --git a/src/testsuite/latency/Makefile.am b/src/testsuite/latency/Makefile.am
>> index c772c26..6534df5 100644
>> --- a/src/testsuite/latency/Makefile.am
>> +++ b/src/testsuite/latency/Makefile.am
>> @@ -14,5 +14,4 @@ latency_LDFLAGS = @XENO_USER_LDFLAGS@
>>  latency_LDADD = \
>>  ../../skins/native/libnative.la \
>>  ../../skins/rtdm/librtdm.la \
>> -../../skins/common/libxenomai.la \
>>  -lpthread -lm
>>
>> i.e. that we let the skin libraries depend on libxenomai and then remove
>> the explicit dependency from our binaries and the xeno-config output?
>> Is there some ordering issue again (we have -l before -lxenomai
>> in the ldflags).
>>
>> If possible, this would allow for things like dlopen("libnative.so").
> 
> It allows xeno-config result to work both with dynamic and static
> libraries. Static libraries have no dependency system, so, when linking
> a program whith libnative.a for instance, without libtool, you still
> have to link it with libxenomai.a.

OK, part two could stay, but the dependencies should still be added to
the skin libs - if possible.

> 
> How come you cannot dlopen libnative.so? Does dlopening
> libxenomai.so beforehand not work?

Dependencies of libnative on libxenomai are not resolved when you open
the former even if the latter is already loaded. Maybe you can do this
by pulling in all required symbols one by one manually, haven't tried
yet. But that would at least be unhandy.
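
For the record, a minimal sketch of the RTLD_GLOBAL idea discussed here
(plain dlopen semantics, untested against the Xenomai libs):

#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
	void *base, *skin;

	/* RTLD_GLOBAL publishes libxenomai's symbols so they can resolve
	 * the undefined references of libraries opened later. */
	base = dlopen("libxenomai.so", RTLD_NOW | RTLD_GLOBAL);
	if (!base) {
		fprintf(stderr, "libxenomai: %s\n", dlerror());
		return 1;
	}
	/* Without RTLD_GLOBAL above, this dlopen fails with unresolved
	 * symbols even though libxenomai is already mapped. */
	skin = dlopen("libnative.so", RTLD_NOW);
	if (!skin) {
		fprintf(stderr, "libnative: %s\n", dlerror());
		return 1;
	}
	return 0;
}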

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Xenomai-core] Built-in libxenomai dependency

2012-02-01 Thread Jan Kiszka
Hi,

don't remember anymore: Is there any subtle reason that prevents a
change like

diff --git a/src/skins/native/Makefile.am b/src/skins/native/Makefile.am
index 39eaaed..4cc8859 100644
--- a/src/skins/native/Makefile.am
+++ b/src/skins/native/Makefile.am
@@ -22,6 +22,9 @@ libnative_la_SOURCES = \
wrappers.c \
wrappers.h
 
+libnative_la_LIBADD = \
+   ../common/libxenomai.la
+
 libnative_la_CPPFLAGS = \
@XENO_USER_CFLAGS@ \
-I$(top_srcdir)/include
diff --git a/src/skins/rtdm/Makefile.am b/src/skins/rtdm/Makefile.am
index 8ad74be..2dc0a90 100644
--- a/src/skins/rtdm/Makefile.am
+++ b/src/skins/rtdm/Makefile.am
@@ -8,6 +8,9 @@ librtdm_la_SOURCES = \
core.c \
init.c
 
+librtdm_la_LIBADD = \
+   ../common/libxenomai.la
+
 librtdm_la_CPPFLAGS = \
@XENO_USER_CFLAGS@ \
-I$(top_srcdir)/include
diff --git a/src/testsuite/latency/Makefile.am b/src/testsuite/latency/Makefile.am
index c772c26..6534df5 100644
--- a/src/testsuite/latency/Makefile.am
+++ b/src/testsuite/latency/Makefile.am
@@ -14,5 +14,4 @@ latency_LDFLAGS = @XENO_USER_LDFLAGS@
 latency_LDADD = \
../../skins/native/libnative.la \
../../skins/rtdm/librtdm.la \
-   ../../skins/common/libxenomai.la \
-lpthread -lm

i.e. that we let the skin libraries depend on libxenomai and then remove
the explicit dependency from our binaries and the xeno-config output?
Is there some ordering issue again (we have -l before -lxenomai
in the ldflags)?

If possible, this would allow for things like dlopen("libnative.so").

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [PATCH] Add sigdebug unit test

2012-01-26 Thread Jan Kiszka
On 2012-01-26 16:52, Gilles Chanteperdrix wrote:
> On 01/26/2012 04:06 PM, Jan Kiszka wrote:
>> On 2012-01-26 15:55, Gilles Chanteperdrix wrote:
>>> On 01/26/2012 11:36 AM, Jan Kiszka wrote:
>>>> On 2012-01-25 19:05, Jan Kiszka wrote:
>>>>> On 2012-01-25 18:44, Gilles Chanteperdrix wrote:
>>>>>> On 01/25/2012 06:10 PM, Jan Kiszka wrote:
>>>>>>> On 2012-01-25 18:02, Gilles Chanteperdrix wrote:
>>>>>>>> On 01/25/2012 05:52 PM, Jan Kiszka wrote:
>>>>>>>>> On 2012-01-25 17:47, Jan Kiszka wrote:
>>>>>>>>>> On 2012-01-25 17:35, Gilles Chanteperdrix wrote:
>>>>>>>>>>> On 01/25/2012 05:21 PM, Jan Kiszka wrote:
>>>>>>>>>>>> We had two regressions in this code recently. So test all 6 
>>>>>>>>>>>> possible
>>>>>>>>>>>> SIGDEBUG reasons, or 5 if the watchdog is not available.
>>>>>>>>>>>
>>>>>>>>>>> Ok for this test, with a few remarks:
>>>>>>>>>>> - this is a regression test, so should go to
>>>>>>>>>>> src/testsuite/regression(/native), and should be added to the
>>>>>>>>>>> xeno-regression-test
>>>>>>>>>>
>>>>>>>>>> What are unit tests for (as they are defined here)? Looks a bit 
>>>>>>>>>> inconsistent.
>>>>>>>>
>>>>>>>> I put under "regression" all the tests I have which corresponded to
>>>>>>>> things that failed one time or another in xenomai past. Maybe we could
>>>>>>>> move unit tests under regression.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> - we already have a regression test for the watchdog called 
>>>>>>>>>>> mayday.c,
>>>>>>>>>>> which tests the second watchdog action, please merge mayday.c with
>>>>>>>>>>> sigdebug.c (mayday.c also allows checking the disassembly of the 
>>>>>>>>>>> code in
>>>>>>>>>>> the mayday page, a nice feature)
>>>>>>>>>>
>>>>>>>>>> It seems to have failed in that important last discipline. Need to 
>>>>>>>>>> check
>>>>>>>>>> why.
>>>>>>>>>
>>>>>>>>> Because it didn't check the page content for correctness. But that's 
>>>>>>>>> now
>>>>>>>>> done via the new watchdog test. I can keep the debug output, but the
>>>>>>>>> watchdog test of mayday looks obsolete to me. Am I missing something?
>>>>>>>>
>>>>>>>> The watchdog does two things: it first sends a SIGDEBUG, then if the
>>>>>>>> application is still spinning, it sends a SIGSEGV. As far as I
>>>>>>>> understood, your test tests the first case, and mayday tests the second
>>>>>>>> case, so, I agree that mayday should be removed, but whatever it tests
>>>>>>>> should be integrated in the sigdebug test.
>>>>>>>>
>>>>>>>
>>>>>>> Err... SIGSEGV is not a feature, it was the bug I fixed today. :) So the
>>>>>>> test case actually specified a bug as correct behavior.
>>>>>>>
>>>>>>> The fallback case is in fact killing the RT task as before. But I'm
>>>>>>> unsure right now: will this leave the system always in a clean state
>>>>>>> behind?
>>>>>>
>>>>>> The test case being a test case and doing nothing particular, I do not
>>>>>> see what could go wrong. And if something goes wrong, then it needs 
>>>>>> fixing.
>>>>>
>>>>> Well, if you kill a RT task while it's running in the kernel, you risk
>>>>> inconsistent system states (held mutexes etc.). In this case the task is
>>>>> supposed to spin in user space. If that is always safe, let's implement
>>>>> the test.
>>>>
>>>> Had a closer look: These days the two-stage killing is only useful to
>>>> catch endless loops in the kernel. User space tasks can't get around
>>>> being migrated on watchdog events, even when SIGDEBUG is ignored.
>>>>
>>>> To trigger the enforced task termination without leaving any broken
>>>> states behind, there is one option: rt_task_spin. Surprisingly for me,
>>>> it actually spins in the kernel, thus triggers the second level if
>>>> waiting long enough. I wonder, though, if that behavior shouldn't be
>>>> improved, ie. the spinning loop be closed in user space - which would
>>>> take away that option again.
>>>>
>>>> Thoughts?
>>>
>>> You can also call, in an infinite loop, a Xenomai syscall which causes a
>>> switch to primary mode, but fails.
>>
>> Nope, we would be migrated to secondary on xnthread_amok_p when
>> returning to user mode. We need a true kernel loop.
> 
> But the loop will continue, and the next call to the syscall will cause
> the thread to re-switch to primary mode.

But the "watchdog signal pending" flag will be cleared on each
migration, thus the watchdog will never enter the second stage.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [PATCH] Add sigdebug unit test

2012-01-26 Thread Jan Kiszka
On 2012-01-26 15:36, Jan Kiszka wrote:
> Regarding testability of the second watchdog state: We could add a
> syscall to sigtest_module, e.g. one which has the current rt_timer_spin
> semantics.

Do you think this makes sense? Or some other testing driver?
Then I would go that direction.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [PATCH] Add sigdebug unit test

2012-01-26 Thread Jan Kiszka
On 2012-01-26 15:55, Gilles Chanteperdrix wrote:
> On 01/26/2012 11:36 AM, Jan Kiszka wrote:
>> On 2012-01-25 19:05, Jan Kiszka wrote:
>>> On 2012-01-25 18:44, Gilles Chanteperdrix wrote:
>>>> On 01/25/2012 06:10 PM, Jan Kiszka wrote:
>>>>> On 2012-01-25 18:02, Gilles Chanteperdrix wrote:
>>>>>> On 01/25/2012 05:52 PM, Jan Kiszka wrote:
>>>>>>> On 2012-01-25 17:47, Jan Kiszka wrote:
>>>>>>>> On 2012-01-25 17:35, Gilles Chanteperdrix wrote:
>>>>>>>>> On 01/25/2012 05:21 PM, Jan Kiszka wrote:
>>>>>>>>>> We had two regressions in this code recently. So test all 6 possible
>>>>>>>>>> SIGDEBUG reasons, or 5 if the watchdog is not available.
>>>>>>>>>
>>>>>>>>> Ok for this test, with a few remarks:
>>>>>>>>> - this is a regression test, so should go to
>>>>>>>>> src/testsuite/regression(/native), and should be added to the
>>>>>>>>> xeno-regression-test
>>>>>>>>
>>>>>>>> What are unit tests for (as they are defined here)? Looks a bit 
>>>>>>>> inconsistent.
>>>>>>
>>>>>> I put under "regression" all the tests I have which corresponded to
>>>>>> things that failed one time or another in xenomai past. Maybe we could
>>>>>> move unit tests under regression.
>>>>>>
>>>>>>>>
>>>>>>>>> - we already have a regression test for the watchdog called mayday.c,
>>>>>>>>> which tests the second watchdog action, please merge mayday.c with
>>>>>>>>> sigdebug.c (mayday.c also allows checking the disassembly of the code 
>>>>>>>>> in
>>>>>>>>> the mayday page, a nice feature)
>>>>>>>>
>>>>>>>> It seems to have failed in that important last discipline. Need to 
>>>>>>>> check
>>>>>>>> why.
>>>>>>>
>>>>>>> Because it didn't check the page content for correctness. But that's now
>>>>>>> done via the new watchdog test. I can keep the debug output, but the
>>>>>>> watchdog test of mayday looks obsolete to me. Am I missing something?
>>>>>>
>>>>>> The watchdog does two things: it first sends a SIGDEBUG, then if the
>>>>>> application is still spinning, it sends a SIGSEGV. As far as I
>>>>>> understood, your test tests the first case, and mayday tests the second
>>>>>> case, so, I agree that mayday should be removed, but whatever it tests
>>>>>> should be integrated in the sigdebug test.
>>>>>>
>>>>>
>>>>> Err... SIGSEGV is not a feature, it was the bug I fixed today. :) So the
>>>>> test case actually specified a bug as correct behavior.
>>>>>
>>>>> The fallback case is in fact killing the RT task as before. But I'm
>>>>> unsure right now: will this leave the system always in a clean state
>>>>> behind?
>>>>
>>>> The test case being a test case and doing nothing particular, I do not
>>>> see what could go wrong. And if something goes wrong, then it needs fixing.
>>>
>>> Well, if you kill a RT task while it's running in the kernel, you risk
>>> inconsistent system states (held mutexes etc.). In this case the task is
>>> supposed to spin in user space. If that is always safe, let's implement
>>> the test.
>>
>> Had a closer look: These days the two-stage killing is only useful to
>> catch endless loops in the kernel. User space tasks can't get around
>> being migrated on watchdog events, even when SIGDEBUG is ignored.
>>
>> To trigger the enforced task termination without leaving any broken
>> states behind, there is one option: rt_task_spin. Surprisingly for me,
>> it actually spins in the kernel, thus triggers the second level if
>> waiting long enough. I wonder, though, if that behavior shouldn't be
>> improved, ie. the spinning loop be closed in user space - which would
>> take away that option again.
>>
>> Thoughts?
> 
> You can also call, in an infinite loop, a Xenomai syscall which causes a
> switch to primary mode, but fails.

Nope, we would be migrated to secondary on xnthread_amok_p when
returning to user mode. We need a true kernel loop.

Jan.

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [PATCH] Add sigdebug unit test

2012-01-26 Thread Jan Kiszka
On 2012-01-26 13:56, Jan Kiszka wrote:
>>> To trigger the enforced task termination without leaving any broken
>>> states behind, there is one option: rt_task_spin. Surprisingly for me,
>>> it actually spins in the kernel, thus triggers the second level if
>>> waiting long enough. I wonder, though, if that behavior shouldn't be
>>> improved, ie. the spinning loop be closed in user space - which would
>>> take away that option again.
>>>
>>> Thoughts?
>>>
>>
>> Tick-based timing is going to be the problem for determining the
>> spinning delay, unless we expose it in the vdso on a per-skin basis,
>> which won't be pretty.
> 
> I see. But we should possibly add some signal-pending || amok test to
> that kernel loop. That would also kill my test design, but it makes
> otherwise some sense I guess.

I'm thinking of a change like this:

diff --git a/ksrc/skins/native/syscall.c b/ksrc/skins/native/syscall.c
index acf0375..39204b4 100644
--- a/ksrc/skins/native/syscall.c
+++ b/ksrc/skins/native/syscall.c
@@ -1020,13 +1020,21 @@ static int __rt_timer_inquire(struct pt_regs *regs)
 
 static int __rt_timer_spin(struct pt_regs *regs)
 {
+   xnthread_t *thread = xnpod_current_thread();
+   struct task_struct *p = current;
+   RTIME etime;
RTIME ns;
 
if (__xn_safe_copy_from_user(&ns, (void __user *)__xn_reg_arg1(regs),
 sizeof(ns)))
return -EFAULT;
 
-   rt_timer_spin(ns);
+   etime = xnarch_get_cpu_tsc() + xnarch_ns_to_tsc(ns);
+   while ((SRTIME)(xnarch_get_cpu_tsc() - etime) < 0) {
+   if (signal_pending(p) || xnthread_amok_p(thread))
+   return -EINTR;
+   cpu_relax();
+   }
 
return 0;
 }

Regarding testability of the second watchdog state: We could add a
syscall to sigtest_module, e.g. one which has the current rt_timer_spin
semantics.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [PATCH] Add sigdebug unit test

2012-01-26 Thread Jan Kiszka
On 2012-01-26 12:20, Philippe Gerum wrote:
> On 01/26/2012 11:36 AM, Jan Kiszka wrote:
>> On 2012-01-25 19:05, Jan Kiszka wrote:
>>> On 2012-01-25 18:44, Gilles Chanteperdrix wrote:
>>>> On 01/25/2012 06:10 PM, Jan Kiszka wrote:
>>>>> On 2012-01-25 18:02, Gilles Chanteperdrix wrote:
>>>>>> On 01/25/2012 05:52 PM, Jan Kiszka wrote:
>>>>>>> On 2012-01-25 17:47, Jan Kiszka wrote:
>>>>>>>> On 2012-01-25 17:35, Gilles Chanteperdrix wrote:
>>>>>>>>> On 01/25/2012 05:21 PM, Jan Kiszka wrote:
>>>>>>>>>> We had two regressions in this code recently. So test all 6
>>>>>>>>>> possible
>>>>>>>>>> SIGDEBUG reasons, or 5 if the watchdog is not available.
>>>>>>>>>
>>>>>>>>> Ok for this test, with a few remarks:
>>>>>>>>> - this is a regression test, so should go to
>>>>>>>>> src/testsuite/regression(/native), and should be added to the
>>>>>>>>> xeno-regression-test
>>>>>>>>
>>>>>>>> What are unit tests for (as they are defined here)? Looks a bit
>>>>>>>> inconsistent.
>>>>>>
>>>>>> I put under "regression" all the tests I have which corresponded to
>>>>>> things that failed one time or another in xenomai past. Maybe we
>>>>>> could
>>>>>> move unit tests under regression.
>>>>>>
>>>>>>>>
>>>>>>>>> - we already have a regression test for the watchdog called
>>>>>>>>> mayday.c,
>>>>>>>>> which tests the second watchdog action, please merge mayday.c with
>>>>>>>>> sigdebug.c (mayday.c also allows checking the disassembly of
>>>>>>>>> the code in
>>>>>>>>> the mayday page, a nice feature)
>>>>>>>>
>>>>>>>> It seems to have failed in that important last discipline. Need
>>>>>>>> to check
>>>>>>>> why.
>>>>>>>
>>>>>>> Because it didn't check the page content for correctness. But
>>>>>>> that's now
>>>>>>> done via the new watchdog test. I can keep the debug output, but the
>>>>>>> watchdog test of mayday looks obsolete to me. Am I missing
>>>>>>> something?
>>>>>>
>>>>>> The watchdog does two things: it first sends a SIGDEBUG, then if the
>>>>>> application is still spinning, it sends a SIGSEGV. As far as I
>>>>>> understood, your test tests the first case, and mayday tests the
>>>>>> second
>>>>>> case, so, I agree that mayday should be removed, but whatever it
>>>>>> tests
>>>>>> should be integrated in the sigdebug test.
>>>>>>
>>>>>
>>>>> Err... SIGSEGV is not a feature, it was the bug I fixed today. :)
>>>>> So the
>>>>> test case actually specified a bug as correct behavior.
>>>>>
>>>>> The fallback case is in fact killing the RT task as before. But I'm
>>>>> unsure right now: will this leave the system always in a clean state
>>>>> behind?
>>>>
>>>> The test case being a test case and doing nothing particular, I do not
>>>> see what could go wrong. And if something goes wrong, then it needs
>>>> fixing.
>>>
>>> Well, if you kill a RT task while it's running in the kernel, you risk
>>> inconsistent system states (held mutexes etc.). In this case the task is
>>> supposed to spin in user space. If that is always safe, let's implement
>>> the test.
>>
>> Had a closer look: These days the two-stage killing is only useful to
>> catch endless loops in the kernel. User space tasks can't get around
>> being migrated on watchdog events, even when SIGDEBUG is ignored.
>>
>> To trigger the enforced task termination without leaving any broken
>> states behind, there is one option: rt_task_spin. Surprisingly for me,
>> it actually spins in the kernel, thus triggers the second level if
>> waiting long enough. I wonder, though, if that behavior shouldn't be
>> improved, ie. the spinning loop be closed in user space - which would
>> take away that option again.
>>
>> Thoughts?
>>
> 
> Tick-based timing is going to be the problem for determining the
> spinning delay, unless we expose it in the vdso on a per-skin basis,
> which won't be pretty.

I see. But we should possibly add some signal-pending || amok test to
that kernel loop. That would also kill my test design, but it otherwise
makes some sense, I guess.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [PATCH] Add sigdebug unit test

2012-01-26 Thread Jan Kiszka
On 2012-01-25 19:05, Jan Kiszka wrote:
> On 2012-01-25 18:44, Gilles Chanteperdrix wrote:
>> On 01/25/2012 06:10 PM, Jan Kiszka wrote:
>>> On 2012-01-25 18:02, Gilles Chanteperdrix wrote:
>>>> On 01/25/2012 05:52 PM, Jan Kiszka wrote:
>>>>> On 2012-01-25 17:47, Jan Kiszka wrote:
>>>>>> On 2012-01-25 17:35, Gilles Chanteperdrix wrote:
>>>>>>> On 01/25/2012 05:21 PM, Jan Kiszka wrote:
>>>>>>>> We had two regressions in this code recently. So test all 6 possible
>>>>>>>> SIGDEBUG reasons, or 5 if the watchdog is not available.
>>>>>>>
>>>>>>> Ok for this test, with a few remarks:
>>>>>>> - this is a regression test, so should go to
>>>>>>> src/testsuite/regression(/native), and should be added to the
>>>>>>> xeno-regression-test
>>>>>>
>>>>>> What are unit tests for (as they are defined here)? Looks a bit 
>>>>>> inconsistent.
>>>>
>>>> I put under "regression" all the tests I have which corresponded to
>>>> things that failed one time or another in xenomai past. Maybe we could
>>>> move unit tests under regression.
>>>>
>>>>>>
>>>>>>> - we already have a regression test for the watchdog called mayday.c,
>>>>>>> which tests the second watchdog action, please merge mayday.c with
>>>>>>> sigdebug.c (mayday.c also allows checking the disassembly of the code in
>>>>>>> the mayday page, a nice feature)
>>>>>>
>>>>>> It seems to have failed in that important last discipline. Need to check
>>>>>> why.
>>>>>
>>>>> Because it didn't check the page content for correctness. But that's now
>>>>> done via the new watchdog test. I can keep the debug output, but the
>>>>> watchdog test of mayday looks obsolete to me. Am I missing something?
>>>>
>>>> The watchdog does two things: it first sends a SIGDEBUG, then if the
>>>> application is still spinning, it sends a SIGSEGV. As far as I
>>>> understood, your test tests the first case, and mayday tests the second
>>>> case, so, I agree that mayday should be removed, but whatever it tests
>>>> should be integrated in the sigdebug test.
>>>>
>>>
>>> Err... SIGSEGV is not a feature, it was the bug I fixed today. :) So the
>>> test case actually specified a bug as correct behavior.
>>>
>>> The fallback case is in fact killing the RT task as before. But I'm
>>> unsure right now: will this leave the system always in a clean state
>>> behind?
>>
>> The test case being a test case and doing nothing particular, I do not
>> see what could go wrong. And if something goes wrong, then it needs fixing.
> 
> Well, if you kill a RT task while it's running in the kernel, you risk
> inconsistent system states (held mutexes etc.). In this case the task is
> supposed to spin in user space. If that is always safe, let's implement
> the test.

Had a closer look: These days the two-stage killing is only useful to
catch endless loops in the kernel. User space tasks can't get around
being migrated on watchdog events, even when SIGDEBUG is ignored.

To trigger the enforced task termination without leaving any broken
states behind, there is one option: rt_task_spin. Surprisingly to me,
it actually spins in the kernel and thus triggers the second level if
waiting long enough. I wonder, though, if that behavior shouldn't be
improved, i.e. the spinning loop be closed in user space - which would
take away that option again.
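
As an illustration, a hypothetical test fragment along those lines,
reusing the check helpers from the sigdebug test and assuming the
default 4 s watchdog timeout:

#include <native/task.h>

/* rt_task_spin() busy-waits inside the kernel, so spinning longer
 * than the watchdog timeout should drive the watchdog through both
 * stages. */
static void provoke_watchdog(void)
{
	setup_checkdebug(SIGDEBUG_WATCHDOG);	/* helper from sigdebug.c */
	rt_task_spin(10000000000ULL);		/* 10 s, in nanoseconds */
	check_sigdebug_received("SIGDEBUG_WATCHDOG");
}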

Thoughts?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [PATCH] Add sigdebug unit test

2012-01-25 Thread Jan Kiszka
On 2012-01-25 18:44, Gilles Chanteperdrix wrote:
> On 01/25/2012 06:10 PM, Jan Kiszka wrote:
>> On 2012-01-25 18:02, Gilles Chanteperdrix wrote:
>>> On 01/25/2012 05:52 PM, Jan Kiszka wrote:
>>>> On 2012-01-25 17:47, Jan Kiszka wrote:
>>>>> On 2012-01-25 17:35, Gilles Chanteperdrix wrote:
>>>>>> On 01/25/2012 05:21 PM, Jan Kiszka wrote:
>>>>>>> We had two regressions in this code recently. So test all 6 possible
>>>>>>> SIGDEBUG reasons, or 5 if the watchdog is not available.
>>>>>>
>>>>>> Ok for this test, with a few remarks:
>>>>>> - this is a regression test, so should go to
>>>>>> src/testsuite/regression(/native), and should be added to the
>>>>>> xeno-regression-test
>>>>>
>>>>> What are unit tests for (as they are defined here)? Looks a bit 
>>>>> inconsistent.
>>>
>>> I put under "regression" all the tests I have which corresponded to
>>> things that failed one time or another in xenomai past. Maybe we could
>>> move unit tests under regression.
>>>
>>>>>
>>>>>> - we already have a regression test for the watchdog called mayday.c,
>>>>>> which tests the second watchdog action, please merge mayday.c with
>>>>>> sigdebug.c (mayday.c also allows checking the disassembly of the code in
>>>>>> the mayday page, a nice feature)
>>>>>
>>>>> It seems to have failed in that important last discipline. Need to check
>>>>> why.
>>>>
>>>> Because it didn't check the page content for correctness. But that's now
>>>> done via the new watchdog test. I can keep the debug output, but the
>>>> watchdog test of mayday looks obsolete to me. Am I missing something?
>>>
>>> The watchdog does two things: it first sends a SIGDEBUG, then if the
>>> application is still spinning, it sends a SIGSEGV. As far as I
>>> understood, your test tests the first case, and mayday tests the second
>>> case, so, I agree that mayday should be removed, but whatever it tests
>>> should be integrated in the sigdebug test.
>>>
>>
>> Err... SIGSEGV is not a feature, it was the bug I fixed today. :) So the
>> test case actually specified a bug as correct behavior.
>>
>> The fallback case is in fact killing the RT task as before. But I'm
>> unsure right now: will this leave the system always in a clean state
>> behind?
> 
> The test case being a test case and doing nothing particular, I do not
> see what could go wrong. And if something goes wrong, then it needs fixing.

Well, if you kill an RT task while it's running in the kernel, you risk
inconsistent system states (held mutexes etc.). In this case the task is
supposed to spin in user space. If that is always safe, let's implement
the test.

Jan





Re: [Xenomai-core] [PATCH] Add sigdebug unit test

2012-01-25 Thread Jan Kiszka
On 2012-01-25 18:02, Gilles Chanteperdrix wrote:
> On 01/25/2012 05:52 PM, Jan Kiszka wrote:
>> On 2012-01-25 17:47, Jan Kiszka wrote:
>>> On 2012-01-25 17:35, Gilles Chanteperdrix wrote:
>>>> On 01/25/2012 05:21 PM, Jan Kiszka wrote:
>>>>> We had two regressions in this code recently. So test all 6 possible
>>>>> SIGDEBUG reasons, or 5 if the watchdog is not available.
>>>>
>>>> Ok for this test, with a few remarks:
>>>> - this is a regression test, so should go to
>>>> src/testsuite/regression(/native), and should be added to the
>>>> xeno-regression-test
>>>
>>> What are unit tests for (as they are defined here)? Looks a bit inconsistent.
> 
> I put under "regression" all the tests I have which corresponded to
> things that failed one time or another in xenomai past. Maybe we could
> move unit tests under regression.
> 
>>>
>>>> - we already have a regression test for the watchdog called mayday.c,
>>>> which tests the second watchdog action, please merge mayday.c with
>>>> sigdebug.c (mayday.c also allows checking the disassembly of the code in
>>>> the mayday page, a nice feature)
>>>
>>> It seems to have failed in that important last discipline. Need to check
>>> why.
>>
>> Because it didn't check the page content for correctness. But that's now
>> done via the new watchdog test. I can keep the debug output, but the
>> watchdog test of mayday looks obsolete to me. Am I missing something?
> 
> The watchdog does two things: it first sends a SIGDEBUG, then if the
> application is still spinning, it sends a SIGSEGV. As far as I
> understood, your test tests the first case, and mayday tests the second
> case, so, I agree that mayday should be removed, but whatever it tests
> should be integrated in the sigdebug test.
> 

Err... SIGSEGV is not a feature, it was the bug I fixed today. :) So the
test case actually specified a bug as correct behavior.

The fallback case is in fact killing the RT task as before. But I'm
unsure right now: will this leave the system always in a clean state
behind? The task killing of the watchdog is meant as a last resort to
keep the system analyzable, but I have a bad feeling about relying on
it in a regular test case.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [PATCH] Add sigdebug unit test

2012-01-25 Thread Jan Kiszka
On 2012-01-25 17:47, Jan Kiszka wrote:
> On 2012-01-25 17:35, Gilles Chanteperdrix wrote:
>> On 01/25/2012 05:21 PM, Jan Kiszka wrote:
>>> We had two regressions in this code recently. So test all 6 possible
>>> SIGDEBUG reasons, or 5 if the watchdog is not available.
>>
>> Ok for this test, with a few remarks:
>> - this is a regression test, so should go to
>> src/testsuite/regression(/native), and should be added to the
>> xeno-regression-test
> 
> What are unit tests for (as they are defined here)? Looks a bit inconsistent.
> 
>> - we already have a regression test for the watchdog called mayday.c,
>> which tests the second watchdog action, please merge mayday.c with
>> sigdebug.c (mayday.c also allows checking the disassembly of the code in
>> the mayday page, a nice feature)
> 
> It seems to have failed in that important last discipline. Need to check
> why.

Because it didn't check the page content for correctness. But that's now
done via the new watchdog test. I can keep the debug output, but the
watchdog test of mayday looks obsolete to me. Am I missing something?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [PATCH] Add sigdebug unit test

2012-01-25 Thread Jan Kiszka
On 2012-01-25 17:35, Gilles Chanteperdrix wrote:
> On 01/25/2012 05:21 PM, Jan Kiszka wrote:
>> We had two regressions in this code recently. So test all 6 possible
>> SIGDEBUG reasons, or 5 if the watchdog is not available.
> 
> Ok for this test, with a few remarks:
> - this is a regression test, so should go to
> src/testsuite/regression(/native), and should be added to the
> xeno-regression-test

What are unit tests for (as they are defined here)? Looks a bit inconsistent.

> - we already have a regression test for the watchdog called mayday.c,
> which tests the second watchdog action, please merge mayday.c with
> sigdebug.c (mayday.c also allows checking the disassembly of the code in
> the mayday page, a nice feature)

It seems to have failed in that important last discipline. Need to check
why.

> - please make the watchdog test mandatory by default (adding a command
> line option to skip it for instance), the test should fail if the
> watchdog is not enabled, because otherwise, it will be easy to forget
> testing this feature. The watchdog is enabled by default with xenomai 2.6
> anyway.

OK.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Xenomai-core] [PATCH] Add sigdebug unit test

2012-01-25 Thread Jan Kiszka
We had two regressions in this code recently. So test all 6 possible
SIGDEBUG reasons, or 5 if the watchdog is not available.

Signed-off-by: Jan Kiszka 
---
 src/testsuite/unit/Makefile.am |   16 +++-
 src/testsuite/unit/sigdebug.c  |  233 
 2 files changed, 248 insertions(+), 1 deletions(-)
 create mode 100644 src/testsuite/unit/sigdebug.c

diff --git a/src/testsuite/unit/Makefile.am b/src/testsuite/unit/Makefile.am
index 1bf5d8d..6e8203d 100644
--- a/src/testsuite/unit/Makefile.am
+++ b/src/testsuite/unit/Makefile.am
@@ -11,7 +11,8 @@ test_PROGRAMS = \
cond-torture-posix \
cond-torture-native \
check-vdso \
-   rtdm
+   rtdm \
+   sigdebug
 
 arith_SOURCES = arith.c arith-noinline.c arith-noinline.h
 
@@ -119,3 +120,16 @@ rtdm_LDADD = \
../../skins/native/libnative.la \
../../skins/common/libxenomai.la \
-lpthread -lrt -lm
+
+sigdebug_SOURCES = sigdebug.c
+
+sigdebug_CPPFLAGS = \
+   @XENO_USER_CFLAGS@ \
+   -I$(top_srcdir)/include
+
+sigdebug_LDFLAGS = @XENO_USER_LDFLAGS@
+
+sigdebug_LDADD = \
+   ../../skins/native/libnative.la \
+   ../../skins/common/libxenomai.la \
+   -lpthread -lm
diff --git a/src/testsuite/unit/sigdebug.c b/src/testsuite/unit/sigdebug.c
new file mode 100644
index 000..57d9beb
--- /dev/null
+++ b/src/testsuite/unit/sigdebug.c
@@ -0,0 +1,233 @@
+/*
+ * Functional testing of unwanted domain switch debugging mechanism.
+ *
+ * Copyright (C) Jan Kiszka  
+ *
+ * Released under the terms of GPLv2.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define WRITE_TEST_SIZE (4*1024)
+
+unsigned int expected_reason;
+bool sigdebug_received;
+pthread_t rt_task_thread;
+RT_MUTEX prio_invert;
+RT_SEM send_signal;
+char *mem;
+FILE *wd;
+
+static void setup_checkdebug(unsigned int reason)
+{
+   sigdebug_received = false;
+   expected_reason = reason;
+}
+
+static void check_inner(const char *fn, int line, const char *msg,
+   int status, int expected)
+{
+   if (status == expected)
+   return;
+
+   rt_task_set_mode(T_WARNSW, 0, NULL);
+   rt_print_flush_buffers();
+   fprintf(stderr, "FAILURE %s:%d: %s returned %d instead of %d - %s\n",
+   fn, line, msg, status, expected, strerror(-status));
+   exit(EXIT_FAILURE);
+}
+
+static void check_sigdebug_inner(const char *fn, int line, const char *reason)
+{
+   if (sigdebug_received)
+   return;
+
+   rt_task_set_mode(T_WARNSW, 0, NULL);
+   rt_print_flush_buffers();
+   fprintf(stderr, "FAILURE %s:%d: no %s received\n", fn, line, reason);
+   exit(EXIT_FAILURE);
+}
+
+#define check(msg, status, expected) ({ \
+   int __status = status;  \
+   check_inner(__FUNCTION__, __LINE__, msg, __status, expected);   \
+   __status;   \
+})
+
+#define check_no_error(msg, status) ({ \
+   int __status = status;  \
+   check_inner(__FUNCTION__, __LINE__, msg,\
+   __status < 0 ? __status : 0, 0);\
+   __status;   \
+})
+
+#define check_sigdebug_received(reason) do {   \
+   const char *__reason = reason;  \
+   check_sigdebug_inner(__FUNCTION__, __LINE__, __reason); \
+} while (0)
+
+void rt_task_body(void *cookie)
+{
+   RTIME end;
+   int err;
+
+   rt_task_thread = pthread_self();
+
+   rt_printf("syscall\n");
+   setup_checkdebug(SIGDEBUG_MIGRATE_SYSCALL);
+   syscall(-1);
+   check_sigdebug_received("SIGDEBUG_MIGRATE_SYSCALL");
+
+   rt_printf("signal\n");
+   setup_checkdebug(SIGDEBUG_MIGRATE_SIGNAL);
+   err = rt_sem_v(&send_signal);
+   check_no_error("rt_sem_v", err);
+   rt_task_sleep(1000LL);
+   check_sigdebug_received("SIGDEBUG_MIGRATE_SIGNAL");
+
+   rt_printf("relaxed mutex owner\n");
+   setup_checkdebug(SIGDEBUG_MIGRATE_PRIOINV);
+   err = rt_mutex_acquire(&prio_invert, TM_INFINITE);
+   check("rt_mutex_acquire", err, -EINTR);
+   check_sigdebug_received("SIGDEBUG_MIGRATE_PRIOINV");
+
+
+   rt_printf("page fault\n");
+   setup_checkdebug(SIGDEBUG_MIGRATE_FAULT);
+   rt_task_sleep(0);
+   *mem ^= 0xFF;
+   check_sigdebug_received("SIGDEBUG_MIGRATE_FAULT");
+
+   if (wd) {
+   rt_printf("watchdog\n");
+   rt_pri
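
(The archive truncates the message here. Presumably the watchdog leg
continued roughly along these lines - a hedged sketch, not the
original hunk:)

	if (wd) {
		rt_printf("watchdog\n");
		setup_checkdebug(SIGDEBUG_WATCHDOG);
		/* spin in primary mode until the nucleus watchdog fires;
		 * rt_timer_tsc()/rt_timer_ns2tsc() are native skin API */
		end = rt_timer_tsc() + rt_timer_ns2tsc(2100000000ULL);
		while (rt_timer_tsc() < end && !sigdebug_received)
			;
		check_sigdebug_received("SIGDEBUG_WATCHDOG");
	}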

[Xenomai-core] [PATCH 2.6] mayday: Fix code setup for x86 and blackfin

2012-01-25 Thread Jan Kiszka
The code structures on x86 were broken as the compiler aligned the
internal layout. The same may have happened on blackfin. Fix it by
applying a packed tag on the enclosing structures.
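
To illustrate the failure mode with a minimal stand-alone example (my
own, not from the patch): packing only the inner struct leaves the
outer one free to insert alignment padding between members, which
shifts every following byte of the mayday code.

#include <stddef.h>
#include <stdint.h>

struct insn {
	uint8_t op;
	uint32_t imm;
} __attribute__ ((__packed__));	/* size 5, alignment 1 */

struct code_unpacked {
	struct insn load;
	uint16_t bug;	/* 2-byte alignment -> 1 padding byte at offset 5 */
};

struct code_packed {
	struct insn load;
	uint16_t bug;	/* lands at offset 5, as the mayday page expects */
} __attribute__ ((__packed__));

/* offsetof(struct code_unpacked, bug) == 6,
 * offsetof(struct code_packed, bug) == 5 */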

Signed-off-by: Jan Kiszka 
---

Haven't checked Xenomai 3 yet, it may be affected as well.

 include/asm-blackfin/bits/shadow.h |2 +-
 include/asm-x86/bits/shadow_32.h   |4 ++--
 include/asm-x86/bits/shadow_64.h   |2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/asm-blackfin/bits/shadow.h 
b/include/asm-blackfin/bits/shadow.h
index 3f7e4eb..58cfdc6 100644
--- a/include/asm-blackfin/bits/shadow.h
+++ b/include/asm-blackfin/bits/shadow.h
@@ -84,7 +84,7 @@ static inline void xnarch_setup_mayday_page(void *page)
 * We don't mess with ASTAT here, so no need to save/restore
 * it in handle/fixup code.
 */
-   static const struct {
+   static const struct __attribute__ ((__packed__)) {
struct __attribute__ ((__packed__)) {
u16 op;
u16 imm;
diff --git a/include/asm-x86/bits/shadow_32.h b/include/asm-x86/bits/shadow_32.h
index b7a0e87..43fc2d0 100644
--- a/include/asm-x86/bits/shadow_32.h
+++ b/include/asm-x86/bits/shadow_32.h
@@ -70,7 +70,7 @@ static inline void xnarch_setup_mayday_page(void *page)
 * Also note that if SEP is present, we always assume NPTL on
 * the user side.
 */
-   static const struct {
+   static const struct __attribute__ ((__packed__)) {
struct __attribute__ ((__packed__)) {
u8 op;
u32 imm;
@@ -94,7 +94,7 @@ static inline void xnarch_setup_mayday_page(void *page)
.bug = 0x0b0f,
};
 
-   static const struct {
+   static const struct __attribute__ ((__packed__)) {
struct __attribute__ ((__packed__)) {
u8 op;
u32 imm;
diff --git a/include/asm-x86/bits/shadow_64.h b/include/asm-x86/bits/shadow_64.h
index fc90b9e..3fa6473 100644
--- a/include/asm-x86/bits/shadow_64.h
+++ b/include/asm-x86/bits/shadow_64.h
@@ -66,7 +66,7 @@ static inline void xnarch_setup_mayday_page(void *page)
 * We intentionally don't mess with EFLAGS here, so that we
 * don't have to save/restore it in handle/fixup code.
 */
-   static const struct {
+   static const struct __attribute__ ((__packed__)) {
struct __attribute__ ((__packed__)) {
u8 op;
u32 imm;
-- 
1.7.3.4

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


[Xenomai-core] [PATCH] nucleus: Fix relaxed owner check

2012-01-19 Thread Jan Kiszka
Fix a logic inversion introduced by b75cec1938. This both allows the
relaxed-owner check to work again and prevents us from entering an endless
signal storm if XNSWREP happens to be set already at this point (as seen
in the field).
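
For readers not deep into the nucleus, the check reads as follows once
the inversion is fixed (annotated sketch of the patched logic; the
xnshadow_send_sig() arguments are quoted from memory and may differ in
detail):

	if (xnthread_test_state(sleeper, XNTRAPSW) &&	  /* warnings on */
	    !xnthread_test_info(sleeper, XNSWREP) &&	  /* not reported */
	    xnthread_test_state(synch->owner, XNRELAX)) { /* owner relaxed */
		xnthread_set_info(sleeper, XNSWREP);	  /* warn only once */
		xnshadow_send_sig(sleeper, SIGDEBUG,
				  SIGDEBUG_MIGRATE_PRIOINV, 1);
	}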

Signed-off-by: Jan Kiszka 
---
 ksrc/nucleus/synch.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/ksrc/nucleus/synch.c b/ksrc/nucleus/synch.c
index 47bc0c5..8daf7f6 100644
--- a/ksrc/nucleus/synch.c
+++ b/ksrc/nucleus/synch.c
@@ -1004,7 +1004,7 @@ EXPORT_SYMBOL_GPL(xnsynch_release_all_ownerships);
 void xnsynch_detect_relaxed_owner(struct xnsynch *synch, struct xnthread 
*sleeper)
 {
if (xnthread_test_state(sleeper, XNTRAPSW) &&
-   xnthread_test_info(sleeper, XNSWREP) &&
+   !xnthread_test_info(sleeper, XNSWREP) &&
xnthread_test_state(synch->owner, XNRELAX)) {
xnthread_set_info(sleeper, XNSWREP);
xnshadow_send_sig(sleeper, SIGDEBUG,
-- 
1.7.3.4

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [PULL] KVM awareness / tiny cleanup

2011-12-10 Thread Jan Kiszka
On 2011-10-14 13:20, Jan Kiszka wrote:
> The following changes since commit 31bdadef9018de586bc3fe8de0f37b62b2507785:
> 
>   testsuite: fix regression tests location (2011-10-14 01:46:08 +0200)
> 
> are available in the git repository at:
>   git://git.xenomai.org/xenomai-jki.git for-upstream
> 
> Jan Kiszka (2):
>   nucleus: Privatize __xnpod_reset_thread
>   x86: Fire root preemption notifier
> 
>  include/asm-generic/hal.h |5 +
>  include/asm-x86/bits/pod_32.h |2 ++
>  include/asm-x86/bits/pod_64.h |2 ++
>  include/nucleus/pod.h |2 --
>  ksrc/nucleus/pod.c|8 
>  5 files changed, 13 insertions(+), 6 deletions(-)
> 
> Two trivial changes, though the second one actually adds a noteworthy
> feature to Xenomai: co-location of KVM guests on x86. That feature
> depends on I-pipe changes posted in [1].
> 
> Jan
> 
> [1] http://thread.gmane.org/gmane.linux.kernel.adeos.general/1898
> 

Ping?

I've just pushed a trivial rebase.

Jan



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [PULL] KVM awareness / tiny cleanup

2011-10-14 Thread Jan Kiszka
On 2011-10-14 14:35, Gilles Chanteperdrix wrote:
> On 10/14/2011 01:20 PM, Jan Kiszka wrote:
>> The following changes since commit 31bdadef9018de586bc3fe8de0f37b62b2507785:
>>
>>   testsuite: fix regression tests location (2011-10-14 01:46:08 +0200)
>>
>> are available in the git repository at:
>>   git://git.xenomai.org/xenomai-jki.git for-upstream
>>
>> Jan Kiszka (2):
>>   nucleus: Privatize __xnpod_reset_thread
> 
> Are we sure that we do not have any out-of-tree users for this feature?

Out-of-tree would be impossible as that function was not exported.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


[Xenomai-core] [PULL] KVM awareness / tiny cleanup

2011-10-14 Thread Jan Kiszka
The following changes since commit 31bdadef9018de586bc3fe8de0f37b62b2507785:

  testsuite: fix regression tests location (2011-10-14 01:46:08 +0200)

are available in the git repository at:
  git://git.xenomai.org/xenomai-jki.git for-upstream

Jan Kiszka (2):
  nucleus: Privatize __xnpod_reset_thread
  x86: Fire root preemption notifier

 include/asm-generic/hal.h |5 +
 include/asm-x86/bits/pod_32.h |2 ++
 include/asm-x86/bits/pod_64.h |2 ++
 include/nucleus/pod.h |2 --
 ksrc/nucleus/pod.c|8 
 5 files changed, 13 insertions(+), 6 deletions(-)

Two trivial changes, though the second one actually adds a noteworthy
feature to Xenomai: co-location of KVM guests on x86. That feature
depends on I-pipe changes posted in [1].

Jan

[1] http://thread.gmane.org/gmane.linux.kernel.adeos.general/1898

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [RFC 0/1] Class driver for raw Ethernet packets

2011-09-28 Thread Jan Kiszka
On 2011-09-28 10:16, Richard Cochran wrote:
> On Tue, Sep 27, 2011 at 07:25:07PM +0200, Jan Kiszka wrote:
>>
>> It manages buffers for you, provides NIC drivers and interfaces to
>> either build the higher protocol layers in the kernel or in user space.
>> That's the mission, but I would not exclude that there is room for
>> improvements (lacking safe copy-to/from user, unneeded RX thread for
>> single-user scenarios and possibly more). Still, the Ethercat master
>> library folks chose it back then as a platform, maybe you want to ask them.
> 
> Getting a little off topic, I wonder, who are these "Ethercat folks" of
> whom you speak?
> 
> I do know of a few open source implementations, but none are based on
> Xenomai:
> 
> * IgH etherlab
>   This is a complete stack in the kernel. Although they claim it works
>   with Xenomai, in fact it does not, since it uses regular kernel spin
>   locks, etc. However, it could be adapted to work with Xenomai.
> * SOEM
>   This is a user space (really simple) stack based on normal raw
>   sockets. It could also be adapted to use Xenomai, by adding some
>   sort of raw RT socket.
> * OSADL
>   This was withdrawn because of license concerns. I never saw the
>   code, but I do believe it was a user space solution using standard
>   sockets.
> * Another?
>   There once was some C++ program from some institute in the
>   Netherlands (or Belgium? can't remember), but it was also withdrawn.
> 
> So, did you mean any of these, or yet another implementation?

I meant the first hit for "ethercat master library" in a search engine. See

http://ethercatmaster.berlios.de/
http://www.fmtc.be/downloads/15_FMTC%20open%20sources.pdf

or ask Peter or Klaas (pe...@thesourceworks.com, klaas.gade...@fmtc.be -
not sure if Klaas is still with fmtc).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [RFC 0/1] Class driver for raw Ethernet packets

2011-09-27 Thread Jan Kiszka
On 2011-09-27 19:04, Richard Cochran wrote:
> On Tue, Sep 27, 2011 at 06:30:00PM +0200, Jan Kiszka wrote:
>> On 2011-09-27 18:26, Jan Kiszka wrote:
>>> I was simply hoping to collect some new ideas how to address the driver
>>> maintenance issue in a better way but without dropping key features
>>> needed for RT networking. Something like "let's add generic RT channels
>>> to Linux upstream drivers and then only patch them fit RTDM". Not sure
>>> if that works, but it would come with a vision how to keep things more
>>> maintainable.
>>
>> Plus, this could be useful for other scenarios - on PREEMPT-RT.
> 
> (But PREEMPT-RT will make the whole kernel deterministic, right? ;)

Yes. Except where not.

> 
> Adding low-latency channels (eg. working against coalescing) will be a
> very hard sell upstream.

That also depends on the invasiveness. Key requirements like separate
packet pools slowly sneak in (for swap over net). Also, separate IRQs
for separate channels on modern NICs may make the split-out smoother -
you may no longer need to disable features that used to affect the same
handler or shared some locks.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [RFC 0/1] Class driver for raw Ethernet packets

2011-09-27 Thread Jan Kiszka
On 2011-09-27 19:00, Richard Cochran wrote:
> On Tue, Sep 27, 2011 at 06:26:44PM +0200, Jan Kiszka wrote:
>> On 2011-09-27 18:05, Richard Cochran wrote:
>>
>> That's a common misunderstanding: RTnet is a networking stack with many
>> _optional_ components (like RTmac, RTcfg etc.). I would bet that it's
>> more frequently used today in minimal setups, i.e. just the core, some
>> driver, and either PF_PACKET or UDP/IP.
> 
> I understood about the modular design, but I really want to know if
> rtnet will help me if I want to use of the industrial Ethernet
> protocols. AFAICT, rtnet really doesn't offer these.
> 
> So I'll ask the direct question once again. Does rtnet help me with
> industrial Ethernet (apart from the rtnet protocols), or not?

It manages buffers for you, provides NIC drivers and interfaces to
either build the higher protocol layers in the kernel or in user space.
That's the mission, but I would not exclude that there is room for
improvements (lacking safe copy-to/from user, unneeded RX thread for
single-user scenarios and possibly more). Still, the Ethercat master
library folks chose it back then as a platform, maybe you want to ask them.
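
To make the minimal setup concrete: a raw-packet receive over RTnet
looks roughly like this from user space. This is a sketch against the
RTDM user API (rt_dev_socket() and friends); the interface index is a
placeholder, and the RTnet examples remain the authoritative reference.

#include <sys/socket.h>
#include <arpa/inet.h>
#include <net/ethernet.h>
#include <netpacket/packet.h>
#include <rtdm/rtdm.h>

static ssize_t raw_rx_once(int ifindex, void *frame, size_t size)
{
	struct sockaddr_ll addr = {
		.sll_family   = AF_PACKET,
		.sll_protocol = htons(ETH_P_ALL),
		.sll_ifindex  = ifindex,	/* e.g. rteth0's index */
	};
	int fd;
	ssize_t ret;

	fd = rt_dev_socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
	if (fd < 0)
		return fd;

	ret = rt_dev_bind(fd, (struct sockaddr *)&addr, sizeof(addr));
	if (ret == 0)
		ret = rt_dev_recv(fd, frame, size, 0);	/* one raw frame */

	rt_dev_close(fd);
	return ret;
}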

> 
>>> Unless rtnet implements (or helps to implement) these, it is kind of
>>> silly to say, "your way won't work, you should use rtnet instead."
>>>
>>> I don't know PowerLink or Profinet, but I do know EtherCAT and IEC
>>> 61850, and those two can surely be implemented on the interface that I
>>> am talking about.
>>
>> It works, but it won't give you a deterministic control loop as you
>> still have Linux in the game.
> 
> It really depends on how the driver is written. While my gianfar
> example does make use of normal Linux driver interrupts, it would not
> necessarily have to do so.
> 
>> I was simply hoping to collect some new ideas how to address the driver
>> maintenance issue in a better way but without dropping key features
>> needed for RT networking. Something like "let's add generic RT channels
>> to Linux upstream drivers and then only patch them to fit RTDM". Not sure
>> if that works, but it would come with a vision how to keep things more
>> maintainable.
> 
> Well, can you turn the issue around and convince me that writing a
> rtnet driver is the best way to achieve raw Ethernet packet access?

It would at least avoid having to reinvent user interfaces and buffer
management layers. They may appear simple now, but that's how everything
once started.

> 
> You talk about the rtnet driver model, but is it described anywhere?
> 
> (BTW rtnet/Documentation/README.drvporting is horrible. It is just a
> random list of 40+ odd points without any sense. That document gave me
> the impression that developing an rtnet driver is a kind of extended
> "hack until it starts working.")

I know best. It once worked well, allowed you to more or less
mechanically convert a driver within a few hours, but that's quite a few
years ago. Driver complexity almost exploded since then.

But my goal is not necessarily convincing you to use current RTnet as
is, but to make you think ahead, reusing the experience of this project
if you start something new. That need not replace RTnet immediately, but
it should not block a transition architecturally. The first step to this
remains studying the legacy, even if not using it in the end.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [RFC 0/1] Class driver for raw Ethernet packets

2011-09-27 Thread Jan Kiszka
On 2011-09-27 18:26, Jan Kiszka wrote:
> I was simply hoping to collect some new ideas how to address the driver
> maintenance issue in a better way but without dropping key features
> needed for RT networking. Something like "let's add generic RT channels
> to Linux upstream drivers and then only patch them to fit RTDM". Not sure
> if that works, but it would come with a vision how to keep things more
> maintainable.

Plus, this could be useful for other scenarios - on PREEMPT-RT.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [RFC 0/1] Class driver for raw Ethernet packets

2011-09-27 Thread Jan Kiszka
On 2011-09-27 18:05, Richard Cochran wrote:
> On Tue, Sep 27, 2011 at 05:16:09PM +0200, Jan Kiszka wrote:
>> On 2011-09-27 17:10, Richard Cochran wrote:
>>> On Tue, Sep 27, 2011 at 02:20:43PM +0200, Jan Kiszka wrote:
>>>> On 2011-09-27 14:01, Richard Cochran wrote:
>>>>> Again, every MAC driver needs to be tastefully and wisely adapted. I
>>>>> don't necessarily need to avoid coalescing. The goal (for me) is *not*
>>>>> to provide deterministic Ethernet performance. Instead the RT packets
>>>>> should just be delivered ASAP.
>>>>
>>>> This is obviously the point I completely missed. And it makes the whole
>>>> thing fairly uninteresting IMHO. If you want to do Ethercat, PowerLink
>>>> or Profinet (RT), you do need a certain level of determinism along the
>>>> *whole* packet path. And for the latter two, you definitely need RT IRQ
>>>> support, Ethercat can be OK to poll in fast setups.
>>>>
>>>> From that POV, your approach is likely OK. But I doubt it's of generic
>>>> use, specifically for industrial RT Ethernet.
>>>
>>> So, how does rtnet support EtherCAT?
>>
>> There was once the EtherCAT Master Library. IIRC, it was discontinued
>> and removed from the web for non-technical reasons.
>>
>>>
>>> Does it support PowerLink and Profinet?
>>
>> Not that I know, but that's not the point. You said your approach could
>> provide the technical foundation for such a class of use cases while I
>> doubt it would work as is.
> 
> But that is the point. Correct me please if I am wrong, but isn't
> rtnet a competitor or alternative to EtherCAT, PowerLink, and
> Profinet?

That's a common misunderstanding: RTnet is a networking stack with many
_optional_ components (like RTmac, RTcfg etc.). I would bet that it's
more frequently used today in minimal setups, i.e. just the core, some
driver, and either PF_PACKET or UDP/IP.

> 
> Unless rtnet implements (or helps to implement) these, it is kind of
> silly to say, "your way won't work, you should use rtnet instead."
> 
> I don't know PowerLink or Profinet, but I do know EtherCAT and IEC
> 61850, and those two can surely be implemented on the interface that I
> am talking about.

It works, but it won't give you a deterministic control loop as you
still have Linux in the game.

> 
> Also, the interface that I propose does not preclude the use of RTDM
> ISR in the driver, so I wonder what exactly bothers you about it?

That you claim it's simpler based on lacking features - like RT IRQ
support and the associated proper driver integration of that. Maybe it
turns out to be simpler, but that is far from proven yet.

I was simply hoping to collect some new ideas how to address the driver
maintenance issue in a better way but without dropping key features
needed for RT networking. Something like "let's add generic RT channels
to Linux upstream drivers and then only patch them to fit RTDM". Not sure
if that works, but it would come with a vision how to keep things more
maintainable.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [RFC 0/1] Class driver for raw Ethernet packets

2011-09-27 Thread Jan Kiszka
On 2011-09-27 17:10, Richard Cochran wrote:
> On Tue, Sep 27, 2011 at 02:20:43PM +0200, Jan Kiszka wrote:
>> On 2011-09-27 14:01, Richard Cochran wrote:
>>> Again, every MAC driver needs to be tastefully and wisely adapted. I
>>> don't necessarily need to avoid coalescing. The goal (for me) is *not*
>>> to provide deterministic Ethernet performance. Instead the RT packets
>>> should just be delivered ASAP.
>>
>> This is obviously the point I completely missed. And it makes the whole
>> thing fairly uninteresting IMHO. If you want to do Ethercat, PowerLink
>> or Profinet (RT), you do need a certain level of determinism along the
>> *whole* packet path. And for the latter two, you definitely need RT IRQ
>> support, Ethercat can be OK to poll in fast setups.
>>
>> From that POV, your approach is likely OK. But I doubt it's of generic
>> use, specifically for industrial RT Ethernet.
> 
> So, how does rtnet support EtherCAT?

There was once the EtherCAT Master Library. IIRC, it was discontinued
and removed from the web for non-technical reasons.

> 
> Does it support PowerLink and Profinet?

Not that I know, but that's not the point. You said your approach could
provide the technical foundation for such a class of use cases while I
doubt it would work as is.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [RFC 0/1] Class driver for raw Ethernet packets

2011-09-27 Thread Jan Kiszka
On 2011-09-27 14:01, Richard Cochran wrote:
> Again, every MAC driver needs to be tastefully and wisely adapted. I
> don't necessarily need to avoid coalescing. The goal (for me) is *not*
> to provide deterministic Ethernet performance. Instead the RT packets
> should just be delivered ASAP.

This is obviously the point I completely missed. And it makes the whole
thing fairly uninteresting IMHO. If you want to do Ethercat, PowerLink
or Profinet (RT), you do need a certain level of determinism along the
*whole* packet path. And for the latter two, you definitely need RT IRQ
support, Ethercat can be OK to poll in fast setups.

From that POV, your approach is likely OK. But I doubt it's of generic
use, specifically for industrial RT Ethernet.

Jan

PS: You do have a stack, even if you don't like it: driver, packet
layer, application. :)

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [RFC 0/1] Class driver for raw Ethernet packets

2011-09-27 Thread Jan Kiszka
On 2011-09-26 13:41, Richard Cochran wrote:
> On Fri, Sep 23, 2011 at 03:50:42PM +0200, Jan Kiszka wrote:
>> On 2011-09-23 13:02, Richard Cochran wrote:
>>> This patch adds a class driver for raw Ethernet drivers under
>>> Xenomai. The goal is to support industrial protocols such as EtherCAT
>>> and IEC 61850, where the "stack" is a user space program needing
>>> direct access at the packet level. The class driver offers interfaces
>>> for registration, buffer management, and packet sending/receiving.
>>>
>>> Although this patch is a kind of first draft, still I have it working
>>> on the Freescale P2020 with a real world application, with very good
>>> results. I can post a patch series for the gianfar driver in the ipipe
>>> tree, if anyone is interested.
>>>
>>> The user space interface is a character device and not a socket, simply
>>> because my applications will probably never need fancy socket
>>> options. The class driver could surely be made to offer a socket
>>> instead, but I think the character is sufficient.
>>
>> Many interesting interfaces already exist for standard PF_PACKET (e.g.
>> time stamping), plus you gain portability this way. So let's not create
>> something special here even if it's sufficient for the first use case.
> 
> Okay, so if the raw packet driver idea finds acceptance (outside of
> rtnet), then I am willing to recast the thing as a socket.
>  
>>>
>>> The class driver is clearly in the wrong directory within the source
>>> tree, but I put it there just to get started. It really does not fit
>>> to any of the other drivers, so it probably would need its own place
>>> under ksrc/drivers.
>>
>> New class, new directory.
>>
>> However, the key question is how this approach relates to RTnet. Right
>> now it's likely comparing apples to onions, but that may change as things
>> evolve in the right direction.
>>
>> Can you explain a bit about your NIC driver architecture and the
>> maintenance strategy? It looks like you are taking an approach of
>> patching existing drivers in-tree.
> 
> Yes, that is right.
> 
>> Anything that resolves the
>> maintenance pain we have with the RTnet model is already worth
>> considering. Did you take this approach intentionally? What pros and
>> cons do you see?
> 
> So, here is my story. Many months ago I needed a Xenomai program to
> send and receive raw packets. I looked at rtnet and decided that it
> was way too complicated. So, I added a rtdm character device to the
> Linux Ethernet MAC driver, a rather simple hack:
> 
>  drivers/net/arm/Makefile |4 +
>  drivers/net/arm/ixp4xx_eth.c |  316
>  +-
>  2 files changed, 319 insertions(+), 1 deletions(-)
> 
> That worked great. I did not bother to implement more than read/write,
> but I used a normal socket to set the multicast Rx list.
> 
> Now, I needed to port that application to a new platform. Again, I
> looked at rtnet (and the recently posted gianfar code), and again I
> came to the conclusion that I was better off with a direct driver
> hack.
> 
> Most of those 316 additional lines, above, were rtdm driver boiler
> plate, and this time I wanted to abstract the code that will surely
> be the same, like device registration, user space send/receive, and
> buffer management. (Because the eTSEC (gianfar) is much more complex,
> and because I wanted hardware packet classification, the driver patch
> is much larger than before.)
> 
> I admire the rtnet project. Based on my limited understanding of it,
> if I really needed deterministic Ethernet, then I would write a brand
> new driver for the eTSEC (perhaps using gianfar.c as a reference), but
> I would *not* try and adapt the Linux driver.
> 
> Conversely, if I were writing a brand new driver, I would surely offer
> it as an rtnet driver, even thought I only need raw packets.
> 
> However, for the sole requirement of raw Ethernet, I think my simple
> driver API is much easier to work into existing drivers. I also think
> it is way easier to maintain by rebasing on later changes.
> 
> IMHO, the pros of my approach are:
> 
> - Simple to implement new drivers.
> 
>   Compare my rtpacket.h with the rtnet driver headers to see what I
>   mean. Or, read rtnet/Documentation/README.drvporting and ask
>   yourself, is it easy to port a driver to rtnet?

I would be careful with deriving generic properties from a potentially
lucky first example. Already tried to apply your pattern on a standard
PCI NIC, e.g. the common Intel 8257x series?


Re: [Xenomai-core] P2020 support for ftrace with ipipe 2.12-01 and xeno 2.5.5.1

2011-09-23 Thread Jan Kiszka
On 2011-09-23 16:21, Jean-Michel Hautbois wrote:
> 2011/9/23 Jan Kiszka <jan.kis...@siemens.com>
> On 2011-09-23 15:58, Jean-Michel Hautbois wrote:
>> 2011/9/23 Gilles Chanteperdrix <gilles.chanteperd...@xenomai.org>
>>
>>> On 09/23/2011 11:49 AM, Jean-Michel Hautbois wrote:
>>>> OK, I have more traces (a few :)) :
>>>
>>> I meant the I-pipe tracer alone. The I-pipe tracer intead of other
>>> ftrace tracers.
>>>
>>>
>> Well, I think it works, I can do a cat /proc/ipipe/trace/max without any
>> error in a kernel which doesn't have any other ftrace function.
>> Do you have one test in particular in mind ?
> 
> Check
> http://git.kiszka.org/?p=ipipe.git;a=shortlog;h=refs/heads/queues/2.6.35-x86-trace
> to get an impression of what is required to get ftrace working on
> x86_64. It is not working on x86_32 e.g. as it depends on the arch
> providing an NMI-safe, ie. task-stack independent current_thread_info().
> That is also not yet the case on Power.
> 
> Interesting, and impressive. Do you plan to support it for PowerPC?
> Is there a git branch for that too?
> I can test on a P2020, if you are interested ;-).

Sorry, this is currently not on any local roadmap.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] P2020 support for ftrace with ipipe 2.12-01 and xeno 2.5.5.1

2011-09-23 Thread Jan Kiszka
On 2011-09-23 15:58, Jean-Michel Hautbois wrote:
> 2011/9/23 Gilles Chanteperdrix 
> 
>> On 09/23/2011 11:49 AM, Jean-Michel Hautbois wrote:
>>> OK, I have more traces (a few :)) :
>>
>> I meant the I-pipe tracer alone. The I-pipe tracer intead of other
>> ftrace tracers.
>>
>>
> Well, I think it works, I can do a cat /proc/ipipe/trace/max without any
> error in a kernel which doesn't have any other ftrace function.
> Do you have one test in particular in mind ?

Check
http://git.kiszka.org/?p=ipipe.git;a=shortlog;h=refs/heads/queues/2.6.35-x86-trace
to get an impression of what is required to get ftrace working on
x86_64. It is not working on x86_32 e.g. as it depends on the arch
providing an NMI-safe, ie. task-stack independent current_thread_info().
That is also not yet the case on Power.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [RFC 0/1] Class driver for raw Ethernet packets

2011-09-23 Thread Jan Kiszka
On 2011-09-23 13:02, Richard Cochran wrote:
> This patch adds a class driver for raw Ethernet drivers under
> Xenomai. The goal is to support industrial protocols such as EtherCAT
> and IEC 61850, where the "stack" is a user space program needing
> direct access at the packet level. The class driver offers interfaces
> for registration, buffer management, and packet sending/receiving.
> 
> Although this patch is a kind of first draft, still I have it working
> on the Freescale P2020 with a real world application, with very good
> results. I can post a patch series for the gianfar driver in the ipipe
> tree, if anyone is interested.
> 
> The user space interface is a character device and not a socket, simply
> because my applications will probably never need fancy socket
> options. The class driver could surely be made to offer a socket
> instead, but I think the character is sufficient.

Many interesting interfaces already exist for standard PF_PACKET (e.g.
time stamping), plus you gain portability this way. So let's not create
something special here even if it's sufficient for the first use case.
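
For illustration, per-packet receive timestamps on a plain PF_PACKET
socket are nearly free today (vanilla Linux API, sketched):

#include <sys/socket.h>
#include <arpa/inet.h>
#include <net/ethernet.h>
#include <netpacket/packet.h>
#include <unistd.h>

static int open_timestamped_packet_socket(void)
{
	int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
	int on = 1;

	if (fd < 0)
		return fd;

	/* the kernel then attaches an SCM_TIMESTAMPNS cmsg to every
	 * packet, readable via recvmsg() */
	if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPNS, &on, sizeof(on)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}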

> 
> The class driver is clearly in the wrong directory within the source
> tree, but I put it there just to get started. It really does not fit
> to any of the other drivers, so it probably would need its own place
> under ksrc/drivers.

New class, new directory.

However, the key question is how this approach relates to RTnet. Right
now it's likely comparing apples to onions, but that may change as things
evolve in the right direction.

Can you explain a bit about your NIC driver architecture and the
maintenance strategy? It looks like you are taking an approach of
patching existing drivers in-tree. Anything that resolves the
maintenance pain we have with the RTnet model is already worth
considering. Did you take this approach intentionally? What pros and
cons do you see?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Policy switching and XNOTHER maintenance

2011-09-18 Thread Jan Kiszka
On 2011-09-18 17:18, Gilles Chanteperdrix wrote:
> On 09/18/2011 05:13 PM, Jan Kiszka wrote:
>> On 2011-09-18 17:11, Jan Kiszka wrote:
>>> On 2011-09-18 17:10, Philippe Gerum wrote:
>>>> On Sun, 2011-09-18 at 16:34 +0200, Jan Kiszka wrote:
>>>>> On 2011-09-18 16:02, Philippe Gerum wrote:
>>>>>> On Fri, 2011-09-16 at 22:39 +0200, Gilles Chanteperdrix wrote:
>>>>>>> On 09/16/2011 10:13 PM, Gilles Chanteperdrix wrote:
>>>>>>>> On 09/11/2011 04:29 PM, Jan Kiszka wrote:
>>>>>>>>> On 2011-09-11 16:24, Gilles Chanteperdrix wrote:
>>>>>>>>>> On 09/11/2011 12:50 PM, Jan Kiszka wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> just looked into the hrescnt issue again, specifically the corner 
>>>>>>>>>>> case
>>>>>>>>>>> of a shadow thread switching from real-time policy to SCHED_OTHER.
>>>>>>>>>>
>>>>>>>>>> Doing this while holding a mutex looks invalid.
>>>>>>>>>
>>>>>>>>> Looking at POSIX e.g., is there anything in the spec that makes this
>>>>>>>>> invalid? If the kernel preserves or establishes proper priority
>>>>>>>>> boosting, I do not see what could break in principle.
>>>>>>>>>
>>>>>>>>> It is nothing I would design into some app, but we should somehow 
>>>>>>>>> handle
>>>>>>>>> it (doc update or code adjustments).
>>>>>>>>>
>>>>>>>>>> If we do not do it, the current code is valid.
>>>>>>>>>
>>>>>>>>> Except for its dependency on XNOTHER which is not updated on 
>>>>>>>>> RT->NORMAL
>>>>>>>>> transitions.
>>>>>>>>
>>>>>>>> The fact that this update did not take place made the code work. No 
>>>>>>>> negative rescnt could happen with that code.
>>>>>>>>
>>>>>>>> Anyway, here is a patch to allow switching back from RT to NORMAL, but 
>>>>>>>> send a SIGDEBUG to a thread attempting to release a mutex while its 
>>>>>>>> counter is already 0. We end up avoiding a big chunk of code that 
>>>>>>>> would 
>>>>>>>> have been useful for a really strange corner case.
>>>>>>>>
>>>>>>>
>>>>>>> Here comes version 2:
>>>>>>> diff --git a/include/nucleus/sched-idle.h b/include/nucleus/sched-idle.h
>>>>>>> index 6399a17..417170f 100644
>>>>>>> --- a/include/nucleus/sched-idle.h
>>>>>>> +++ b/include/nucleus/sched-idle.h
>>>>>>> @@ -39,6 +39,8 @@ extern struct xnsched_class xnsched_class_idle;
>>>>>>>  static inline void __xnsched_idle_setparam(struct xnthread *thread,
>>>>>>>const union 
>>>>>>> xnsched_policy_param *p)
>>>>>>>  {
>>>>>>> +   if (xnthread_test_state(thread, XNSHADOW))
>>>>>>> +   xnthread_clear_state(thread, XNOTHER);
>>>>>>> thread->cprio = p->idle.prio;
>>>>>>>  }
>>>>>>>  
>>>>>>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>>>>>>> index 71f655c..cc1cefa 100644
>>>>>>> --- a/include/nucleus/sched-rt.h
>>>>>>> +++ b/include/nucleus/sched-rt.h
>>>>>>> @@ -86,6 +86,12 @@ static inline void __xnsched_rt_setparam(struct 
>>>>>>> xnthread *thread,
>>>>>>>  const union 
>>>>>>> xnsched_policy_param *p)
>>>>>>>  {
>>>>>>> thread->cprio = p->rt.prio;
>>>>>>> +   if (xnthread_test_state(thread, XNSHADOW)) {
>>>>>>> +   if (thread->cprio)
>>>>>>> +   xnthread_clear_state(thread, XNOTHER);
>>>>>>> +   else
>>>>>>> +   xnthread_set_state(thread, XNOTHER);
>

Re: [Xenomai-core] Policy switching and XNOTHER maintenance

2011-09-18 Thread Jan Kiszka
On 2011-09-18 17:11, Jan Kiszka wrote:
> On 2011-09-18 17:10, Philippe Gerum wrote:
>> On Sun, 2011-09-18 at 16:34 +0200, Jan Kiszka wrote:
>>> On 2011-09-18 16:02, Philippe Gerum wrote:
>>>> On Fri, 2011-09-16 at 22:39 +0200, Gilles Chanteperdrix wrote:
>>>>> On 09/16/2011 10:13 PM, Gilles Chanteperdrix wrote:
>>>>>> On 09/11/2011 04:29 PM, Jan Kiszka wrote:
>>>>>>> On 2011-09-11 16:24, Gilles Chanteperdrix wrote:
>>>>>>>> On 09/11/2011 12:50 PM, Jan Kiszka wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> just looked into the hrescnt issue again, specifically the corner case
>>>>>>>>> of a shadow thread switching from real-time policy to SCHED_OTHER.
>>>>>>>>
>>>>>>>> Doing this while holding a mutex looks invalid.
>>>>>>>
>>>>>>> Looking at POSIX e.g., is there anything in the spec that makes this
>>>>>>> invalid? If the kernel preserves or establishes proper priority
>>>>>>> boosting, I do not see what could break in principle.
>>>>>>>
>>>>>>> It is nothing I would design into some app, but we should somehow handle
>>>>>>> it (doc update or code adjustments).
>>>>>>>
>>>>>>>> If we do not do it, the current code is valid.
>>>>>>>
>>>>>>> Except for its dependency on XNOTHER which is not updated on RT->NORMAL
>>>>>>> transitions.
>>>>>>
>>>>>> The fact that this update did not take place made the code work. No 
>>>>>> negative rescnt could happen with that code.
>>>>>>
>>>>>> Anyway, here is a patch to allow switching back from RT to NORMAL, but 
>>>>>> send a SIGDEBUG to a thread attempting to release a mutex while its 
>>>>>> counter is already 0. We end up avoiding a big chunk of code that would 
>>>>>> have been useful for a really strange corner case.
>>>>>>
>>>>>
>>>>> Here comes version 2:
>>>>> diff --git a/include/nucleus/sched-idle.h b/include/nucleus/sched-idle.h
>>>>> index 6399a17..417170f 100644
>>>>> --- a/include/nucleus/sched-idle.h
>>>>> +++ b/include/nucleus/sched-idle.h
>>>>> @@ -39,6 +39,8 @@ extern struct xnsched_class xnsched_class_idle;
>>>>>  static inline void __xnsched_idle_setparam(struct xnthread *thread,
>>>>>  const union xnsched_policy_param *p)
>>>>>  {
>>>>> + if (xnthread_test_state(thread, XNSHADOW))
>>>>> + xnthread_clear_state(thread, XNOTHER);
>>>>>   thread->cprio = p->idle.prio;
>>>>>  }
>>>>>  
>>>>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>>>>> index 71f655c..cc1cefa 100644
>>>>> --- a/include/nucleus/sched-rt.h
>>>>> +++ b/include/nucleus/sched-rt.h
>>>>> @@ -86,6 +86,12 @@ static inline void __xnsched_rt_setparam(struct 
>>>>> xnthread *thread,
>>>>>const union xnsched_policy_param *p)
>>>>>  {
>>>>>   thread->cprio = p->rt.prio;
>>>>> + if (xnthread_test_state(thread, XNSHADOW)) {
>>>>> + if (thread->cprio)
>>>>> + xnthread_clear_state(thread, XNOTHER);
>>>>> + else
>>>>> + xnthread_set_state(thread, XNOTHER);
>>>>> + }
>>>>>  }
>>>>>  
>>>>>  static inline void __xnsched_rt_getparam(struct xnthread *thread,
>>>>> diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c
>>>>> index 9a02e80..d1f 100644
>>>>> --- a/ksrc/nucleus/pod.c
>>>>> +++ b/ksrc/nucleus/pod.c
>>>>> @@ -1896,16 +1896,6 @@ int __xnpod_set_thread_schedparam(struct xnthread 
>>>>> *thread,
>>>>>   xnsched_putback(thread);
>>>>>  
>>>>>  #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>>> - /*
>>>>> -  * A non-real-time shadow may upgrade to real-time FIFO
>>>>> -  * scheduling, but the latter may never downgrade to
>>>>> -  * SCHED_NORMAL Xenomai-wise. In the valid case, we clear

Re: [Xenomai-core] Policy switching and XNOTHER maintenance

2011-09-18 Thread Jan Kiszka
On 2011-09-18 17:10, Philippe Gerum wrote:
> On Sun, 2011-09-18 at 16:34 +0200, Jan Kiszka wrote:
>> On 2011-09-18 16:02, Philippe Gerum wrote:
>>> On Fri, 2011-09-16 at 22:39 +0200, Gilles Chanteperdrix wrote:
>>>> On 09/16/2011 10:13 PM, Gilles Chanteperdrix wrote:
>>>>> On 09/11/2011 04:29 PM, Jan Kiszka wrote:
>>>>>> On 2011-09-11 16:24, Gilles Chanteperdrix wrote:
>>>>>>> On 09/11/2011 12:50 PM, Jan Kiszka wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> just looked into the hrescnt issue again, specifically the corner case
>>>>>>>> of a shadow thread switching from real-time policy to SCHED_OTHER.
>>>>>>>
>>>>>>> Doing this while holding a mutex looks invalid.
>>>>>>
>>>>>> Looking at POSIX e.g., is there anything in the spec that makes this
>>>>>> invalid? If the kernel preserves or establishes proper priority
>>>>>> boosting, I do not see what could break in principle.
>>>>>>
>>>>>> It is nothing I would design into some app, but we should somehow handle
>>>>>> it (doc update or code adjustments).
>>>>>>
>>>>>>> If we do not do it, the current code is valid.
>>>>>>
>>>>>> Except for its dependency on XNOTHER which is not updated on RT->NORMAL
>>>>>> transitions.
>>>>>
>>>>> The fact that this update did not take place made the code work. No 
>>>>> negative rescnt could happen with that code.
>>>>>
>>>>> Anyway, here is a patch to allow switching back from RT to NORMAL, but 
>>>>> send a SIGDEBUG to a thread attempting to release a mutex while its 
>>>>> counter is already 0. We end up avoiding a big chunk of code that would 
>>>>> have been useful for a really strange corner case.
>>>>>
>>>>
>>>> Here comes version 2:
>>>> diff --git a/include/nucleus/sched-idle.h b/include/nucleus/sched-idle.h
>>>> index 6399a17..417170f 100644
>>>> --- a/include/nucleus/sched-idle.h
>>>> +++ b/include/nucleus/sched-idle.h
>>>> @@ -39,6 +39,8 @@ extern struct xnsched_class xnsched_class_idle;
>>>>  static inline void __xnsched_idle_setparam(struct xnthread *thread,
>>>>   const union xnsched_policy_param *p)
>>>>  {
>>>> +  if (xnthread_test_state(thread, XNSHADOW))
>>>> +  xnthread_clear_state(thread, XNOTHER);
>>>>thread->cprio = p->idle.prio;
>>>>  }
>>>>  
>>>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>>>> index 71f655c..cc1cefa 100644
>>>> --- a/include/nucleus/sched-rt.h
>>>> +++ b/include/nucleus/sched-rt.h
>>>> @@ -86,6 +86,12 @@ static inline void __xnsched_rt_setparam(struct 
>>>> xnthread *thread,
>>>> const union xnsched_policy_param *p)
>>>>  {
>>>>thread->cprio = p->rt.prio;
>>>> +  if (xnthread_test_state(thread, XNSHADOW)) {
>>>> +  if (thread->cprio)
>>>> +  xnthread_clear_state(thread, XNOTHER);
>>>> +  else
>>>> +  xnthread_set_state(thread, XNOTHER);
>>>> +  }
>>>>  }
>>>>  
>>>>  static inline void __xnsched_rt_getparam(struct xnthread *thread,
>>>> diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c
>>>> index 9a02e80..d1f 100644
>>>> --- a/ksrc/nucleus/pod.c
>>>> +++ b/ksrc/nucleus/pod.c
>>>> @@ -1896,16 +1896,6 @@ int __xnpod_set_thread_schedparam(struct xnthread 
>>>> *thread,
>>>>xnsched_putback(thread);
>>>>  
>>>>  #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>> -  /*
>>>> -   * A non-real-time shadow may upgrade to real-time FIFO
>>>> -   * scheduling, but the latter may never downgrade to
>>>> -   * SCHED_NORMAL Xenomai-wise. In the valid case, we clear
>>>> -   * XNOTHER to reflect the change. Note that we keep handling
>>>> -   * non real-time shadow specifics in higher code layers, not
>>>> -   * to pollute the core scheduler with peculiarities.
>>>> -   */
>>>> -  if (sched_class == &xnsched_class_rt && sched_p

Re: [Xenomai-core] Policy switching and XNOTHER maintenance

2011-09-18 Thread Jan Kiszka
On 2011-09-18 16:02, Philippe Gerum wrote:
> On Fri, 2011-09-16 at 22:39 +0200, Gilles Chanteperdrix wrote:
>> On 09/16/2011 10:13 PM, Gilles Chanteperdrix wrote:
>>> On 09/11/2011 04:29 PM, Jan Kiszka wrote:
>>>> On 2011-09-11 16:24, Gilles Chanteperdrix wrote:
>>>>> On 09/11/2011 12:50 PM, Jan Kiszka wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> just looked into the hrescnt issue again, specifically the corner case
>>>>>> of a shadow thread switching from real-time policy to SCHED_OTHER.
>>>>>
>>>>> Doing this while holding a mutex looks invalid.
>>>>
>>>> Looking at POSIX e.g., is there anything in the spec that makes this
>>>> invalid? If the kernel preserves or establishes proper priority
>>>> boosting, I do not see what could break in principle.
>>>>
>>>> It is nothing I would design into some app, but we should somehow handle
>>>> it (doc update or code adjustments).
>>>>
>>>>> If we do not do it, the current code is valid.
>>>>
>>>> Except for its dependency on XNOTHER which is not updated on RT->NORMAL
>>>> transitions.
>>>
>>> The fact that this update did not take place made the code work. No 
>>> negative rescnt could happen with that code.
>>>
>>> Anyway, here is a patch to allow switching back from RT to NORMAL, but 
>>> send a SIGDEBUG to a thread attempting to release a mutex while its 
>>> counter is already 0. We end up avoiding a big chunk of code that would 
>>> have been useful for a really strange corner case.
>>>
>>
>> Here comes version 2:
>> diff --git a/include/nucleus/sched-idle.h b/include/nucleus/sched-idle.h
>> index 6399a17..417170f 100644
>> --- a/include/nucleus/sched-idle.h
>> +++ b/include/nucleus/sched-idle.h
>> @@ -39,6 +39,8 @@ extern struct xnsched_class xnsched_class_idle;
>>  static inline void __xnsched_idle_setparam(struct xnthread *thread,
>> const union xnsched_policy_param *p)
>>  {
>> +if (xnthread_test_state(thread, XNSHADOW))
>> +xnthread_clear_state(thread, XNOTHER);
>>  thread->cprio = p->idle.prio;
>>  }
>>  
>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>> index 71f655c..cc1cefa 100644
>> --- a/include/nucleus/sched-rt.h
>> +++ b/include/nucleus/sched-rt.h
>> @@ -86,6 +86,12 @@ static inline void __xnsched_rt_setparam(struct xnthread 
>> *thread,
>>   const union xnsched_policy_param *p)
>>  {
>>  thread->cprio = p->rt.prio;
>> +if (xnthread_test_state(thread, XNSHADOW)) {
>> +if (thread->cprio)
>> +xnthread_clear_state(thread, XNOTHER);
>> +else
>> +xnthread_set_state(thread, XNOTHER);
>> +}
>>  }
>>  
>>  static inline void __xnsched_rt_getparam(struct xnthread *thread,
>> diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c
>> index 9a02e80..d1f 100644
>> --- a/ksrc/nucleus/pod.c
>> +++ b/ksrc/nucleus/pod.c
>> @@ -1896,16 +1896,6 @@ int __xnpod_set_thread_schedparam(struct xnthread 
>> *thread,
>>  xnsched_putback(thread);
>>  
>>  #ifdef CONFIG_XENO_OPT_PERVASIVE
>> -/*
>> - * A non-real-time shadow may upgrade to real-time FIFO
>> - * scheduling, but the latter may never downgrade to
>> - * SCHED_NORMAL Xenomai-wise. In the valid case, we clear
>> - * XNOTHER to reflect the change. Note that we keep handling
>> - * non real-time shadow specifics in higher code layers, not
>> - * to pollute the core scheduler with peculiarities.
>> - */
>> -if (sched_class == &xnsched_class_rt && sched_param->rt.prio > 0)
>> -xnthread_clear_state(thread, XNOTHER);
>>  if (propagate) {
>>  if (xnthread_test_state(thread, XNRELAX))
>>  xnshadow_renice(thread);
>> diff --git a/ksrc/nucleus/sched-sporadic.c b/ksrc/nucleus/sched-sporadic.c
>> index fd37c21..ffc9bab 100644
>> --- a/ksrc/nucleus/sched-sporadic.c
>> +++ b/ksrc/nucleus/sched-sporadic.c
>> @@ -258,6 +258,8 @@ static void xnsched_sporadic_setparam(struct xnthread 
>> *thread,
>>  }
>>  }
>>  
>> +if (xnthread_test_state(thread, XNSHADOW))
>> +xnthread_clear_state(thread, XNOTHER);
>>  

Re: [Xenomai-core] Policy switching and XNOTHER maintenance

2011-09-11 Thread Jan Kiszka
On 2011-09-11 16:24, Gilles Chanteperdrix wrote:
> On 09/11/2011 12:50 PM, Jan Kiszka wrote:
>> Hi all,
>>
>> just looked into the hrescnt issue again, specifically the corner case
>> of a shadow thread switching from real-time policy to SCHED_OTHER.
> 
> Doing this while holding a mutex looks invalid.

Looking at POSIX e.g., is there anything in the spec that makes this
invalid? If the kernel preserves or establishes proper priority
boosting, I do not see what could break in principle.

It is nothing I would design into some app, but we should somehow handle
it (doc update or code adjustments).
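
For the record, the corner case boils down to this sequence (a
contrived sketch; assume the mutex was set up with pthread_mutex_init()
elsewhere, since PTHREAD_MUTEX_INITIALIZER is not supported):

#include <pthread.h>
#include <sched.h>

static pthread_mutex_t lock;	/* pthread_mutex_init()ed at startup */

static void fifo_to_other_while_locked(void)
{
	struct sched_param param = { .sched_priority = 0 };

	/* running SCHED_FIFO, become owner of a mutex ... */
	pthread_mutex_lock(&lock);

	/* ... then drop to SCHED_OTHER; XNOTHER is not updated here */
	pthread_setschedparam(pthread_self(), SCHED_OTHER, &param);

	/* release as a now-SCHED_OTHER shadow: this is where the
	 * hrescnt/XNOTHER bookkeeping can go out of sync */
	pthread_mutex_unlock(&lock);
}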

> If we do not do it, the current code is valid.

Except for its dependency on XNOTHER which is not updated on RT->NORMAL
transitions.

Jan




___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] Policy switching and XNOTHER maintenance

2011-09-11 Thread Jan Kiszka
On 2011-09-11 12:50, Jan Kiszka wrote:
> Hi all,
> 
> just looked into the hrescnt issue again, specifically the corner case
> of a shadow thread switching from real-time policy to SCHED_OTHER.
> 
> Looks like we don't support this at all ATM:
> 
> do_setsched_event ignores events that enabled SCHED_OTHER, and the
> XNOTHER flag is only set on xnshadow_map, not updated anywhere else.
> Fixing this is not straightforward as that flag resides in a state
> variable that is owned by the thread (ie. updated non-atomically) while
> do_setsched_event can also be called over different contexts.

OK, it just takes nklock to update the thread state.

> 
> Or am I missing something?

From __xnpod_set_thread_schedparam:

"A non-real-time shadow may upgrade to real-time FIFO
scheduling, but the latter may never downgrade to
SCHED_NORMAL Xenomai-wise. In the valid case, we clear
XNOTHER to reflect the change."

What prevents us from setting XNOTHER if the priority drops to 0, i.e.
the Linux task re-enters SCHED_NORMAL (while Xenomai will still schedule
it via its RT class of course)? We cannot deny such a transition on the
Linux side anyway, can we?
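
The direction that suggests, sketched (untested, under nklock as noted
above, and with the handler's signature assumed from memory - an
illustration, not a submitted patch):

static void do_setsched_event(struct task_struct *p, int priority)
{
	struct xnthread *thread = xnshadow_thread(p);
	spl_t s;

	if (!thread)
		return;

	xnlock_get_irqsave(&nklock, s);
	if (priority == 0 && p->policy == SCHED_NORMAL)
		xnthread_set_state(thread, XNOTHER);	/* back to prio 0 */
	else
		xnthread_clear_state(thread, XNOTHER);
	xnlock_put_irqrestore(&nklock, s);

	/* propagation of the Xenomai-side scheduling parameters is left
	 * out here; the real handler has more to do */
}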

Jan




signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


[Xenomai-core] Policy switching and XNOTHER maintenance

2011-09-11 Thread Jan Kiszka
Hi all,

just looked into the hrescnt issue again, specifically the corner case
of a shadow thread switching from real-time policy to SCHED_OTHER.

Looks like we don't support this at all ATM:

do_setsched_event ignores events that enabled SCHED_OTHER, and the
XNOTHER flag is only set on xnshadow_map, not updated anywhere else.
Fixing this is not straightforward as that flag resides in a state
variable that is owned by the thread (ie. updated non-atomically) while
do_setsched_event can also be called over different contexts.

Or am I missing something?

In the light of this, I would vote for reverting f9bfab3457 ("only
update rescnt for XNOTHER threads") as it only adds conditional branches
to the hot paths without solving the issue.

Jan



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] xenomai-core ftrace

2011-09-04 Thread Jan Kiszka
On 2011-09-05 04:32, rainbow wrote:
> hi,
> I did a test of ftrace-xenomai on x86_64 just now:
> the environment is: OS: debian 6.0, gcc-4.1
> I use the origin/queues/2.6.37-x86 branch of git://git.kiszka.org/ipipe-2.6
> for the linux kernel and the origin/queues/ftrace branch of
> git://git.xenomai.org/xenomai-jki.git
> then I follow this procedure:
> 1, cd xenomai git
> 2, ./scripts/prepare-kernel.sh --arch=x86_64 --linux=path_to_2.6.37-x86
> 3, cd path_to_2.6.37_x86
> 4, make menuconfig
> then enable ftrace in the kernel hacking options.
> 5, install the kernel and reboot the machine.
> 6, do the ftrace test
> 6.1 mount -t debugfs nodev /sys/kernel/debug
> 6.2 cd /sys/kernel/debug/tracing
> 6.3 echo function > current_tracer
> At this point the machine reboots.
> Is there something wrong with the above procedure? Thanks!

Only queues/2.6.35-x86-trace contains ftrace support.

Jan



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] xenomai-core ftrace

2011-09-04 Thread Jan Kiszka
On 2011-09-04 15:16, rainbow wrote:
> 2011/9/4 Jan Kiszka 
> 
>> On 2011-09-04 14:21, rainbow wrote:
>>> Is the ipipe patch the same as patch like
>>> adeos-ipipe-2.6.37.6-x86-2.9-02.patch,
>>
>> Except that the trace branch is for 2.6.35, yes. More precisely it is
>> now the same, I just pushed the latest version that includes two more
>> backported ipipe fixes.
>>
>>> I know the latter is the xenomai patch
>>> and after I patch it, I can see the "Real-time sub-system  ---> " option. But
>>> if I use 2.6.35-x86-trace, which contains the ipipe patch, there is no such
>>> option.
>>
>> That menu option is introduced by Xenomai, ie. after running
>> prepare-kernel.sh. You likely forgot that step.
>>
> 
> Yes, I forgot the step. So I think I only have to run prepare-kernel.sh
> --arch=x86_64 --linux=2.6.35-x86-trace, I do not need the --adeos option
> because the 2.6.35-x86-trace branch contains the ipipe patch.
> 
>> Note again that you have to use a Xenomai tree with the required ftrace
>> patches on top if you want Xenomai to generate ftrace events as well.
> 
> "Xenomai tree with required ftrace patches on top" you mean the branch
> "remotes/origin/queues/ftrace"?

Yep. I just pushed a rebased version of current git master.

Jan



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] xenomai-core ftrace

2011-09-04 Thread Jan Kiszka
On 2011-09-04 14:21, rainbow wrote:
> Is the ipipe patch the same as a patch like
> adeos-ipipe-2.6.37.6-x86-2.9-02.patch,

Except that the trace branch is for 2.6.35, yes. More precisely it is
now the same, I just pushed the latest version that includes two more
backported ipipe fixes.

> I know the latter is the xenomai patch
> and after I patch it, I can see the "Real-time sub-system  ---> " option. But if
> I use 2.6.35-x86-trace, which contains the ipipe patch, there is no such option.

That menu option is introduced by Xenomai, ie. after running
prepare-kernel.sh. You likely forgot that step.

Note again that you have to use a Xenomai tree with the required ftrace
patches on top if you want Xenomai to generate ftrace events as well.

> Another problem is that there are so many xenomai gits, how can I download
> the correct git?

By cloning the git repository you obtain all available branches. You
just need to checkout the desired one afterward.

> I am a newbie to xenomai and I am sorry to ask so many questions but I want
> to do something on xenomai :). Thank you for your detailed answers.

Setting up ftrace for Xenomai is not necessarily a newbie task, but I
think I know the background of this. :)

Jan



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] xenomai-core ftrace

2011-09-04 Thread Jan Kiszka
On 2011-09-04 13:49, rainbow wrote:
> you mean I use the remotes/origin/queues/2.6.37-x86 branch and the ipipe
> patch for 2.6.37, then install them on x86_64, and ftrace can work? I will
> have a try, thank you!

Use the 2.6.35-x86-trace, it already contains the ipipe patch, and build
it for x86-64.

Jan



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] xenomai-core ftrace

2011-09-04 Thread Jan Kiszka
On 2011-09-04 07:10, rainbow wrote:
> Sorry to reply so late, I did a test about install ftrace on xenomai. the
> following is my procedure:
> #git://git.xenomai.org/xenomai-jki.git queues/ftrace
> #git://git.kiszka.org/ipipe-2.6 queues/2.6.35-x86-trace
> #cd queues/ftrace
> #git checkout -b remotes/origin/queues/ftrace
>  origin/queues/2.6.35-x86-trace  //change to the ftrace xenomai branch
> #cd ../2.6.35-x86-trace
> #git checkout
> -b origin/queues/2.6.35-x86-trace origin/queues/2.6.35-x86-trace
> #cd ../ftrace
> #./scripts/prepare-kernel.sh  --arch=i386
> --adeos=ksrc/arch/x86/patches/adeos-ipipe-2.6.35.9-x86-2.8-04.patch
> --linux=../2.6.35-x86-trace/
> #cd /2.6.35-x86-trace/
> 
> then I compile the kernel but I get the following error message:
> arch/x86/kernel/ipipe.c:851: error: conflicting types for ‘update_vsyscall’
> include/linux/clocksource.h:316: note: previous declaration of
> ‘update_vsyscall’ was here
> make[2]: *** [arch/x86/kernel/ipipe.o] Error 1
> make[1]: *** [arch/x86/kernel] Error 2
> make: *** [arch/x86] Error 2

That's a build issues of the underlying old ipipe patch. However, it's
x86-32 only. And as the documentation stated, only x86-64 is supported
by the ftrace patches. So build for 64 bit instead.

Jan



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] xenomai-core ftrace

2011-09-03 Thread Jan Kiszka
On 2011-09-03 04:52, rainbow wrote:
> hi,all,I want to use ftrace in xenomai-2.5.6,but when I use git://
> git.kiszka.org/ipipe.git queues/2.6.35-x86-trace to get the linux
> kernel,there is no option about xenomai or ipipe . If I want to patch the
> xenomai patch,there are some problem. How should I use ftrace on
> xenomai?Thanks!

First of all, make sure to read README.INSTALL in the Xenomai tree for
the basic installation procedure.

That git branch above replaces the installation step of picking a
vanilla Linux source tree and applying the ipipe patch to it (if there
is no ipipe option in the kernel config, you probably haven't checked out
the right branch yet).

The next step would be running Xenomai's prepare-kernel.sh, in this case
using a Xenomai tree that has the required ftrace patches, see

http://permalink.gmane.org/gmane.linux.real-time.xenomai.devel/7966

Jan





Re: [Xenomai-core] Xenomai 2.6.0, or -rc1?

2011-08-26 Thread Jan Kiszka
On 2011-08-26 20:07, Gilles Chanteperdrix wrote:
> On 08/26/2011 03:05 PM, Jan Kiszka wrote:
>> On 2011-08-26 14:34, Gilles Chanteperdrix wrote:
>>>
>>> Hi,
>>>
>>> I think it is about time we release Xenomai 2.6.0. Has anyone anything
>>> pending (maybe Alex)? Should we release an -rc first?
>>
>> No patches ATM, but [1] is still an open bug - a bug that affects the ABI.
>>
>> Jan
>>
>> [1] http://thread.gmane.org/gmane.linux.real-time.xenomai.devel/8343
>>
> 
> I had forgotten about this one. So, the only real problem is if a
> SCHED_NOTOTHER thread switches to SCHED_OTHER; this appears to be a
> corner case, so I wonder if you should not simply add special
> treatment, only for this corner case.
> 
> What I have in mind is keeping a list of xnsynch in kernel-space (this
> basically means having one more xnholder_t in the xnsynch structure), and
> when we trip the corner case (thread with SCHED_FIFO switches to
> SCHED_OTHER), walk the list to find how many xnsynch the thread owns
> (we have that info in kernel-space) and set the refcnt accordingly.
> 
> Or does it still sound overkill?
> 

Mmh, need to think about it. Yeah, we do not support
PTHREAD_MUTEX_INITIALIZER, so we do not share that part of the problem
with futexes.

If we have all objects and can explore ownership, we can also implement
robust mutexes this way, i.e. waiter signaling when the owner dies.
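
To make the bookkeeping concrete, here is a rough model of that idea
(all names invented for illustration; the nucleus would use xnsynch
and xnholder_t instead):

/* Rough sketch only: every owned synch object sits on a per-thread
 * list, so the owner refcount can be recomputed when a SCHED_FIFO
 * thread switches to SCHED_OTHER. Invented names throughout. */
struct holder {
	struct holder *next, *prev;
};

struct synch {
	struct holder link;		/* links into owner->owned */
	struct thread *owner;
};

struct thread {
	struct holder owned;		/* list head of owned synchs */
	int owned_count;		/* the refcnt to rebuild */
};

static void rebuild_owned_count(struct thread *t)
{
	struct holder *h;
	int n = 0;

	for (h = t->owned.next; h != &t->owned; h = h->next)
		n++;
	t->owned_count = n;
}

Walking the list only in the SCHED_FIFO-to-SCHED_OTHER corner case
would keep the hot paths untouched, which is what makes the proposal
cheap.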

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Xenomai 2.6.0, or -rc1?

2011-08-26 Thread Jan Kiszka
On 2011-08-26 14:34, Gilles Chanteperdrix wrote:
> 
> Hi,
> 
> I think it is about time we release Xenomai 2.6.0. Has anyone anything
> pending (maybe Alex)? Should we release an -rc first?

No patches ATM, but [1] is still an open bug - a bug that affects the ABI.

Jan

[1] http://thread.gmane.org/gmane.linux.real-time.xenomai.devel/8343

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : rt_print: Provide rt_puts

2011-07-31 Thread Jan Kiszka
On 2011-07-31 19:46, Gilles Chanteperdrix wrote:
> On 07/31/2011 07:42 PM, Jan Kiszka wrote:
>> On 2011-07-31 19:21, Gilles Chanteperdrix wrote:
>>> On 07/31/2011 06:49 PM, GIT version control wrote:
>>>> +int rt_puts(const char *s)
>>>> +{
>>>> +  return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, NULL);
>>>> +}
>>>
>>> gcc for ARM chokes here: it says that NULL cannot be converted to a
>>> va_list, no matter how I try it.
>>
>> Hmm. Does this work?
>>
>> diff --git a/src/skins/common/rt_print.c b/src/skins/common/rt_print.c
>> index 186de48..52538d8 100644
>> --- a/src/skins/common/rt_print.c
>> +++ b/src/skins/common/rt_print.c
>> @@ -243,7 +243,9 @@ int rt_printf(const char *format, ...)
>>
>>  int rt_puts(const char *s)
>>  {
>> -return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, NULL);
>> +va_list dummy;
>> +
>> +return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, dummy);
>>  }
>>
>>  void rt_syslog(int priority, const char *format, ...)
>>
>>
>> Not really beautiful as well, I know.
> 
> It seems to work now, but some later version of gcc may decide to warn
> us that this argument is used without being initialized...

Yes. I've pushed a cleaner version.
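
For illustration, one shape such a cleanup can take is a variadic
trampoline (a sketch only, not necessarily the change that was pushed;
print_to_buffer()'s signature and RT_PRINT_MODE_PUTS are assumed from
the diff above):

#include <stdarg.h>
#include <stdio.h>

/* Sketch: route rt_puts() through a variadic helper so that a
 * genuine, initialized va_list reaches print_to_buffer(). */
static int vputs_to_buffer(const char *s, ...)
{
	va_list args;
	int res;

	va_start(args, s);
	res = print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, args);
	va_end(args);

	return res;
}

int rt_puts(const char *s)
{
	return vputs_to_buffer(s);
}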

Jan





Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : rt_print: Provide rt_puts

2011-07-31 Thread Jan Kiszka
On 2011-07-31 19:21, Gilles Chanteperdrix wrote:
> On 07/31/2011 06:49 PM, GIT version control wrote:
>> +int rt_puts(const char *s)
>> +{
>> +return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, NULL);
>> +}
> 
> gcc for ARM chokes here: it says that NULL cannot be converted to a
> va_list, no matter how I try it.

Hmm. Does this work?

diff --git a/src/skins/common/rt_print.c b/src/skins/common/rt_print.c
index 186de48..52538d8 100644
--- a/src/skins/common/rt_print.c
+++ b/src/skins/common/rt_print.c
@@ -243,7 +243,9 @@ int rt_printf(const char *format, ...)

 int rt_puts(const char *s)
 {
-   return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, NULL);
+   va_list dummy;
+
+   return print_to_buffer(stdout, 0, RT_PRINT_MODE_PUTS, s, dummy);
 }

 void rt_syslog(int priority, const char *format, ...)


Not really beautiful as well, I know.

Jan





Re: [Xenomai-core] __wrap_clock_gettime

2011-07-29 Thread Jan Kiszka
On 2011-07-28 18:56, Jan Kiszka wrote:
> On 2011-07-27 20:49, Gilles Chanteperdrix wrote:
>> On 07/22/2011 05:04 PM, Jan Kiszka wrote:
>>> Hi Gilles,
>>>
>>> pulling assert_context.c into the common libxenomai created a problem
>>> around picking the right __wrap_clock_gettime. As libpthread_rt depends
>>> on libxenomai, the latter is loaded first and defines the debug version
>>> of __wrap_clock_gettime as the default. There is no chance to pull the
>>> posix skin implementation.
>>>
>>> I don't have a nice idea yet how to resolve this. Options are:
>>>  - drop __wrap_clock_gettime from assert_context.c (affects any skin
>>>user != posix)
>>>  - put assert_context stuff into separate library again
>>>  - put __wrap_clock_gettime in all skin libs != posix
>>>
>>> I'm favoring the simplest approach ATM, ie. the first one. Other ideas?
>>
>> I agree, but I would have thought __attribute__((weak)) takes care of
>> this issue.
> 
> The point is that once you have pulled that weak symbol into a
> process, the dynamic loader won't update it if a non-weak version is
> pulled in via dlopen.

Fiddling with all options again, it turned out that the issue is also
reproducible without any dlopen. This ordering is fine:

-lpthread_rt -lxenomai

This one does not work:

-lxenomai -lpthread_rt

Now the problematic ordering can be achieved in many ways, in our case
by linking a top-level process first against a library with
native+xenomai dependencies and then against one with pthread_rt+xenomai
deps. That is resolvable by reordering, but only if you maintain this
meta-information consistently (IOW, it's doomed to break accidentally
again).

So I've pushed a patch that replaces the code with a comment that
explains the non-existence.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] __wrap_clock_gettime

2011-07-28 Thread Jan Kiszka
On 2011-07-28 18:56, Jan Kiszka wrote:
> On 2011-07-27 20:49, Gilles Chanteperdrix wrote:
>> On 07/22/2011 05:04 PM, Jan Kiszka wrote:
>>> Hi Gilles,
>>>
>>> pulling assert_context.c into the common libxenomai created a problem
>>> around picking the right __wrap_clock_gettime. As libpthread_rt depends
>>> on libxenomai, the latter is loaded first and defines the debug version
>>> of __wrap_clock_gettime as the default. There is no chance to pull the
>>> posix skin implementation.
>>>
>>> I don't have a nice idea yet how to resolve this. Options are:
>>>  - drop __wrap_clock_gettime from assert_context.c (affects any skin
>>>user != posix)
>>>  - put assert_context stuff into separate library again
>>>  - put __wrap_clock_gettime in all skin libs != posix
>>>
>>> I'm favoring the simplest approach ATM, ie. the first one. Other ideas?
>>
>> I agree, but I would have thought __attribute__((weak)) takes care of
>> this issue.
> 
> The point is that once you have pulled that weak symbol into a
> process, the dynamic loader won't update it if a non-weak version is
> pulled in via dlopen.

Mmh, maybe we have a fourth option: disabling the weak version if
CONFIG_XENO_LIBS_DLOPEN is set. OTOH, the correlation may not be obvious
to users. So that result could be even more confusing.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] __wrap_clock_gettime

2011-07-28 Thread Jan Kiszka
On 2011-07-27 20:49, Gilles Chanteperdrix wrote:
> On 07/22/2011 05:04 PM, Jan Kiszka wrote:
>> Hi Gilles,
>>
>> pulling assert_context.c into the common libxenomai created a problem
>> around picking the right __wrap_clock_gettime. As libpthread_rt depends
>> on libxenomai, the latter is loaded first and defines the debug version
>> of __wrap_clock_gettime as the default. There is no chance to pull the
>> posix skin implementation.
>>
>> I don't have a nice idea yet how to resolve this. Options are:
>>  - drop __wrap_clock_gettime from assert_context.c (affects any skin
>>user != posix)
>>  - put assert_context stuff into separate library again
>>  - put __wrap_clock_gettime in all skin libs != posix
>>
>> I'm favoring the simplest approach ATM, ie. the first one. Other ideas?
> 
> I agree, but I would have thought __attribute__((weak)) takes care of
> this issue.

The point is that once you have pulled that weak symbol into a
process, the dynamic loader won't update it if a non-weak version is
pulled in via dlopen.
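
A minimal sketch of the pattern in question (assuming the usual ld
--wrap convention with __wrap_/__real_ prefixes; this is not the
actual libxenomai code):

#include <time.h>

/* Resolved by the linker to the original symbol under --wrap. */
int __real_clock_gettime(clockid_t clk_id, struct timespec *tp);

/* Weak fallback in the common library. Once the dynamic linker has
 * bound callers to this definition, a strong version brought in
 * later via dlopen() will not rebind them, which is exactly the
 * problem described above. */
__attribute__((weak))
int __wrap_clock_gettime(clockid_t clk_id, struct timespec *tp)
{
	return __real_clock_gettime(clk_id, tp);
}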

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [RFC] Fixes for domain migration races

2011-07-28 Thread Jan Kiszka
On 2011-07-27 20:44, Gilles Chanteperdrix wrote:
> On 07/19/2011 08:44 AM, Jan Kiszka wrote:
>> Hi,
>>
>> I've just uploaded my upstream queue that mostly deals with the various
>> races I found in the domain migration code.
>>
>> One of my concerns raised earlier turned out to be for no reason: We do
>> not allow Linux to wake up a task that has TASK_ATOMICSWITCH set. So the
>> deletion race can indeed be fixed by the patch I sent earlier.
> 
> So, I still have the same question: is not the solution of synchronizing
> with the gatekeeper as soon as we get out from schedule in secondary
> mode better than waiting for the task_exit callback? It looks more correct,
> and it avoids gksched.

Yes, I was on the wrong track wrt wakeup races during the early
migration phase.

It is a possible and valid scenario that the task returns from
schedule() without being migrated. That can only happen if a signal was
queued in the meantime. The task will not be woken up again, that is
prevented by ATOMICSWITCH, but it will check for pending signals itself
before falling asleep. In that case it will enter TASK_RUNNING again and
return either before the gatekeeper could run or, on SMP, may continue
in parallel on a different CPU.

What saves us now from the fatal scenario, the task running while the
gatekeeper resumes its Xenomai part, is that the TASK_INTERRUPTIBLE
state was left. And if we wait for the gatekeeper to realize this like you
suggested, we ensure that neither the object is deleted too early nor
TASK_INTERRUPTIBLE is reentered again by doing Linux work.

I've cleaned up my queue correspondingly and just pushed it.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Debugging Xenomai kernel

2011-07-27 Thread Jan Kiszka
On 2011-07-27 18:13, zenati wrote:
> On 07/27/2011 02:12 PM, Jan Kiszka wrote:
>> On 2011-07-27 14:01, zenati wrote:
>>> On 07/27/2011 11:23 AM, Jan Kiszka wrote:
>>>> On 2011-07-26 13:54, zenati wrote:
>>>>> Dear,
>>>>>
>>>>> I'm trying to debug the Xenomai kernel with GDB and QEMU.
>>>> QEMU in emulation or KVM mode?
>>>>
>>> I'm using QEMU in emulation mode.
>> What's your target arch? x86-64, x86-32, or some ARM or PPC etc.?
> x86-32
>>>>> So, I launch
>>>>> Xenomai in QEMU and make it wait for GDB. Then I start GDB and
>>>>> connect it to QEMU, set a breakpoint at the desired function to debug,
>>>>> and start Xenomai. If the function belongs to the Linux kernel, the
>>>>> execution stops at the breakpoint. But if the function belongs to the
>>>>> Xenomai nucleus, GDB recognizes the breakpoint but execution doesn't
>>>>> stop.
>>>> Is Xenomai built as module or part of the main kernel image? The latter
>>>> is simpler to handle as you don't have to deal with modules as symbol
>>>> sources, find out their addresses, and load them at the right location.
>>>>
>>> Xenomai is built as a part of the main kernel.
>> OK. Check during runtime of the guest if the disassembly at the desired
>> breakpoint address matches the compiled code. Maybe there is some
>> accidental discrepancy.
> The breakpoints correctly match the compiled code. Anyway, it must stop
> at the breakpoint, no?

...if the guest actually executes that address.

I haven't used x86 QEMU in emulation mode for quite a while to debug
something. Maybe we have a regression there. What's your QEMU version?
Do you have a chance to test with a KVM host? It's faster anyway.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Debugging Xenomai kernel

2011-07-27 Thread Jan Kiszka
On 2011-07-27 14:01, zenati wrote:
> On 07/27/2011 11:23 AM, Jan Kiszka wrote:
>> On 2011-07-26 13:54, zenati wrote:
>>> Dear,
>>>
>>> I'm trying to debug the Xenomai kernel with GDB and QEMU.
>> QEMU in emulation or KVM mode?
>>
> I'm using QEMU in emulation mode.

What's your target arch? x86-64, x86-32, or some ARM or PPC etc.?

>>> So, I launch
>>> Xenomai in QEMU and make it wait for GDB. Then I start GDB and
>>> connect it to QEMU, set a breakpoint at the desired function to debug,
>>> and start Xenomai. If the function belongs to the Linux kernel, the
>>> execution stops at the breakpoint. But if the function belongs to the
>>> Xenomai nucleus, GDB recognizes the breakpoint but execution doesn't
>>> stop.
>> Is Xenomai built as module or part of the main kernel image? The latter
>> is simpler to handle as you don't have to deal with modules as symbol
>> sources, find out their addresses, and load them at the right location.
>>
> Xenomai is built as a part of the main kernel.

OK. Check during runtime of the guest if the disassembly at the desired
breakpoint address matches the compiled code. Maybe there is some
accidental discrepancy.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] Debugging Xenomai kernel

2011-07-27 Thread Jan Kiszka
On 2011-07-26 13:54, zenati wrote:
> Dear,
> 
> I'm trying to debug the Xenomai kernel with GDB and QEMU.

QEMU in emulation or KVM mode?

> So, I launch
> Xenomai in QEMU and make it wait for GDB. Then I start GDB and
> connect it to QEMU, set a breakpoint at the desired function to debug,
> and start Xenomai. If the function belongs to the Linux kernel, the
> execution stops at the breakpoint. But if the function belongs to the
> Xenomai nucleus, GDB recognizes the breakpoint but execution doesn't
> stop.

Is Xenomai built as module or part of the main kernel image? The latter
is simpler to handle as you don't have to deal with modules as symbol
sources, find out their addresses, and load them at the right location.

> 
> Is it possible to debug Xenomai with GDB and QEMU?

Definitely.

> Is it the right way to debug?
> 
> Thank you for your attention and your help.
> Sincerely,
> 
> Omar ZENATI

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Xenomai-core] __wrap_clock_gettime

2011-07-22 Thread Jan Kiszka
Hi Gilles,

pulling assert_context.c into the common libxenomai created a problem
around picking the right __wrap_clock_gettime. As libpthread_rt depends
on libxenomai, the latter is loaded first and defines the debug version
of __wrap_clock_gettime as the default. There is no chance to pull the
posix skin implementation.

I don't have a nice idea yet how to resolve this. Options are:
 - drop __wrap_clock_gettime from assert_context.c (affects any skin
   user != posix)
 - put assert_context stuff into separate library again
 - put __wrap_clock_gettime in all skin libs != posix

I'm favoring the simplest approach ATM, ie. the first one. Other ideas?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Xenomai-core] [RFC] Fixes for domain migration races

2011-07-18 Thread Jan Kiszka
Hi,

I've just uploaded my upstream queue that mostly deals with the various
races I found in the domain migration code.

One of my concerns raised earlier turned out to be for no reason: We do
not allow Linux to wake up a task that has TASK_ATOMICSWITCH set. So the
deletion race can indeed be fixed by the patch I sent earlier. However,
we do not synchronize setting and testing of TASK_ATOMICSWITCH (because
we cannot hold the rq lock), thus we still face a small race window that
allows premature wakeups, at least in theory. That's now addressed by
patch 3.

Besides another race around set/clear_task_nowakeup, there should have
been a window during early migration to RT where we silently swallowed
Linux signals. Closed by patch 4, hopefully also fixing our spurious gdb
lockups on SMP boxes - time will tell.

Please review carefully.

Jan





[Xenomai-core] [RFC] Waitqueue-free gatekeeper wakeup

2011-07-18 Thread Jan Kiszka
Hi Philippe,

trying to decouple the PREEMPT-RT gatekeeper wakeup path from XNATOMIC
(to fix the remaining races there), I wondered why we need a waitqueue
here at all.

What about an approach like below, i.e. waking up the gatekeeper
directly via wake_up_process? That could even be called from interrupt
context. We should be able to avoid missing a wakeup by setting the task
state to INTERRUPTIBLE before signaling the semaphore.

Am I missing something?

Jan


diff --git a/include/nucleus/sched.h b/include/nucleus/sched.h
index e251329..df8853b 100644
--- a/include/nucleus/sched.h
+++ b/include/nucleus/sched.h
@@ -111,7 +111,6 @@ typedef struct xnsched {
 
 #ifdef CONFIG_XENO_OPT_PERVASIVE
struct task_struct *gatekeeper;
-   wait_queue_head_t gkwaitq;
struct semaphore gksync;
struct xnthread *gktarget;
 #endif
diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
index f6b1e16..238317a 100644
--- a/ksrc/nucleus/shadow.c
+++ b/ksrc/nucleus/shadow.c
@@ -92,7 +92,6 @@ static struct __lostagerq {
 #define LO_SIGGRP_REQ 2
 #define LO_SIGTHR_REQ 3
 #define LO_UNMAP_REQ  4
-#define LO_GKWAKE_REQ 5
int type;
struct task_struct *task;
int arg;
@@ -759,9 +758,6 @@ static void lostage_handler(void *cookie)
int cpu, reqnum, type, arg, sig, sigarg;
struct __lostagerq *rq;
struct task_struct *p;
-#ifdef CONFIG_PREEMPT_RT
-   struct xnsched *sched;
-#endif
 
cpu = smp_processor_id();
rq = &lostagerq[cpu];
@@ -819,13 +815,6 @@ static void lostage_handler(void *cookie)
case LO_SIGGRP_REQ:
kill_proc(p->pid, arg, 1);
break;
-
-#ifdef CONFIG_PREEMPT_RT
-   case LO_GKWAKE_REQ:
-   sched = xnpod_sched_slot(cpu);
-   wake_up_interruptible_sync(&sched->gkwaitq);
-   break;
-#endif
}
}
 }
@@ -873,7 +862,6 @@ static inline int normalize_priority(int prio)
 static int gatekeeper_thread(void *data)
 {
struct task_struct *this_task = current;
-   DECLARE_WAITQUEUE(wait, this_task);
int cpu = (long)data;
struct xnsched *sched = xnpod_sched_slot(cpu);
struct xnthread *target;
@@ -886,12 +874,10 @@ static int gatekeeper_thread(void *data)
set_cpus_allowed(this_task, cpumask);
set_linux_task_priority(this_task, MAX_RT_PRIO - 1);
 
-   init_waitqueue_head(&sched->gkwaitq);
-   add_wait_queue_exclusive(&sched->gkwaitq, &wait);
+   set_current_state(TASK_INTERRUPTIBLE);
up(&sched->gksync); /* Sync with xnshadow_mount(). */
 
for (;;) {
-   set_current_state(TASK_INTERRUPTIBLE);
up(&sched->gksync); /* Make the request token available. */
schedule();
 
@@ -937,6 +923,7 @@ static int gatekeeper_thread(void *data)
xnlock_put_irqrestore(&nklock, s);
xnpod_schedule();
}
+   set_current_state(TASK_INTERRUPTIBLE);
}
 
return 0;
@@ -1014,23 +1001,9 @@ redo:
thread->gksched = sched;
xnthread_set_info(thread, XNATOMIC);
set_current_state(TASK_INTERRUPTIBLE | TASK_ATOMICSWITCH);
-#ifndef CONFIG_PREEMPT_RT
-   /*
-* We may not hold the preemption lock across calls to
-* wake_up_*() services over fully preemptible kernels, since
-* tasks might sleep when contending for spinlocks. The wake
-* up call for the gatekeeper will happen later, over an APC
-* we kick in do_schedule_event() on the way out for the
-* hardening task.
-*
-* We could delay the wake up call over non-RT 2.6 kernels as
-* well, but not when running over 2.4 (scheduler innards
-* would not allow this, causing weirdnesses when hardening
-* tasks). So we always do the early wake up when running
-* non-RT, which includes 2.4.
-*/
-   wake_up_interruptible_sync(&sched->gkwaitq);
-#endif
+
+   wake_up_process(sched->gatekeeper);
+
schedule();
 
/*

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-16 Thread Jan Kiszka
On 2011-07-16 11:56, Philippe Gerum wrote:
> On Sat, 2011-07-16 at 11:15 +0200, Jan Kiszka wrote:
>> On 2011-07-16 10:52, Philippe Gerum wrote:
>>> On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote:
>>>> On 2011-07-15 15:10, Jan Kiszka wrote:
>>>>> But... right now it looks like we found our primary regression:
>>>>> "nucleus/shadow: shorten the uninterruptible path to secondary mode".
>>>>> It opens a short window during relax where the migrated task may be
>>>>> active under both schedulers. We are currently evaluating a revert
>>>>> (looks good so far), and I need to work out my theory in more
>>>>> detail.
>>>>
>>>> Looks like this commit just made a long-standing flaw in Xenomai's
>>>> interrupt handling more visible: We reschedule over the interrupt stack
>>>> in the Xenomai interrupt handler tails, at least on x86-64. Not sure if
>>>> other archs have interrupt stacks, the point is Xenomai's design wrongly
>>>> assumes there are no such things.
>>>
>>> Fortunately, no, this is not a design issue, no such assumption was ever
>>> made, but the Xenomai core expects this to be handled on a per-arch
>>> basis with the interrupt pipeline.
>>
>> And that's already the problem: If Linux uses interrupt stacks, relying 
>> on ipipe to disable this during Xenomai interrupt handler execution is 
>> at best a workaround. A fragile one unless you increase the per-thread
>> stack size by the size of the interrupt stack. Lacking support for a 
>> generic rescheduling hook became a problem by the time Linux introduced 
>> interrupt threads.
> 
> Don't assume too much. What was done for ppc64 was not meant as a
> general policy. Again, this is a per-arch decision.

Actually, it was the right decision, not only for ppc64: Reusing Linux
interrupt stacks for Xenomai does not work. If we interrupt Linux while
it is already running over the interrupt stack, the stack becomes taboo
on that CPU. From that point on, no RT IRQ must run over the Linux
interrupt stack as it would smash it.

But then the question is why we should try to use the interrupt stacks
for Xenomai at all. It's better to increase the task kernel stacks and
disable interrupt stacks when ipipe is enabled. That's what I'm heading
for with x86-64 now (THREAD_ORDER 2, no stack switching).
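
Schematically, the stack-size half of that could look like the
following (names per the x86-64 kernel headers of that era; a sketch,
not the actual change):

/* Sketch only: give x86-64 16k task stacks (order 2) when the I-pipe
 * is enabled, so IRQ nesting fits on the task stack once the IRQ-stack
 * switch is disabled. */
#ifdef CONFIG_IPIPE
#define THREAD_ORDER	2
#else
#define THREAD_ORDER	1
#endif
#define THREAD_SIZE	(PAGE_SIZE << THREAD_ORDER)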

What we may do is introducing per-domain interrupt stacks. But that's at
best Xenomai 3 / I-pipe 3 stuff.

Jan





Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-16 Thread Jan Kiszka

On 2011-07-16 10:52, Philippe Gerum wrote:
> On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote:
>> On 2011-07-15 15:10, Jan Kiszka wrote:
>>> But... right now it looks like we found our primary regression:
>>> "nucleus/shadow: shorten the uninterruptible path to secondary mode".
>>> It opens a short window during relax where the migrated task may be
>>> active under both schedulers. We are currently evaluating a revert
>>> (looks good so far), and I need to work out my theory in more
>>> detail.
>>
>> Looks like this commit just made a long-standing flaw in Xenomai's
>> interrupt handling more visible: We reschedule over the interrupt stack
>> in the Xenomai interrupt handler tails, at least on x86-64. Not sure if
>> other archs have interrupt stacks, the point is Xenomai's design wrongly
>> assumes there are no such things.
>
> Fortunately, no, this is not a design issue, no such assumption was ever
> made, but the Xenomai core expects this to be handled on a per-arch
> basis with the interrupt pipeline.

And that's already the problem: If Linux uses interrupt stacks, relying
on ipipe to disable this during Xenomai interrupt handler execution is
at best a workaround. A fragile one unless you increase the per-thread
stack size by the size of the interrupt stack. Lacking support for a
generic rescheduling hook became a problem by the time Linux introduced
interrupt threads.

> As you pointed out, there is no way
> to handle this via some generic Xenomai-only support.
>
> ppc64 now has separate interrupt stacks, which is why I disabled
> IRQSTACKS which became the builtin default at some point. Blackfin goes
> through a Xenomai-defined irq tail handler as well, because it may not
> reschedule over nested interrupt stacks.

How does this arch prevent that xnpod_schedule in the generic interrupt
handler tail does its normal work?

> Fact is that such pending
> problem with x86_64 was overlooked since day #1 by /me.
>
>> We were lucky so far that the values
>> saved on this shared stack were apparently "compatible", means we were
>> overwriting them with identical or harmless values. But that's no longer
>> true when interrupts are hitting us in the xnpod_suspend_thread path of
>> a relaxing shadow.
>
> Makes sense. It would be better to find a solution that does not make
> the relax path uninterruptible again for a significant amount of time.
> On low end platforms we support (i.e. non-x86* mainly), this causes
> obvious latency spots.

I agree. Conceptually, the interruptible relaxation should be safe now
after recent fixes.

>> Likely the only possible fix is establishing a reschedule hook for
>> Xenomai in the interrupt exit path after the original stack is restored
>> - just like Linux works. Requires changes to both ipipe and Xenomai
>> unfortunately.
>
> __ipipe_run_irqtail() is in the I-pipe core for such purpose. If
> instantiated properly for x86_64, and paired with xnarch_escalate() for
> that arch as well, it could be an option for running the rescheduling
> procedure when safe.

Nope, that doesn't work. The stack is switched later in the return path
in entry_64.S. We need a hook there, ideally a conditional one,
controlled by some per-cpu variable that is set by Xenomai on return
from its interrupt handlers to signal the rescheduling need.


Jan



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-16 Thread Jan Kiszka

On 2011-07-15 15:10, Jan Kiszka wrote:
> But... right now it looks like we found our primary regression:
> "nucleus/shadow: shorten the uninterruptible path to secondary mode".
> It opens a short window during relax where the migrated task may be
> active under both schedulers. We are currently evaluating a revert
> (looks good so far), and I need to work out my theory in more
> detail.

Looks like this commit just made a long-standing flaw in Xenomai's
interrupt handling more visible: We reschedule over the interrupt stack
in the Xenomai interrupt handler tails, at least on x86-64. Not sure if
other archs have interrupt stacks, the point is Xenomai's design wrongly
assumes there are no such things. We were lucky so far that the values
saved on this shared stack were apparently "compatible", means we were
overwriting them with identical or harmless values. But that's no longer
true when interrupts are hitting us in the xnpod_suspend_thread path of
a relaxing shadow.

Likely the only possible fix is establishing a reschedule hook for
Xenomai in the interrupt exit path after the original stack is restored
- just like Linux works. Requires changes to both ipipe and Xenomai
unfortunately.

Jan



Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-15 Thread Jan Kiszka
On 2011-07-15 14:30, Gilles Chanteperdrix wrote:
> On 07/14/2011 10:57 PM, Jan Kiszka wrote:
>> On 2011-07-13 21:12, Gilles Chanteperdrix wrote:
>>> On 07/13/2011 09:04 PM, Jan Kiszka wrote:
>>>> On 2011-07-13 20:39, Gilles Chanteperdrix wrote:
>>>>> On 07/12/2011 07:43 PM, Jan Kiszka wrote:
>>>>>> On 2011-07-12 19:38, Gilles Chanteperdrix wrote:
>>>>>>> On 07/12/2011 07:34 PM, Jan Kiszka wrote:
>>>>>>>> On 2011-07-12 19:31, Gilles Chanteperdrix wrote:
>>>>>>>>> On 07/12/2011 02:57 PM, Jan Kiszka wrote:
>>>>>>>>>>  xnlock_put_irqrestore(&nklock, s);
>>>>>>>>>>  xnpod_schedule();
>>>>>>>>>>  }
>>>>>>>>>> @@ -1036,6 +1043,7 @@ redo:
>>>>>>>>>>   * to process this signal anyway.
>>>>>>>>>>   */
>>>>>>>>>>  if (rthal_current_domain == rthal_root_domain) {
>>>>>>>>>> +XENO_BUGON(NUCLEUS, xnthread_test_info(thread, 
>>>>>>>>>> XNATOMIC));
>>>>>>>>>
>>>>>>>>> Misleading dead code again, XNATOMIC is cleared not ten lines above.
>>>>>>>>
>>>>>>>> Nope, I forgot to remove that line.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>  if (XENO_DEBUG(NUCLEUS) && (!signal_pending(this_task)
>>>>>>>>>>  || this_task->state != TASK_RUNNING))
>>>>>>>>>>  xnpod_fatal
>>>>>>>>>> @@ -1044,6 +1052,8 @@ redo:
>>>>>>>>>>  return -ERESTARTSYS;
>>>>>>>>>>  }
>>>>>>>>>>  
>>>>>>>>>> +xnthread_clear_info(thread, XNATOMIC);
>>>>>>>>>
>>>>>>>>> Why this? I find the xnthread_clear_info(XNATOMIC) right at the right
>>>>>>>>> place at the point it currently is.
>>>>>>>>
>>>>>>>> Nope. Now we either clear XNATOMIC after successful migration or when
>>>>>>>> the signal is about to be sent (ie. in the hook). That way we can test
>>>>>>>> more reliably (TM) in the gatekeeper if the thread can be migrated.
>>>>>>>
>>>>>>> Ok for adding the XNATOMIC test, because it improves the robustness, but
>>>>>>> why changing the way XNATOMIC is set and clear? Chances of breaking
>>>>>>> thing while changing code in this area are really high...
>>>>>>
>>>>>> The current code is (most probably) broken as it does not properly
>>>>>> synchronize the gatekeeper against a signaled and "runaway" target
>>>>>> Linux task.
>>>>>>
>>>>>> We need an indication if a Linux signal will (or already has) woken up
>>>>>> the to-be-migrated task. That task may have continued over its context,
>>>>>> potentially on a different CPU. Providing this indication is the purpose
>>>>>> of changing where XNATOMIC is cleared.
>>>>>
>>>>> What about synchronizing with the gatekeeper with a semaphore, as done
>>>>> in the first patch you sent, but doing it in xnshadow_harden, as soon as
>>>>> we detect that we are not back from schedule in primary mode? It seems
>>>>> it would avoid any further issue, as we would then be guaranteed that
>>>>> the thread could not switch to TASK_INTERRUPTIBLE again before the
>>>>> gatekeeper is finished.
>>>>
>>>> The problem is that the gatekeeper tests the task state without holding
>>>> the task's rq lock (which is not available to us without a kernel
>>>> patch). That cannot work reliably as long as we accept signals. That's
>>>> why I'm trying to move state change and test under nklock.
>>>>
>>>>>
>>>>> What worries me is the comment in xnshadow_harden:
>>>>>
>>>>>* gatekeeper sent us to primary mode. Since
>>>>>* TASK_UNINTERRUPTIBLE is unavailable to us without wrecking
>>>>>* the runqueue's count of uninterruptible tasks, we just
>>>>>* notice the issue and gracefully fail; the caller will have
>>>>>* to process this signal anyway.
>>>>>*/

Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-14 Thread Jan Kiszka
On 2011-07-13 21:12, Gilles Chanteperdrix wrote:
> On 07/13/2011 09:04 PM, Jan Kiszka wrote:
>> On 2011-07-13 20:39, Gilles Chanteperdrix wrote:
>>> On 07/12/2011 07:43 PM, Jan Kiszka wrote:
>>>> On 2011-07-12 19:38, Gilles Chanteperdrix wrote:
>>>>> On 07/12/2011 07:34 PM, Jan Kiszka wrote:
>>>>>> On 2011-07-12 19:31, Gilles Chanteperdrix wrote:
>>>>>>> On 07/12/2011 02:57 PM, Jan Kiszka wrote:
>>>>>>>>xnlock_put_irqrestore(&nklock, s);
>>>>>>>>xnpod_schedule();
>>>>>>>>}
>>>>>>>> @@ -1036,6 +1043,7 @@ redo:
>>>>>>>> * to process this signal anyway.
>>>>>>>> */
>>>>>>>>if (rthal_current_domain == rthal_root_domain) {
>>>>>>>> +  XENO_BUGON(NUCLEUS, xnthread_test_info(thread, 
>>>>>>>> XNATOMIC));
>>>>>>>
>>>>>>> Misleading dead code again, XNATOMIC is cleared not ten lines above.
>>>>>>
>>>>>> Nope, I forgot to remove that line.
>>>>>>
>>>>>>>
>>>>>>>>if (XENO_DEBUG(NUCLEUS) && (!signal_pending(this_task)
>>>>>>>>|| this_task->state != TASK_RUNNING))
>>>>>>>>xnpod_fatal
>>>>>>>> @@ -1044,6 +1052,8 @@ redo:
>>>>>>>>return -ERESTARTSYS;
>>>>>>>>}
>>>>>>>>  
>>>>>>>> +  xnthread_clear_info(thread, XNATOMIC);
>>>>>>>
>>>>>>> Why this? I find the xnthread_clear_info(XNATOMIC) right at the right
>>>>>>> place at the point it currently is.
>>>>>>
>>>>>> Nope. Now we either clear XNATOMIC after successful migration or when
>>>>>> the signal is about to be sent (ie. in the hook). That way we can test
>>>>>> more reliably (TM) in the gatekeeper if the thread can be migrated.
>>>>>
>>>>> Ok for adding the XNATOMIC test, because it improves the robustness, but
>>>>> why changing the way XNATOMIC is set and clear? Chances of breaking
>>>>> thing while changing code in this area are really high...
>>>>
>>>> The current code is (most probably) broken as it does not properly
>>>> synchronize the gatekeeper against a signaled and "runaway" target
>>>> Linux task.
>>>>
>>>> We need an indication if a Linux signal will (or already has) woken up
>>>> the to-be-migrated task. That task may have continued over its context,
>>>> potentially on a different CPU. Providing this indication is the purpose
>>>> of changing where XNATOMIC is cleared.
>>>
>>> What about synchronizing with the gatekeeper with a semaphore, as done
>>> in the first patch you sent, but doing it in xnshadow_harden, as soon as
>>> we detect that we are not back from schedule in primary mode? It seems
>>> it would avoid any further issue, as we would then be guaranteed that
>>> the thread could not switch to TASK_INTERRUPTIBLE again before the
>>> gatekeeper is finished.
>>
>> The problem is that the gatekeeper tests the task state without holding
>> the task's rq lock (which is not available to us without a kernel
>> patch). That cannot work reliably as long as we accept signals. That's
>> why I'm trying to move state change and test under nklock.
>>
>>>
>>> What worries me is the comment in xnshadow_harden:
>>>
>>>  * gatekeeper sent us to primary mode. Since
>>>  * TASK_UNINTERRUPTIBLE is unavailable to us without wrecking
>>>  * the runqueue's count of uninterruptible tasks, we just
>>>  * notice the issue and gracefully fail; the caller will have
>>>  * to process this signal anyway.
>>>  */
>>>
>>> Does this mean that we can not switch to TASK_UNINTERRUPTIBLE at this
>>> point? Or simply that TASK_UNINTERRUPTIBLE is not available for the
>>> business of xnshadow_harden?
>>>
>>
>> TASK_UNINTERRUPTIBLE is not available without patching the kernel's
>> scheduler for the reason mentioned in the comment (the scheduler becomes
>> confused and may pick the wrong tasks, IIRC).

Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-13 Thread Jan Kiszka
On 2011-07-13 20:39, Gilles Chanteperdrix wrote:
> On 07/12/2011 07:43 PM, Jan Kiszka wrote:
>> On 2011-07-12 19:38, Gilles Chanteperdrix wrote:
>>> On 07/12/2011 07:34 PM, Jan Kiszka wrote:
>>>> On 2011-07-12 19:31, Gilles Chanteperdrix wrote:
>>>>> On 07/12/2011 02:57 PM, Jan Kiszka wrote:
>>>>>>  xnlock_put_irqrestore(&nklock, s);
>>>>>>  xnpod_schedule();
>>>>>>  }
>>>>>> @@ -1036,6 +1043,7 @@ redo:
>>>>>>   * to process this signal anyway.
>>>>>>   */
>>>>>>  if (rthal_current_domain == rthal_root_domain) {
>>>>>> +XENO_BUGON(NUCLEUS, xnthread_test_info(thread, 
>>>>>> XNATOMIC));
>>>>>
>>>>> Misleading dead code again, XNATOMIC is cleared not ten lines above.
>>>>
>>>> Nope, I forgot to remove that line.
>>>>
>>>>>
>>>>>>  if (XENO_DEBUG(NUCLEUS) && (!signal_pending(this_task)
>>>>>>  || this_task->state != TASK_RUNNING))
>>>>>>  xnpod_fatal
>>>>>> @@ -1044,6 +1052,8 @@ redo:
>>>>>>  return -ERESTARTSYS;
>>>>>>  }
>>>>>>  
>>>>>> +xnthread_clear_info(thread, XNATOMIC);
>>>>>
>>>>> Why this? I find the xnthread_clear_info(XNATOMIC) right at the right
>>>>> place at the point it currently is.
>>>>
>>>> Nope. Now we either clear XNATOMIC after successful migration or when
>>>> the signal is about to be sent (ie. in the hook). That way we can test
>>>> more reliably (TM) in the gatekeeper if the thread can be migrated.
>>>
>>> Ok for adding the XNATOMIC test, because it improves the robustness, but
>>> why changing the way XNATOMIC is set and clear? Chances of breaking
>>> thing while changing code in this area are really high...
>>
>> The current code is (most probably) broken as it does not properly
>> synchronize the gatekeeper against a signaled and "runaway" target
>> Linux task.
>>
>> We need an indication if a Linux signal will (or already has) woken up
>> the to-be-migrated task. That task may have continued over its context,
>> potentially on a different CPU. Providing this indication is the purpose
>> of changing where XNATOMIC is cleared.
> 
> What about synchronizing with the gatekeeper with a semaphore, as done
> in the first patch you sent, but doing it in xnshadow_harden, as soon as
> we detect that we are not back from schedule in primary mode? It seems
> it would avoid any further issue, as we would then be guaranteed that
> the thread could not switch to TASK_INTERRUPTIBLE again before the
> gatekeeper is finished.

The problem is that the gatekeeper tests the task state without holding
the task's rq lock (which is not available to us without a kernel
patch). That cannot work reliably as long as we accept signals. That's
why I'm trying to move state change and test under nklock.

> 
> What worries me is the comment in xnshadow_harden:
> 
>* gatekeeper sent us to primary mode. Since
>* TASK_UNINTERRUPTIBLE is unavailable to us without wrecking
>* the runqueue's count of uninterruptible tasks, we just
>* notice the issue and gracefully fail; the caller will have
>* to process this signal anyway.
>*/
> 
> Does this mean that we can not switch to TASK_UNINTERRUPTIBLE at this
> point? Or simply that TASK_UNINTERRUPTIBLE is not available for the
> business of xnshadow_harden?
> 

TASK_UNINTERRUPTIBLE is not available without patching the kernel's
scheduler for the reason mentioned in the comment (the scheduler becomes
confused and may pick the wrong tasks, IIRC).

But I would refrain from trying to "improve" the gatekeeper design. I've
recently mentioned this to Philippe offlist: For Xenomai 3 with some
ipipe v3, we must rather patch schedule() to enable zero-switch domain
migration. Means: enter the scheduler, let it suspend current and pick
another task, but then simply escalate to the RT domain before doing any
context switch. That's much cheaper than the current design and
hopefully also less error-prone.

Jan





Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 19:38, Gilles Chanteperdrix wrote:
> On 07/12/2011 07:34 PM, Jan Kiszka wrote:
>> On 2011-07-12 19:31, Gilles Chanteperdrix wrote:
>>> On 07/12/2011 02:57 PM, Jan Kiszka wrote:
>>>>xnlock_put_irqrestore(&nklock, s);
>>>>xnpod_schedule();
>>>>}
>>>> @@ -1036,6 +1043,7 @@ redo:
>>>> * to process this signal anyway.
>>>> */
>>>>if (rthal_current_domain == rthal_root_domain) {
>>>> +  XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC));
>>>
>>> Misleading dead code again, XNATOMIC is cleared not ten lines above.
>>
>> Nope, I forgot to remove that line.
>>
>>>
>>>>if (XENO_DEBUG(NUCLEUS) && (!signal_pending(this_task)
>>>>|| this_task->state != TASK_RUNNING))
>>>>xnpod_fatal
>>>> @@ -1044,6 +1052,8 @@ redo:
>>>>return -ERESTARTSYS;
>>>>}
>>>>  
>>>> +  xnthread_clear_info(thread, XNATOMIC);
>>>
>>> Why this? I find the xnthread_clear_info(XNATOMIC) right at the right
>>> place at the point it currently is.
>>
>> Nope. Now we either clear XNATOMIC after successful migration or when
>> the signal is about to be sent (ie. in the hook). That way we can test
>> more reliably (TM) in the gatekeeper if the thread can be migrated.
> 
> Ok for adding the XNATOMIC test, because it improves the robustness, but
> why changing the way XNATOMIC is set and clear? Chances of breaking
> thing while changing code in this area are really high...

The current code is (most probably) broken as it does not properly
synchronize the gatekeeper against a signaled and "runaway" target
Linux task.

We need an indication if a Linux signal will (or already has) woken up
the to-be-migrated task. That task may have continued over its context,
potentially on a different CPU. Providing this indication is the purpose
of changing where XNATOMIC is cleared.

Jan





Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 19:31, Gilles Chanteperdrix wrote:
> On 07/12/2011 02:57 PM, Jan Kiszka wrote:
>>  xnlock_put_irqrestore(&nklock, s);
>>  xnpod_schedule();
>>  }
>> @@ -1036,6 +1043,7 @@ redo:
>>   * to process this signal anyway.
>>   */
>>  if (rthal_current_domain == rthal_root_domain) {
>> +XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC));
> 
> Misleading dead code again, XNATOMIC is cleared not ten lines above.

Nope, I forgot to remove that line.

> 
>>  if (XENO_DEBUG(NUCLEUS) && (!signal_pending(this_task)
>>  || this_task->state != TASK_RUNNING))
>>  xnpod_fatal
>> @@ -1044,6 +1052,8 @@ redo:
>>  return -ERESTARTSYS;
>>  }
>>  
>> +xnthread_clear_info(thread, XNATOMIC);
> 
> Why this? I find the xnthread_clear_info(XNATOMIC) right at the right
> place at the point it currently is.

Nope. Now we either clear XNATOMIC after successful migration or when
the signal is about to be sent (ie. in the hook). That way we can test
more reliably (TM) in the gatekeeper if the thread can be migrated.

Jan





Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 17:48, Philippe Gerum wrote:
> On Tue, 2011-07-12 at 14:57 +0200, Jan Kiszka wrote:
>> On 2011-07-12 14:13, Jan Kiszka wrote:
>>> On 2011-07-12 14:06, Gilles Chanteperdrix wrote:
>>>> On 07/12/2011 01:58 PM, Jan Kiszka wrote:
>>>>> On 2011-07-12 13:56, Jan Kiszka wrote:
>>>>>> However, this parallel unsynchronized execution of the gatekeeper and
>>>>>> its target thread leaves an increasingly bad feeling on my side. Did we
>>>>>> really catch all corner cases now? I wouldn't guarantee that yet.
>>>>>> Specifically as I still have an obscure crash of a Xenomai thread on
>>>>>> Linux schedule() on my table.
>>>>>>
>>>>>> What if the target thread woke up due to a signal, continued much
>>>>>> further on a different CPU, blocked in TASK_INTERRUPTIBLE, and then the
>>>>>> gatekeeper continued? I wish we could already eliminate this complexity
>>>>>> and do the migration directly inside schedule()...
>>>>>
>>>>> BTW, why do we mask out TASK_ATOMICSWITCH when checking the task state in
>>>>> the gatekeeper? What would happen if we included it (state ==
>>>>> (TASK_ATOMICSWITCH | TASK_INTERRUPTIBLE))?
>>>>
>>>> I would tend to think that what we should check is
>>>> xnthread_test_info(XNATOMIC). Or maybe check both, the interruptible
>>>> state and the XNATOMIC info bit.
>>>
>>> Actually, neither the info bits nor the task state is sufficiently
>>> synchronized against the gatekeeper yet. We need to hold a shared lock
>>> when testing and resetting the state. I'm not sure yet if that is
>>> fixable given the gatekeeper architecture.
>>>
>>
>> This may work (on top of the exit-race fix):
>>
>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>> index 50dcf43..90feb16 100644
>> --- a/ksrc/nucleus/shadow.c
>> +++ b/ksrc/nucleus/shadow.c
>> @@ -913,20 +913,27 @@ static int gatekeeper_thread(void *data)
>>  if ((xnthread_user_task(target)->state & ~TASK_ATOMICSWITCH) == 
>> TASK_INTERRUPTIBLE) {
>>  rpi_pop(target);
>>  xnlock_get_irqsave(&nklock, s);
>> -#ifdef CONFIG_SMP
>> +
>>  /*
>> - * If the task changed its CPU while in
>> - * secondary mode, change the CPU of the
>> - * underlying Xenomai shadow too. We do not
>> - * migrate the thread timers here, it would
>> - * not work. For a "full" migration comprising
>> - * timers, using xnpod_migrate_thread is
>> - * required.
>> + * Recheck XNATOMIC to avoid waking the shadow if the
>> + * Linux task received a signal meanwhile.
>>   */
>> -if (target->sched != sched)
>> -xnsched_migrate_passive(target, sched);
>> +if (xnthread_test_info(target, XNATOMIC)) {
>> +#ifdef CONFIG_SMP
>> +/*
>> + * If the task changed its CPU while in
>> + * secondary mode, change the CPU of the
>> + * underlying Xenomai shadow too. We do not
>> + * migrate the thread timers here, it would
>> + * not work. For a "full" migration comprising
>> + * timers, using xnpod_migrate_thread is
>> + * required.
>> + */
>> +if (target->sched != sched)
>> +xnsched_migrate_passive(target, sched);
>>  #endif /* CONFIG_SMP */
>> -xnpod_resume_thread(target, XNRELAX);
>> +xnpod_resume_thread(target, XNRELAX);
>> +}
>>  xnlock_put_irqrestore(&nklock, s);
>>  xnpod_schedule();
>>  }
>> @@ -1036,6 +1043,7 @@ redo:
>>   * to process this signal anyway.
>>   */
>>  if (rthal_current_domain == rthal_root_domain) {
>> +XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC));
>>  if (XENO_DEBUG(NUCLEUS) && (!signal_pending(this_task)
>>

Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 14:13, Jan Kiszka wrote:
> On 2011-07-12 14:06, Gilles Chanteperdrix wrote:
>> On 07/12/2011 01:58 PM, Jan Kiszka wrote:
>>> On 2011-07-12 13:56, Jan Kiszka wrote:
>>>> However, this parallel unsynchronized execution of the gatekeeper and
>>>> its target thread leaves an increasingly bad feeling on my side. Did we
>>>> really catch all corner cases now? I wouldn't guarantee that yet.
>>>> Specifically as I still have an obscure crash of a Xenomai thread on
>>>> Linux schedule() on my table.
>>>>
>>>> What if the target thread woke up due to a signal, continued much
>>>> further on a different CPU, blocked in TASK_INTERRUPTIBLE, and then the
>>>> gatekeeper continued? I wish we could already eliminate this complexity
>>>> and do the migration directly inside schedule()...
>>>
>>> BTW, why do we mask out TASK_ATOMICSWITCH when checking the task state in
>>> the gatekeeper? What would happen if we included it (state ==
>>> (TASK_ATOMICSWITCH | TASK_INTERRUPTIBLE))?
>>
>> I would tend to think that what we should check is
>> xnthread_test_info(XNATOMIC). Or maybe check both, the interruptible
>> state and the XNATOMIC info bit.
> 
> Actually, neither the info bits nor the task state is sufficiently
> synchronized against the gatekeeper yet. We need to hold a shared lock
> when testing and resetting the state. I'm not sure yet if that is
> fixable given the gatekeeper architecture.
> 

This may work (on top of the exit-race fix):

diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
index 50dcf43..90feb16 100644
--- a/ksrc/nucleus/shadow.c
+++ b/ksrc/nucleus/shadow.c
@@ -913,20 +913,27 @@ static int gatekeeper_thread(void *data)
if ((xnthread_user_task(target)->state & ~TASK_ATOMICSWITCH) == 
TASK_INTERRUPTIBLE) {
rpi_pop(target);
xnlock_get_irqsave(&nklock, s);
-#ifdef CONFIG_SMP
+
/*
-* If the task changed its CPU while in
-* secondary mode, change the CPU of the
-* underlying Xenomai shadow too. We do not
-* migrate the thread timers here, it would
-* not work. For a "full" migration comprising
-* timers, using xnpod_migrate_thread is
-* required.
+* Recheck XNATOMIC to avoid waking the shadow if the
+* Linux task received a signal meanwhile.
 */
-   if (target->sched != sched)
-   xnsched_migrate_passive(target, sched);
+   if (xnthread_test_info(target, XNATOMIC)) {
+#ifdef CONFIG_SMP
+   /*
+* If the task changed its CPU while in
+* secondary mode, change the CPU of the
+* underlying Xenomai shadow too. We do not
+* migrate the thread timers here, it would
+* not work. For a "full" migration comprising
+* timers, using xnpod_migrate_thread is
+* required.
+*/
+   if (target->sched != sched)
+   xnsched_migrate_passive(target, sched);
 #endif /* CONFIG_SMP */
-   xnpod_resume_thread(target, XNRELAX);
+   xnpod_resume_thread(target, XNRELAX);
+   }
xnlock_put_irqrestore(&nklock, s);
xnpod_schedule();
}
@@ -1036,6 +1043,7 @@ redo:
 * to process this signal anyway.
 */
if (rthal_current_domain == rthal_root_domain) {
+   XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC));
if (XENO_DEBUG(NUCLEUS) && (!signal_pending(this_task)
|| this_task->state != TASK_RUNNING))
xnpod_fatal
@@ -1044,6 +1052,8 @@ redo:
return -ERESTARTSYS;
}
 
+   xnthread_clear_info(thread, XNATOMIC);
+
/* "current" is now running into the Xenomai domain. */
thread->gksched = NULL;
sched = xnsched_finish_unlocked_switch(thread->sched);
@@ -2650,6 +2660,8 @@ static inline void do_sigwake_event(struct task_struct *p)
 
xnlock_get_irqsave(&nklock, s);
 
+   xnthread_clear_info(thread, XNATOMIC);
+
if ((p->ptrace & PT_PTRACED) && !xnthread_test_state(thread, XNDEBUG)) {
   

Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 14:06, Gilles Chanteperdrix wrote:
> On 07/12/2011 01:58 PM, Jan Kiszka wrote:
>> On 2011-07-12 13:56, Jan Kiszka wrote:
>>> However, this parallel unsynchronized execution of the gatekeeper and
>>> its target thread leaves an increasingly bad feeling on my side. Did we
>>> really catch all corner cases now? I wouldn't guarantee that yet.
>>> Specifically as I still have an obscure crash of a Xenomai thread on
>>> Linux schedule() on my table.
>>>
>>> What if the target thread woke up due to a signal, continued much
>>> further on a different CPU, blocked in TASK_INTERRUPTIBLE, and then the
>>> gatekeeper continued? I wish we could already eliminate this complexity
>>> and do the migration directly inside schedule()...
>>
>> BTW, why do we mask out TASK_ATOMICSWITCH when checking the task state in
>> the gatekeeper? What would happen if we included it (state ==
>> (TASK_ATOMICSWITCH | TASK_INTERRUPTIBLE))?
> 
> I would tend to think that what we should check is
> xnthread_test_info(XNATOMIC). Or maybe check both, the interruptible
> state and the XNATOMIC info bit.

Actually, neither the info bits nor the task state is sufficiently
synchronized against the gatekeeper yet. We need to hold a shared lock
when testing and resetting the state. I'm not sure yet if that is
fixable given the gatekeeper architecture.

Jan





Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 13:56, Jan Kiszka wrote:
> However, this parallel unsynchronized execution of the gatekeeper and
> its target thread leaves an increasingly bad feeling on my side. Did we
> really catch all corner cases now? I wouldn't guarantee that yet.
> Specifically as I still have an obscure crash of a Xenomai thread on
> Linux schedule() on my table.
> 
> What if the target thread woke up due to a signal, continued much
> further on a different CPU, blocked in TASK_INTERRUPTIBLE, and then the
> gatekeeper continued? I wish we could already eliminate this complexity
> and do the migration directly inside schedule()...

BTW, why do we mask out TASK_ATOMICSWITCH when checking the task state in
the gatekeeper? What would happen if we included it (state ==
(TASK_ATOMICSWITCH | TASK_INTERRUPTIBLE))?

Jan





Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 13:41, Gilles Chanteperdrix wrote:
> On 07/12/2011 01:29 PM, Jan Kiszka wrote:
>>> I find all this complicated for a very small corner-case, so I keep
>>> looking for a simpler solution. Let us try something else.
>>
>> It's rather the contrary: this solution is straightforward IMHO.
>>
>>>
>>> If the thread is woken up for whatever reason before the gatekeeper,
>>> then we will hit the following "if" in xnshadow_harden, can we set the
>>> target to NULL at this point if it is the current thread? With a cmpxchg
>>> perhaps?
>>
>> Gatekeeper and target task aren't synchronized, are they?
> 
> That is another reason why the solution to synchronize using the
> semaphore may not work. If the thread hits the call to down before the

"thread" means do_taskexit_event here?

> gatekeeper is woken up, then the gatekeeper test may see the target
> thread as suspended, and we may return from the call to "down" in
> xenomai domain.

Good point, but fortunately not possible: If we block on down() in
do_taskexit_event, we enter TASK_UNINTERRUPTIBLE state. But the
gatekeeper checks for TASK_INTERRUPTIBLE.

However, this parallel unsynchronized execution of the gatekeeper and
its target thread leaves an increasingly bad feeling on my side. Did we
really catch all corner cases now? I wouldn't guarantee that yet.
Specifically as I still have an obscure crash of a Xenomai thread on
Linux schedule() on my table.

What if the target thread woke up due to a signal, continued much
further on a different CPU, blocked in TASK_INTERRUPTIBLE, and then the
gatekeeper continued? I wish we could already eliminate this complexity
and do the migration directly inside schedule()...

Jan





Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 13:26, Gilles Chanteperdrix wrote:
> On 07/12/2011 01:10 PM, Jan Kiszka wrote:
>> On 2011-07-12 13:08, Gilles Chanteperdrix wrote:
>>> On 07/12/2011 01:06 PM, Jan Kiszka wrote:
>>>> On 2011-07-12 13:04, Gilles Chanteperdrix wrote:
>>>>> On 07/12/2011 01:00 PM, Jan Kiszka wrote:
>>>>>> On 2011-07-12 12:59, Gilles Chanteperdrix wrote:
>>>>>>> On 07/12/2011 09:22 AM, Jan Kiszka wrote:
>>>>>>>> On 2011-07-12 08:41, Gilles Chanteperdrix wrote:
>>>>>>>>> On 07/11/2011 10:12 PM, Jan Kiszka wrote:
>>>>>>>>>> On 2011-07-11 22:09, Gilles Chanteperdrix wrote:
>>>>>>>>>>> On 07/11/2011 10:06 PM, Jan Kiszka wrote:
>>>>>>>>>>>> On 2011-07-11 22:02, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>> On 07/11/2011 09:59 PM, Jan Kiszka wrote:
>>>>>>>>>>>>>> On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>>> On 07/11/2011 09:16 PM, Jan Kiszka wrote:
>>>>>>>>>>>>>>>> On 2011-07-11 21:10, Jan Kiszka wrote:
>>>>>>>>>>>>>>>>> On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>>>>>> On 07/08/2011 06:29 PM, GIT version control wrote:
>>>>>>>>>>>>>>>>>>> @@ -2528,6 +2534,22 @@ static inline void 
>>>>>>>>>>>>>>>>>>> do_taskexit_event(struct task_struct *p)
>>>>>>>>>>>>>>>>>>> magic = xnthread_get_magic(thread);
>>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>> xnlock_get_irqsave(&nklock, s);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +   gksched = thread->gksched;
>>>>>>>>>>>>>>>>>>> +   if (gksched) {
>>>>>>>>>>>>>>>>>>> +   xnlock_put_irqrestore(&nklock, s);
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Are we sure irqs are on here? Are you sure that what is 
>>>>>>>>>>>>>>>>>> needed is not an
>>>>>>>>>>>>>>>>>> xnlock_clear_irqon?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We are in the context of do_exit. Not only IRQs are on, also 
>>>>>>>>>>>>>>>>> preemption.
>>>>>>>>>>>>>>>>> And surely no nklock is held.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Furthermore, I do not understand how we
>>>>>>>>>>>>>>>>>> "synchronize" with the gatekeeper, how is the gatekeeper 
>>>>>>>>>>>>>>>>>> guaranteed to
>>>>>>>>>>>>>>>>>> wait for this assignment?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The gatekeeper holds the gksync token while it's active. We 
>>>>>>>>>>>>>>>>> request it,
>>>>>>>>>>>>>>>>> thus we wait for the gatekeeper to become idle again. While 
>>>>>>>>>>>>>>>>> it is idle,
>>>>>>>>>>>>>>>>> we reset the queued reference - but I just realized that this 
>>>>>>>>>>>>>>>>> may trample
>>>>>>>>>>>>>>>>> on other tasks' values. I need to add a check that the value 
>>>>>>>>>>>>>>>>> to be
>>>>>>>>>>>>>>>>> null'ified is actually still ours.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thinking again, that'

Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 13:08, Gilles Chanteperdrix wrote:
> On 07/12/2011 01:06 PM, Jan Kiszka wrote:
>> On 2011-07-12 13:04, Gilles Chanteperdrix wrote:
>>> On 07/12/2011 01:00 PM, Jan Kiszka wrote:
>>>> On 2011-07-12 12:59, Gilles Chanteperdrix wrote:
>>>>> On 07/12/2011 09:22 AM, Jan Kiszka wrote:
>>>>>> On 2011-07-12 08:41, Gilles Chanteperdrix wrote:
>>>>>>> On 07/11/2011 10:12 PM, Jan Kiszka wrote:
>>>>>>>> On 2011-07-11 22:09, Gilles Chanteperdrix wrote:
>>>>>>>>> On 07/11/2011 10:06 PM, Jan Kiszka wrote:
>>>>>>>>>> On 2011-07-11 22:02, Gilles Chanteperdrix wrote:
>>>>>>>>>>> On 07/11/2011 09:59 PM, Jan Kiszka wrote:
>>>>>>>>>>>> On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>> On 07/11/2011 09:16 PM, Jan Kiszka wrote:
>>>>>>>>>>>>>> On 2011-07-11 21:10, Jan Kiszka wrote:
>>>>>>>>>>>>>>> On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>>>> On 07/08/2011 06:29 PM, GIT version control wrote:
>>>>>>>>>>>>>>>>> @@ -2528,6 +2534,22 @@ static inline void 
>>>>>>>>>>>>>>>>> do_taskexit_event(struct task_struct *p)
>>>>>>>>>>>>>>>>>   magic = xnthread_get_magic(thread);
>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>   xnlock_get_irqsave(&nklock, s);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> + gksched = thread->gksched;
>>>>>>>>>>>>>>>>> + if (gksched) {
>>>>>>>>>>>>>>>>> + xnlock_put_irqrestore(&nklock, s);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Are we sure irqs are on here? Are you sure that what is needed 
>>>>>>>>>>>>>>>> is not an
>>>>>>>>>>>>>>>> xnlock_clear_irqon?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We are in the context of do_exit. Not only IRQs are on, also 
>>>>>>>>>>>>>>> preemption.
>>>>>>>>>>>>>>> And surely no nklock is held.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Furthermore, I do not understand how we
>>>>>>>>>>>>>>>> "synchronize" with the gatekeeper, how is the gatekeeper 
>>>>>>>>>>>>>>>> guaranteed to
>>>>>>>>>>>>>>>> wait for this assignment?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The gatekeeper holds the gksync token while it's active. We 
>>>>>>>>>>>>>>> request it,
>>>>>>>>>>>>>>> thus we wait for the gatekeeper to become idle again. While it 
>>>>>>>>>>>>>>> is idle,
>>>>>>>>>>>>>>> we reset the queued reference - but I just realized that this 
>>>>>>>>>>>>>>> may trample
>>>>>>>>>>>>>>> on other tasks' values. I need to add a check that the value to 
>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>> null'ified is actually still ours.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thinking again, that's actually not a problem: gktarget is only 
>>>>>>>>>>>>>> needed
>>>>>>>>>>>>>> while gksync is zero - but then we won't get hold of it anyway 
>>>>>>>>>>>>>> and,
>>>>>>>>>>>>>> thus, can't cause any damage.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Well, 

Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 13:04, Gilles Chanteperdrix wrote:
> On 07/12/2011 01:00 PM, Jan Kiszka wrote:
>> On 2011-07-12 12:59, Gilles Chanteperdrix wrote:
>>> On 07/12/2011 09:22 AM, Jan Kiszka wrote:
>>>> On 2011-07-12 08:41, Gilles Chanteperdrix wrote:
>>>>> On 07/11/2011 10:12 PM, Jan Kiszka wrote:
>>>>>> On 2011-07-11 22:09, Gilles Chanteperdrix wrote:
>>>>>>> On 07/11/2011 10:06 PM, Jan Kiszka wrote:
>>>>>>>> On 2011-07-11 22:02, Gilles Chanteperdrix wrote:
>>>>>>>>> On 07/11/2011 09:59 PM, Jan Kiszka wrote:
>>>>>>>>>> On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
>>>>>>>>>>> On 07/11/2011 09:16 PM, Jan Kiszka wrote:
>>>>>>>>>>>> On 2011-07-11 21:10, Jan Kiszka wrote:
>>>>>>>>>>>>> On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>> On 07/08/2011 06:29 PM, GIT version control wrote:
>>>>>>>>>>>>>>> @@ -2528,6 +2534,22 @@ static inline void 
>>>>>>>>>>>>>>> do_taskexit_event(struct task_struct *p)
>>>>>>>>>>>>>>> magic = xnthread_get_magic(thread);
>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>> xnlock_get_irqsave(&nklock, s);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +   gksched = thread->gksched;
>>>>>>>>>>>>>>> +   if (gksched) {
>>>>>>>>>>>>>>> +   xnlock_put_irqrestore(&nklock, s);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Are we sure irqs are on here? Are you sure that what is needed 
>>>>>>>>>>>>>> is not an
>>>>>>>>>>>>>> xnlock_clear_irqon?
>>>>>>>>>>>>>
>>>>>>>>>>>>> We are in the context of do_exit. Not only IRQs are on, also 
>>>>>>>>>>>>> preemption.
>>>>>>>>>>>>> And surely no nklock is held.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Furthermore, I do not understand how we
>>>>>>>>>>>>>> "synchronize" with the gatekeeper, how is the gatekeeper 
>>>>>>>>>>>>>> garanteed to
>>>>>>>>>>>>>> guaranteed to
>>>>>>>>>>>>>
>>>>>>>>>>>>> The gatekeeper holds the gksync token while it's active. We 
>>>>>>>>>>>>> request it,
>>>>>>>>>>>>> thus we wait for the gatekeeper to become idle again. While it is 
>>>>>>>>>>>>> idle,
>>>>>>>>>>>>> we reset the queued reference - but I just realized that this may 
>>>>>>>>>>>>> trample
>>>>>>>>>>>>> on other tasks' values. I need to add a check that the value to be
>>>>>>>>>>>>> null'ified is actually still ours.
>>>>>>>>>>>>
>>>>>>>>>>>> Thinking again, that's actually not a problem: gktarget is only 
>>>>>>>>>>>> needed
>>>>>>>>>>>> while gksync is zero - but then we won't get hold of it anyway and,
>>>>>>>>>>>> thus, can't cause any damage.
>>>>>>>>>>>
>>>>>>>>>>> Well, you make it look like it does not work. From what I 
>>>>>>>>>>> understand,
>>>>>>>>>>> what you want is to set gktarget to null if a task being hardened is
>>>>>>>>>>> destroyed. But by waiting for the semaphore, you actually wait for 
>>>>>>>>>>> the
>>>>>>>>>>> harden to be complete, so setting to NULL is useless. Or am I 
>>>>>>>>>>> missing
>>>>>>>>>>> something else?

Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 12:59, Gilles Chanteperdrix wrote:
> On 07/12/2011 09:22 AM, Jan Kiszka wrote:
>> On 2011-07-12 08:41, Gilles Chanteperdrix wrote:
>>> On 07/11/2011 10:12 PM, Jan Kiszka wrote:
>>>> On 2011-07-11 22:09, Gilles Chanteperdrix wrote:
>>>>> On 07/11/2011 10:06 PM, Jan Kiszka wrote:
>>>>>> On 2011-07-11 22:02, Gilles Chanteperdrix wrote:
>>>>>>> On 07/11/2011 09:59 PM, Jan Kiszka wrote:
>>>>>>>> On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
>>>>>>>>> On 07/11/2011 09:16 PM, Jan Kiszka wrote:
>>>>>>>>>> On 2011-07-11 21:10, Jan Kiszka wrote:
>>>>>>>>>>> On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
>>>>>>>>>>>> On 07/08/2011 06:29 PM, GIT version control wrote:
>>>>>>>>>>>>> @@ -2528,6 +2534,22 @@ static inline void 
>>>>>>>>>>>>> do_taskexit_event(struct task_struct *p)
>>>>>>>>>>>>>   magic = xnthread_get_magic(thread);
>>>>>>>>>>>>>  
>>>>>>>>>>>>>   xnlock_get_irqsave(&nklock, s);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> + gksched = thread->gksched;
>>>>>>>>>>>>> + if (gksched) {
>>>>>>>>>>>>> + xnlock_put_irqrestore(&nklock, s);
>>>>>>>>>>>>
>>>>>>>>>>>> Are we sure irqs are on here? Are you sure that what is needed is 
>>>>>>>>>>>> not an
>>>>>>>>>>>> xnlock_clear_irqon?
>>>>>>>>>>>
>>>>>>>>>>> We are in the context of do_exit. Not only IRQs are on, also 
>>>>>>>>>>> preemption.
>>>>>>>>>>> And surely no nklock is held.
>>>>>>>>>>>
>>>>>>>>>>>> Furthermore, I do not understand how we
>>>>>>>>>>>> "synchronize" with the gatekeeper, how is the gatekeeper guaranteed 
>>>>>>>>>>>> to
>>>>>>>>>>>> wait for this assignment?
>>>>>>>>>>>
>>>>>>>>>>> The gatekeeper holds the gksync token while it's active. We request 
>>>>>>>>>>> it,
>>>>>>>>>>> thus we wait for the gatekeeper to become idle again. While it is 
>>>>>>>>>>> idle,
>>>>>>>>>>> we reset the queued reference - but I just realized that this may 
>>>>>>>>>>> trample
>>>>>>>>>>> on other tasks' values. I need to add a check that the value to be
>>>>>>>>>>> null'ified is actually still ours.
>>>>>>>>>>
>>>>>>>>>> Thinking again, that's actually not a problem: gktarget is only 
>>>>>>>>>> needed
>>>>>>>>>> while gksync is zero - but then we won't get hold of it anyway and,
>>>>>>>>>> thus, can't cause any damage.
>>>>>>>>>
>>>>>>>>> Well, you make it look like it does not work. From what I understand,
>>>>>>>>> what you want is to set gktarget to null if a task being hardened is
>>>>>>>>> destroyed. But by waiting for the semaphore, you actually wait for the
>>>>>>>>> harden to be complete, so setting to NULL is useless. Or am I missing
>>>>>>>>> something else?
>>>>>>>>
>>>>>>>> Setting to NULL is probably unneeded but still better than rely on the
>>>>>>>> gatekeeper never waking up spuriously and then dereferencing a stale
>>>>>>>> pointer.
>>>>>>>>
>>>>>>>> The key element of this fix is waiting on gksync, thus on the completion
>>>>>>>> of the non-RT part of the hardening. Actually, this part usually fails
>>>>>>>> as the target task received a termination signal at this point.
>>>>>>>
>>>>>>> Yes, but since you wait on the completion of the h

Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-12 Thread Jan Kiszka
On 2011-07-12 08:41, Gilles Chanteperdrix wrote:
> On 07/11/2011 10:12 PM, Jan Kiszka wrote:
>> On 2011-07-11 22:09, Gilles Chanteperdrix wrote:
>>> On 07/11/2011 10:06 PM, Jan Kiszka wrote:
>>>> On 2011-07-11 22:02, Gilles Chanteperdrix wrote:
>>>>> On 07/11/2011 09:59 PM, Jan Kiszka wrote:
>>>>>> On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
>>>>>>> On 07/11/2011 09:16 PM, Jan Kiszka wrote:
>>>>>>>> On 2011-07-11 21:10, Jan Kiszka wrote:
>>>>>>>>> On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
>>>>>>>>>> On 07/08/2011 06:29 PM, GIT version control wrote:
>>>>>>>>>>> @@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct 
>>>>>>>>>>> task_struct *p)
>>>>>>>>>>> magic = xnthread_get_magic(thread);
>>>>>>>>>>>  
>>>>>>>>>>> xnlock_get_irqsave(&nklock, s);
>>>>>>>>>>> +
>>>>>>>>>>> +   gksched = thread->gksched;
>>>>>>>>>>> +   if (gksched) {
>>>>>>>>>>> +   xnlock_put_irqrestore(&nklock, s);
>>>>>>>>>>
>>>>>>>>>> Are we sure irqs are on here? Are you sure that what is needed is 
>>>>>>>>>> not an
>>>>>>>>>> xnlock_clear_irqon?
>>>>>>>>>
>>>>>>>>> We are in the context of do_exit. Not only IRQs are on, also 
>>>>>>>>> preemption.
>>>>>>>>> And surely no nklock is held.
>>>>>>>>>
>>>>>>>>>> Furthermore, I do not understand how we
>>>>>>>>>> "synchronize" with the gatekeeper, how is the gatekeeper guaranteed to
>>>>>>>>>> wait for this assignment?
>>>>>>>>>
>>>>>>>>> The gatekeeper holds the gksync token while it's active. We request 
>>>>>>>>> it,
>>>>>>>>> thus we wait for the gatekeeper to become idle again. While it is 
>>>>>>>>> idle,
>>>>>>>>> we reset the queued reference - but I just realized that this may 
>>>>>>>>> trample
>>>>>>>>> on other tasks' values. I need to add a check that the value to be
>>>>>>>>> null'ified is actually still ours.
>>>>>>>>
>>>>>>>> Thinking again, that's actually not a problem: gktarget is only needed
>>>>>>>> while gksync is zero - but then we won't get hold of it anyway and,
>>>>>>>> thus, can't cause any damage.
>>>>>>>
>>>>>>> Well, you make it look like it does not work. From what I understand,
>>>>>>> what you want is to set gktarget to null if a task being hardened is
>>>>>>> destroyed. But by waiting for the semaphore, you actually wait for the
>>>>>>> harden to be complete, so setting to NULL is useless. Or am I missing
>>>>>>> something else?
>>>>>>
>>>>>> Setting to NULL is probably unneeded but still better than rely on the
>>>>>> gatekeeper never waking up spuriously and then dereferencing a stale
>>>>>> pointer.
>>>>>>
>>>>>> The key element of this fix is waiting on gksync, thus on the completion
>>>>>> of the non-RT part of the hardening. Actually, this part usually fails
>>>>>> as the target task received a termination signal at this point.
>>>>>
>>>>> Yes, but since you wait on the completion of the hardening, the test
>>>>> if (target &&...) in the gatekeeper code will always be true, because at
>>>>> this point the cleanup code will still be waiting for the semaphore.
>>>>
>>>> Yes, except if we ever wake up the gatekeeper later on without an
>>>> updated gktarget, i.e. spuriously. Better safe than sorry, this is hairy
>>>> code anyway (hopefully obsolete one day).
>>>
>>> The gatekeeper is not woken up by posting the semaphore, the gatekeeper
>>> is woken up by the thread which is going to be hardened (and this thread
>>> is the one which waits for the semaphore).
>>
>> All true. And what is the point?
> 
> The point being, would not something like this patch be sufficient?
> 
> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> index 01f4200..4742c02 100644
> --- a/ksrc/nucleus/shadow.c
> +++ b/ksrc/nucleus/shadow.c
> @@ -2527,6 +2527,18 @@ static inline void do_taskexit_event(struct
> task_struct *p)
>   magic = xnthread_get_magic(thread);
> 
>   xnlock_get_irqsave(&nklock, s);
> + if (xnthread_test_info(thread, XNATOMIC)) {
> + struct xnsched *gksched = xnpod_sched_slot(task_cpu(p));

That's not reliable; the task might have been migrated by Linux in the
meantime. We must use the stored gksched.

> + xnlock_put_irqrestore(&nklock, s);
> +
> + /* Thread is in flight to primary mode, wait for the
> +gatekeeper to be done with it. */
> + down(&gksched->gksync);
> + up(&gksched->gksync);
> +
> + xnlock_get_irqsave(&nklock, s);
> + }
> +
>   /* Prevent wakeup call from xnshadow_unmap(). */
>   xnshadow_thrptd(p) = NULL;
>   xnthread_archtcb(thread)->user_task = NULL;
> 

Again, setting gktarget to NULL and testing for NULL is simply safer,
and I see no gain in skipping that. But if you prefer the
micro-optimization, I'll drop it.
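
As a sketch of the defensive pattern argued for here - invented names,
modeling only the NULL handshake, not the real gatekeeper:

#include <stddef.h>
#include <stdio.h>

struct thread_model { const char *name; };

static struct thread_model *gktarget; /* stand-in for sched->gktarget */

/* Exit path, after waiting on gksync: the gatekeeper is idle, so the
 * queued reference can be retired - but only if it is still ours. */
static void taskexit_clear(struct thread_model *me)
{
    if (gktarget == me)
        gktarget = NULL;
}

/* Gatekeeper body on a (possibly spurious) wakeup. */
static void gatekeeper_run(void)
{
    struct thread_model *target = gktarget;

    if (target == NULL) {
        printf("gatekeeper: no target, ignoring wakeup\n");
        return;
    }
    printf("gatekeeper: hardening %s\n", target->name);
}

int main(void)
{
    struct thread_model t = { "task" };

    gktarget = &t;
    taskexit_clear(&t); /* task exits before the gatekeeper ever runs */
    gatekeeper_run();   /* a spurious wakeup now hits NULL, not a stale
                           pointer */
    return 0;
}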

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-11 Thread Jan Kiszka
On 2011-07-11 22:09, Gilles Chanteperdrix wrote:
> On 07/11/2011 10:06 PM, Jan Kiszka wrote:
>> On 2011-07-11 22:02, Gilles Chanteperdrix wrote:
>>> On 07/11/2011 09:59 PM, Jan Kiszka wrote:
>>>> On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
>>>>> On 07/11/2011 09:16 PM, Jan Kiszka wrote:
>>>>>> On 2011-07-11 21:10, Jan Kiszka wrote:
>>>>>>> On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
>>>>>>>> On 07/08/2011 06:29 PM, GIT version control wrote:
>>>>>>>>> @@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct 
>>>>>>>>> task_struct *p)
>>>>>>>>>   magic = xnthread_get_magic(thread);
>>>>>>>>>  
>>>>>>>>>   xnlock_get_irqsave(&nklock, s);
>>>>>>>>> +
>>>>>>>>> + gksched = thread->gksched;
>>>>>>>>> + if (gksched) {
>>>>>>>>> + xnlock_put_irqrestore(&nklock, s);
>>>>>>>>
>>>>>>>> Are we sure irqs are on here? Are you sure that what is needed is not 
>>>>>>>> an
>>>>>>>> xnlock_clear_irqon?
>>>>>>>
>>>>>>> We are in the context of do_exit. Not only IRQs are on, also preemption.
>>>>>>> And surely no nklock is held.
>>>>>>>
>>>>>>>> Furthermore, I do not understand how we
>>>>>>>> "synchronize" with the gatekeeper, how is the gatekeeper guaranteed to
>>>>>>>> wait for this assignment?
>>>>>>>
>>>>>>> The gatekeeper holds the gksync token while it's active. We request it,
>>>>>>> thus we wait for the gatekeeper to become idle again. While it is idle,
>>>>>>> we reset the queued reference - but I just realized that this may trample
>>>>>>> on other tasks' values. I need to add a check that the value to be
>>>>>>> null'ified is actually still ours.
>>>>>>
>>>>>> Thinking again, that's actually not a problem: gktarget is only needed
>>>>>> while gksync is zero - but then we won't get hold of it anyway and,
>>>>>> thus, can't cause any damage.
>>>>>
>>>>> Well, you make it look like it does not work. From what I understand,
>>>>> what you want is to set gktarget to null if a task being hardened is
>>>>> destroyed. But by waiting for the semaphore, you actually wait for the
>>>>> harden to be complete, so setting to NULL is useless. Or am I missing
>>>>> something else?
>>>>
>>>> Setting to NULL is probably unneeded but still better than rely on the
>>>> gatekeeper never waking up spuriously and then dereferencing a stale
>>>> pointer.
>>>>
>>>> The key element of this fix is waiting on gksync, thus on the completion
>>>> of the non-RT part of the hardening. Actually, this part usually fails
>>>> as the target task received a termination signal at this point.
>>>
>>> Yes, but since you wait on the completion of the hardening, the test
>>> if (target &&...) in the gatekeeper code will always be true, because at
>>> this point the cleanup code will still be waiting for the semaphore.
>>
>> Yes, except if we ever wake up the gatekeeper later on without an
>> updated gktarget, i.e. spuriously. Better safe than sorry, this is hairy
>> code anyway (hopefully obsolete one day).
> 
> The gatekeeper is not woken up by posting the semaphore, the gatekeeper
> is woken up by the thread which is going to be hardened (and this thread
> is the one which waits for the semaphore).

All true. And what is the point?

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-11 Thread Jan Kiszka
On 2011-07-11 22:02, Gilles Chanteperdrix wrote:
> On 07/11/2011 09:59 PM, Jan Kiszka wrote:
>> On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
>>> On 07/11/2011 09:16 PM, Jan Kiszka wrote:
>>>> On 2011-07-11 21:10, Jan Kiszka wrote:
>>>>> On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
>>>>>> On 07/08/2011 06:29 PM, GIT version control wrote:
>>>>>>> @@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct 
>>>>>>> task_struct *p)
>>>>>>> magic = xnthread_get_magic(thread);
>>>>>>>  
>>>>>>> xnlock_get_irqsave(&nklock, s);
>>>>>>> +
>>>>>>> +   gksched = thread->gksched;
>>>>>>> +   if (gksched) {
>>>>>>> +   xnlock_put_irqrestore(&nklock, s);
>>>>>>
>>>>>> Are we sure irqs are on here? Are you sure that what is needed is not an
>>>>>> xnlock_clear_irqon?
>>>>>
>>>>> We are in the context of do_exit. Not only IRQs are on, also preemption.
>>>>> And surely no nklock is held.
>>>>>
>>>>>> Furthermore, I do not understand how we
>>>>>> "synchronize" with the gatekeeper, how is the gatekeeper guaranteed to
>>>>>> wait for this assignment?
>>>>>
>>>>> The gatekeeper holds the gksync token while it's active. We request it,
>>>>> thus we wait for the gatekeeper to become idle again. While it is idle,
>>>>> we reset the queued reference - but I just realized that this may trample
>>>>> on other tasks' values. I need to add a check that the value to be
>>>>> null'ified is actually still ours.
>>>>
>>>> Thinking again, that's actually not a problem: gktarget is only needed
>>>> while gksync is zero - but then we won't get hold of it anyway and,
>>>> thus, can't cause any damage.
>>>
>>> Well, you make it look like it does not work. From what I understand,
>>> what you want is to set gktarget to null if a task being hardened is
>>> destroyed. But by waiting for the semaphore, you actually wait for the
>>> harden to be complete, so setting to NULL is useless. Or am I missing
>>> something else?
>>
>> Setting to NULL is probably unneeded but still better than rely on the
>> gatekeeper never waking up spuriously and then dereferencing a stale
>> pointer.
>>
>> The key element of this fix is waiting on gksync, thus on the completion
>> of the non-RT part of the hardening. Actually, this part usually fails
>> as the target task received a termination signal at this point.
> 
> Yes, but since you wait on the completion of the hardening, the test
> if (target &&...) in the gatekeeper code will always be true, because at
> this point the cleanup code will still be waiting for the semaphore.

Yes, except if we ever wake up the gatekeeper later on without an
updated gktarget, i.e. spuriously. Better safe than sorry, this is hairy
code anyway (hopefully obsolete one day).

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-11 Thread Jan Kiszka
On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
> On 07/11/2011 09:16 PM, Jan Kiszka wrote:
>> On 2011-07-11 21:10, Jan Kiszka wrote:
>>> On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
>>>> On 07/08/2011 06:29 PM, GIT version control wrote:
>>>>> @@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct 
>>>>> task_struct *p)
>>>>>   magic = xnthread_get_magic(thread);
>>>>>  
>>>>>   xnlock_get_irqsave(&nklock, s);
>>>>> +
>>>>> + gksched = thread->gksched;
>>>>> + if (gksched) {
>>>>> + xnlock_put_irqrestore(&nklock, s);
>>>>
>>>> Are we sure irqs are on here? Are you sure that what is needed is not an
>>>> xnlock_clear_irqon?
>>>
>>> We are in the context of do_exit. Not only IRQs are on, also preemption.
>>> And surely no nklock is held.
>>>
>>>> Furthermore, I do not understand how we
>>>> "synchronize" with the gatekeeper, how is the gatekeeper guaranteed to
>>>> wait for this assignment?
>>>
>>> The gatekeeper holds the gksync token while it's active. We request it,
>>> thus we wait for the gatekeeper to become idle again. While it is idle,
>>> we reset the queued reference - but I just realized that this may trample
>>> on other tasks' values. I need to add a check that the value to be
>>> null'ified is actually still ours.
>>
>> Thinking again, that's actually not a problem: gktarget is only needed
>> while gksync is zero - but then we won't get hold of it anyway and,
>> thus, can't cause any damage.
> 
> Well, you make it look like it does not work. From what I understand,
> what you want is to set gktarget to null if a task being hardened is
> destroyed. But by waiting for the semaphore, you actually wait for the
> harden to be complete, so setting to NULL is useless. Or am I missing
> something else?

Setting to NULL is probably unneeded but still better than rely on the
gatekeeper never waking up spuriously and then dereferencing a stale
pointer.

The key element of this fix is waiting on gksync, thus on the completion
of the non-RT part of the hardening. Actually, this part usually fails
as the target task received a termination signal at this point.

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-11 Thread Jan Kiszka
On 2011-07-11 21:10, Jan Kiszka wrote:
> On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
>> On 07/08/2011 06:29 PM, GIT version control wrote:
>>> @@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct 
>>> task_struct *p)
>>> magic = xnthread_get_magic(thread);
>>>  
>>> xnlock_get_irqsave(&nklock, s);
>>> +
>>> +   gksched = thread->gksched;
>>> +   if (gksched) {
>>> +   xnlock_put_irqrestore(&nklock, s);
>>
>> Are we sure irqs are on here? Are you sure that what is needed is not an
>> xnlock_clear_irqon?
> 
> We are in the context of do_exit. Not only IRQs are on, also preemption.
> And surely no nklock is held.
> 
>> Furthermore, I do not understand how we
>> "synchronize" with the gatekeeper, how is the gatekeeper garanteed to
>> wait for this assignment?
> 
> The gatekeeper holds the gksync token while it's active. We request it,
> thus we wait for the gatekeeper to become idle again. While it is idle,
> we reset the queued reference - but I just realized that this may trample
> on other tasks' values. I need to add a check that the value to be
> null'ified is actually still ours.

Thinking again, that's actually not a problem: gktarget is only needed
while gksync is zero - but then we won't get hold of it anyway and,
thus, can't cause any damage.

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-11 Thread Jan Kiszka
On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
> On 07/08/2011 06:29 PM, GIT version control wrote:
>> @@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct 
>> task_struct *p)
>>  magic = xnthread_get_magic(thread);
>>  
>>  xnlock_get_irqsave(&nklock, s);
>> +
>> +gksched = thread->gksched;
>> +if (gksched) {
>> +xnlock_put_irqrestore(&nklock, s);
> 
> Are we sure irqs are on here? Are you sure that what is needed is not an
> xnlock_clear_irqon?

We are in the context of do_exit. Not only IRQs are on, also preemption.
And surely no nklock is held.

> Furthermore, I do not understand how we
> "synchronize" with the gatekeeper, how is the gatekeeper garanteed to
> wait for this assignment?

The gatekeeper holds the gksync token while it's active. We request it,
thus we wait for the gatekeeper to become idle again. While it is idle,
we reset the queued reference - but I just realized that this may trample
on other tasks' values. I need to add a check that the value to be
null'ified is actually still ours.
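
A small userspace model of this rendezvous, with a POSIX semaphore
simulating the kernel-side gksync token (the sleeps only stage the
interleaving for the demo):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

static sem_t gksync;

static void *gatekeeper(void *arg)
{
    (void)arg;
    sem_wait(&gksync);   /* become active: hold the token */
    printf("gatekeeper: migrating the target thread\n");
    usleep(100000);      /* the hardening work */
    sem_post(&gksync);   /* idle again: release the token */
    return NULL;
}

int main(void)
{
    pthread_t gk;

    sem_init(&gksync, 0, 1); /* token available: gatekeeper idle */
    pthread_create(&gk, NULL, gatekeeper, NULL);
    usleep(10000);           /* crude: let the gatekeeper grab the token */

    /* Exit path: down() then up() is a pure rendezvous - wait until the
     * gatekeeper is idle, keep nothing. */
    sem_wait(&gksync);
    sem_post(&gksync);
    printf("exit path: gatekeeper idle, safe to retire references\n");

    pthread_join(gk, NULL);
    sem_destroy(&gksync);
    return 0;
}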

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [PATCH 2/2] native: Fix error cleanup of rt_task_create

2011-06-30 Thread Jan Kiszka
On 2011-06-30 13:09, Gilles Chanteperdrix wrote:
> On 06/30/2011 11:36 AM, Jan Kiszka wrote:
>> When creation of a shadow task fails, rt_task_create has to free the
>> task object consistently, not only on registry errors. Then we need to
>> delete the core thread when fastlock allocation failed. Moreover, fix a
>> double free of the fastlock object which is now released via the delete
>> hook. Finally, avoid a use-after-release of the fastlock object in
>> __task_delete_hook.
>>
>> This fixes heap corruptions when running out of resources.
>>
>> Signed-off-by: Jan Kiszka 
>> ---
>> (...)
>> +
>> +  fail:
>> +if (xnthread_test_state(&task->thread_base, XNSHADOW))
>> +xnfree(task);
>> +
>> +return err;
>>  }
>>  
>>  /**
> 
> Is this needed? I mean, shadows are created in syscall.c, function
> __rt_task_create, and when rt_task_create returns an error, that
> function calls rt_task_delete. So, there should be no leak. And worse,
> here rt_task_delete will use an invalid pointer if we apply that patch.

The problem is that rt_task_create may both fail early without any
registration step performed yet and very late when everything (except
the registry entry) is already set up. There is no chance for the
caller __rt_task_create to figure out the precise task registration
state and properly roll it back. That must be done by rt_task_create.

This patch ensures that shadow tasks are either successfully registered
or completely deleted on error. Among other bug scenarios, this avoids
deleting the task twice, first via xnpod_delete_thread+deletion hook and
then again via xnfree in the error path of __rt_task_create.
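
The ownership rule can be sketched generically - invented names, nothing
Xenomai-specific: the creator either hands back a fully set-up object or
rolls back everything itself, so the caller never has to guess how far
setup got:

#include <stdio.h>
#include <stdlib.h>

struct obj { int registered; };

static int register_obj(struct obj *o)
{
    (void)o;
    return -1; /* simulate the late registration failure */
}

static int create_obj(struct obj **out)
{
    struct obj *o = malloc(sizeof(*o));

    if (o == NULL)
        return -1; /* early failure: nothing was set up yet */

    o->registered = 0;
    if (register_obj(o) != 0) {
        free(o);   /* late failure: creator undoes its own work */
        return -1;
    }
    o->registered = 1;
    *out = o;
    return 0;
}

int main(void)
{
    struct obj *o = NULL;

    if (create_obj(&o) != 0) {
        /* Caller-side: on error, nothing is left to free or delete. */
        printf("create failed, nothing to clean up\n");
        return 1;
    }
    free(o);
    return 0;
}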

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


[Xenomai-core] [PATCH 1/2] native: Do not acquire non-existent MPS fastlock

2011-06-30 Thread Jan Kiszka
Fix a build warning at this chance as well.

Signed-off-by: Jan Kiszka 
---

Applies on top of xenomai-gch.git gch/u_mode

 ksrc/skins/native/task.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/ksrc/skins/native/task.c b/ksrc/skins/native/task.c
index da4e626..af62d56 100644
--- a/ksrc/skins/native/task.c
+++ b/ksrc/skins/native/task.c
@@ -256,7 +256,9 @@ void __native_task_pkg_cleanup(void)
 int rt_task_create(RT_TASK *task,
   const char *name, int stksize, int prio, int mode)
 {
+#if defined(CONFIG_XENO_OPT_NATIVE_MPS) && defined(CONFIG_XENO_FASTSYNCH)
xnarch_atomic_t *fastlock = NULL;
+#endif
union xnsched_policy_param param;
struct xnthread_init_attr attr;
int err = 0, cpumask, cpu;
@@ -336,8 +338,11 @@ int rt_task_create(RT_TASK *task,
xnheap_free(&xnsys_ppd_get(0)->sem_heap, fastlock);
 #endif
xnpod_delete_thread(&task->thread_base);
-   } else
+   }
+#if defined(CONFIG_XENO_OPT_NATIVE_MPS) && defined(CONFIG_XENO_FASTSYNCH)
+   else
xnsynch_fast_acquire(fastlock, 
xnthread_handle(&task->thread_base));
+#endif
 
return err;
 }
-- 
1.7.1

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


[Xenomai-core] [PATCH 2/2] native: Fix error cleanup of rt_task_create

2011-06-30 Thread Jan Kiszka
When creation of a shadow task fails, rt_task_create has to free the
task object consistently, not only on registry errors. Then we need to
delete the core thread when fastlock allocation failed. Moreover, fix a
double free of the fastlock object which is now released via the delete
hook. Finally, avoid a use-after-release of the fastlock object in
__task_delete_hook.

This fixes heap corruptions when running out of resources.

Signed-off-by: Jan Kiszka 
---

Applies on top of xenomai-gch.git gch/u_mode

 ksrc/skins/native/syscall.c |4 ++-
 ksrc/skins/native/task.c|   44 --
 2 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/ksrc/skins/native/syscall.c b/ksrc/skins/native/syscall.c
index 17071da..b21705a 100644
--- a/ksrc/skins/native/syscall.c
+++ b/ksrc/skins/native/syscall.c
@@ -184,8 +184,10 @@ static int __rt_task_create(struct pt_regs *regs)
   the platform does not support it. */
 
err = rt_task_create(task, name, 0, prio, XNFPU | XNSHADOW | mode);
-   if (err)
+   if (err) {
+   task = NULL;
goto fail;
+   }
 
/* Apply CPU affinity */
set_cpus_allowed(p, task->affinity);
diff --git a/ksrc/skins/native/task.c b/ksrc/skins/native/task.c
index af62d56..c6e9a22 100644
--- a/ksrc/skins/native/task.c
+++ b/ksrc/skins/native/task.c
@@ -74,16 +74,14 @@ static void __task_delete_hook(xnthread_t *thread)
 
task = thread2rtask(thread);
 
-#if defined(CONFIG_XENO_OPT_NATIVE_MPS) && defined(CONFIG_XENO_FASTSYNCH)
-   xnheap_free(&xnsys_ppd_get(0)->sem_heap,
-   task->msendq.fastlock);
-#endif
-
 #ifdef CONFIG_XENO_OPT_NATIVE_MPS
/* The nucleus will reschedule as needed when all the deletion
   hooks are done. */
xnsynch_destroy(&task->mrecv);
xnsynch_destroy(&task->msendq);
+#ifdef CONFIG_XENO_FASTSYNCH
+   xnheap_free(&xnsys_ppd_get(0)->sem_heap, task->msendq.fastlock);
+#endif
 #endif /* CONFIG_XENO_OPT_NATIVE_MPS */
 
xnsynch_destroy(&task->safesynch);
@@ -265,11 +263,15 @@ int rt_task_create(RT_TASK *task,
xnflags_t bflags;
spl_t s;
 
-   if (prio < T_LOPRIO || prio > T_HIPRIO)
-   return -EINVAL;
+   if (prio < T_LOPRIO || prio > T_HIPRIO) {
+   err = -EINVAL;
+   goto fail;
+   }
 
-   if (xnpod_asynch_p())
-   return -EPERM;
+   if (xnpod_asynch_p()) {
+   err = -EPERM;
+   goto fail;
+   }
 
bflags = mode & (XNFPU | XNSHADOW | XNSUSP);
 
@@ -283,10 +285,10 @@ int rt_task_create(RT_TASK *task,
attr.stacksize = stksize;
param.rt.prio = prio;
 
-   if (xnpod_init_thread(&task->thread_base,
- &attr, &xnsched_class_rt, &param) != 0)
-   /* Assume this is the only possible failure. */
-   return -ENOMEM;
+   err = xnpod_init_thread(&task->thread_base, &attr, &xnsched_class_rt,
+   &param);
+   if (err)
+   goto fail;
 
inith(&task->link);
task->suspend_depth = (bflags & XNSUSP) ? 1 : 0;
@@ -310,8 +312,10 @@ int rt_task_create(RT_TASK *task,
/* Allocate lock memory for in-kernel use */
fastlock = xnheap_alloc(&xnsys_ppd_get(0)->sem_heap,
sizeof(*fastlock));
-   if (fastlock == NULL)
+   if (fastlock == NULL) {
+   xnpod_delete_thread(&task->thread_base);
return -ENOMEM;
+   }
 #endif /* CONFIG_XENO_FASTSYNCH */
xnsynch_init(&task->mrecv, XNSYNCH_FIFO, NULL);
/*
@@ -333,18 +337,20 @@ int rt_task_create(RT_TASK *task,
   half-baked objects... */
 
err = xnthread_register(&task->thread_base, name ? task->rname : "");
-   if (err) {
-#if defined(CONFIG_XENO_OPT_NATIVE_MPS) && defined(CONFIG_XENO_FASTSYNCH)
-   xnheap_free(&xnsys_ppd_get(0)->sem_heap, fastlock);
-#endif
+   if (err)
xnpod_delete_thread(&task->thread_base);
-   }
 #if defined(CONFIG_XENO_OPT_NATIVE_MPS) && defined(CONFIG_XENO_FASTSYNCH)
else
xnsynch_fast_acquire(fastlock, 
xnthread_handle(&task->thread_base));
 #endif
 
return err;
+
+  fail:
+   if (xnthread_test_state(&task->thread_base, XNSHADOW))
+   xnfree(task);
+
+   return err;
 }
 
 /**
-- 
1.7.1

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Allow drop_u_mode syscall from any context

2011-06-29 Thread Jan Kiszka
On 2011-06-29 09:25, Gilles Chanteperdrix wrote:
> On 06/29/2011 09:06 AM, Jan Kiszka wrote:
>> On 2011-06-28 23:29, Gilles Chanteperdrix wrote:
>>> On 06/28/2011 11:01 PM, GIT version control wrote:
>>>> Module: xenomai-jki
>>>> Branch: for-upstream
>>>> Commit: 5597470d84584846875e8a35309e6302c768addf
>>>> URL:
>>>> http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=5597470d84584846875e8a35309e6302c768addf
>>>>
>>>> Author: Jan Kiszka 
>>>> Date:   Tue Jun 28 22:10:07 2011 +0200
>>>>
>>>> nucleus: Allow drop_u_mode syscall from any context
>>>>
>>>> xnshadow_sys_drop_u_mode already checks if the caller is a shadow. It
>>>> does that without issuing a warning message if the check fails - in
>>>> contrast to do_hisyscall_event. As user space may call this cleanup
>>>> service even for non-shadow threads (e.g. after shadow creation failed),
>>>> we better silence this warning.
>>>>
>>>> Signed-off-by: Jan Kiszka 
>>>
>>> Jan,
>>>
>>> I have a branch here, which allocates u_mode in the shared heap, so,
>>> this syscall is about to become unnecessary.
>>>
>>> See:
>>> http://git.xenomai.org/?p=xenomai-gch.git;a=shortlog;h=refs/heads/u_mode
>>
>> Even better. When do you plan to merge all this? I'd like to finally fix
>> the various MPS fastlock breakages, specifically as they overlap with
>> other issues there.
> 
> I would like to give a chance to Philippe to have a look at it before it
> is merged (especially the commit using an adeos ptd).

OK.

> 
> Normally, the MPS fastlock issues are solved in this branch.

Only the previously discussed leak. MPS-disabled is still oopsing, and
error cleanup is also broken - but not only for MPS.

I'll base my fixes on top of your branch.

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Allow drop_u_mode syscall from any context

2011-06-29 Thread Jan Kiszka
On 2011-06-28 23:29, Gilles Chanteperdrix wrote:
> On 06/28/2011 11:01 PM, GIT version control wrote:
>> Module: xenomai-jki
>> Branch: for-upstream
>> Commit: 5597470d84584846875e8a35309e6302c768addf
>> URL:
>> http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=5597470d84584846875e8a35309e6302c768addf
>>
>> Author: Jan Kiszka 
>> Date:   Tue Jun 28 22:10:07 2011 +0200
>>
>> nucleus: Allow drop_u_mode syscall from any context
>>
>> xnshadow_sys_drop_u_mode already checks if the caller is a shadow. It
>> does that without issuing a warning message if the check fails - in
>> contrast to do_hisyscall_event. As user space may call this cleanup
>> service even for non-shadow threads (e.g. after shadow creation failed),
>> we better silence this warning.
>>
>> Signed-off-by: Jan Kiszka 
> 
> Jan,
> 
> I have a branch here, which allocates u_mode in the shared heap, so,
> this syscall is about to become unnecessary.
> 
> See:
> http://git.xenomai.org/?p=xenomai-gch.git;a=shortlog;h=refs/heads/u_mode

Even better. When do you plan to merge all this? I'd like to finally fix
the various MPS fastlock breakages, specifically as they overlap with
other issues there.

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] posix: only use rt_printf when in primary mode

2011-06-28 Thread Jan Kiszka
On 2011-06-28 22:48, Jan Kiszka wrote:
> Hi Gilles,
> 
> dynamic printf backend selection is a suboptimal idea when you want
> chronological output (not that uncommon during debugging). The native
> printf may actually hit the screen (or the log file) before some earlier
> queued rt_printf message because the dump thread is busy or has a lower
> priority.

Oh, I missed the buffer flush - that should work.

Sorry for the noise.

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] posix: only use rt_printf when in primary mode

2011-06-28 Thread Jan Kiszka
Hi Gilles,

dynamic printf backend selection is a suboptimal idea when you want
chronological output (not that uncommon during debugging). The native
printf may actually hit the screen (or the log file) before some earlier
queued rt_printf message because the dump thread is busy or has a lower
priority.

It's OK to do native printf for non-shadows. But once a thread has been
mapped, printf needs to go through the rt queue, even in secondary mode.
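
A minimal sketch of that routing rule; is_mapped_shadow() and the
rt_printf stub are placeholders invented for the example, not the actual
library interface:

#include <stdio.h>

/* Hypothetical: would ask the Xenomai runtime whether the current
 * thread has been mapped as a shadow. */
static int is_mapped_shadow(void)
{
    return 0;
}

/* Stand-in for rt_printf(): would queue the message for the printer
 * thread instead of printing directly. */
static void rt_printf_stub(const char *msg)
{
    printf("[rt queue] %s", msg);
}

static void app_log(const char *msg)
{
    if (is_mapped_shadow())
        rt_printf_stub(msg); /* mapped: always via the rt queue, even in
                                secondary mode, to keep output ordered */
    else
        printf("%s", msg);   /* plain Linux thread: native printf */
}

int main(void)
{
    app_log("hello from a non-shadow thread\n");
    return 0;
}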

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core

