Re: [Xenomai-core] [RFC 0/1] Class driver for raw Ethernet packets
On Friday 23 September 2011 13:02:19 Richard Cochran wrote:

This patch adds a class driver for raw Ethernet drivers under Xenomai. The goal is to support industrial protocols such as EtherCAT and IEC 61850, where the stack is a user space program needing direct access at the packet level. The class driver offers interfaces for registration, buffer management, and packet sending/receiving.

Although this patch is only a first draft, I already have it working on the Freescale P2020 with a real world application, with very good results. I can post a patch series for the gianfar driver in the ipipe tree, if anyone is interested.

The user space interface is a character device and not a socket, simply because my applications will probably never need fancy socket options. The class driver could surely be made to offer a socket instead, but I think the character device is sufficient.

The class driver is clearly in the wrong directory within the source tree; I put it there just to get started. It really does not fit with any of the other drivers, so it would probably need its own place under ksrc/drivers.

Thanks in advance for your comments.

How does this relate to rtnet, i.e. why didn't you write an rtnet driver?

Peter

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-help] Xenomai v2.5.0 -- Important Notice
On Friday 01 January 2010 12:00:52 Philippe Gerum wrote:

Here is Xenomai 2.5.0. ... http://download.gna.org/xenomai/stable/xenomai-2.5.0.tar.bz2 Happy new year, btw.

A Big Thank You from the user community to all contributors, Philippe, Gilles and Jan in particular, seems appropriate here, but it hasn't been spoken even after a week. Therefore, I'll unilaterally, and only this once, assume the role of spokesman of this world-wide fan club and do the honors:

A Big Thank You, and a happy new year to you too!

Those who accord, say "Hear, Hear", "+1" or "Me too", depending on their geographical or social situation.

The Xenomai User Community.
Re: [Xenomai-core] select: native tasks with posix skin mqueues
On Mon, Nov 30, 2009 at 15:20, Peter Soetens pe...@thesourceworks.com wrote:
On Thu, Nov 5, 2009 at 02:46, Gilles Chanteperdrix gilles.chanteperd...@xenomai.org wrote:
Peter Soetens wrote:

Hi, I'm creating my RT threads using the native API and I'm creating mqueues, wrapped to the pthread_rt library. I can read and write the mqueue (and it goes through Xenomai), but when I select() on a receiving mqd_t, the select() call returns that there is data available on the mq (it fills in the FD_SET), but keeps doing so even when it's empty (the select() is in a loop). Also, it's mode-switching like nuts.

I found out that __wrap_select is correctly called, but returns -EPERM. Kernel sources indicate that this is caused by pse51_current_thread(), alias thread2pthread(), returning null. Since EPERM is returned to userspace, __real_select is called from user space, causing the mode switches and bad behaviour. This is almost certainly what native + RTDM + select() is seeing too. My mqueues-only setup probably works because mq.c only uses pse51_current_thread() in the mq_notify function. I'm guessing that mq_notify would also not work in combination with the native skin.

I had two options for fixing this: add a xnselector to the native task struct or to the nucleus xnthread_t. I chose the latter, such that every skin can use select() + RTDM and migrate gradually to the RTDM and/or Posix skin. I needed to free the xnselector structure in xnpod_delete_thread(); I chose a spot, but it causes a segfault in my native thread (which did the select) during program cleanup. Any advice?

Also, maybe we should separate select() from the posix skin and put it in a separate place (in RTDM as rtdm_select()?), such that we can start building around it (posix just forwards to rtdm_select() then).

A second patch was necessary to return the timeout case properly to userspace (independent of the first patch). Tested with native + posix loaded and mq.
If you never quit your application, this works :-)

Hi, I have included a lightly modified version of this patch on head; I do not see any crash. However, I have some doubts about the current implementation: calling xnselector_destroy() opens opportunities for a rescheduling, which I am not sure is really what we want in the middle of xnpod_delete_thread(). Philippe, what do you think?

I'll test this patch this week too. It seems you forgot to apply the second patch, which can go straight into the 2.4 head, since it fixes a bug in the select() wrapping code when timeouts are used. See the parent's 0002-posix-Fix-__wrap_select-when-timeout-happens.patch.

I tested the 2.5 branch with this patch and it works here too.

Peter
[Xenomai-core] Vague bug report(s) on 2.5-head
Hi guys, I'm using the master branch, 4f42de74, with Linux 2.6.31.1 and the adeos patch from that tree. My app links with both -lnative and -lposix with the wrappers (using xeno-config).

I'm experiencing a segfault within pthread_cancel() when calling rt_task_delete(task) on the main() thread (so it deletes its own task), which was init'ed with rt_task_shadow(). When I omit the delete call, the application terminates cleanly. When the app doesn't link with the posix wrappers, there's no segfault either. I didn't have this behaviour 'before' (2.4.10). I don't have crashes when deleting normal RT threads created with rt_task_create.

Program received signal SIGSEGV, Segmentation fault.
pthread_cancel (th=1719432289) at pthread_cancel.c:35
35  pthread_cancel.c: No such file or directory.
    in pthread_cancel.c
Current language: auto
The current source language is "auto; currently c".
(gdb) bt
#0  pthread_cancel (th=1719432289) at pthread_cancel.c:35
#1  0x74579a94 in rt_task_delete () from /usr/lib/libnative.so.3
#2  0x7766e270 in RTT::os::rtos_task_delete_main (main_task=0x82ab90) at /home/kaltan/src/git/orocos-rtt/src/os/xenomai/fosi_internal.cpp:165
#3  0x77669d1d in ~MainThread (this=0x82ab80, __in_chrg=value optimized out) at /home/kaltan/src/git/orocos-rtt/src/os/MainThread.cpp:55

Maybe related: I also get 'bogus' segfaults in relaxed native task threads when using the CORBA TAO library. Sometimes a 'throw' statement causes a segv, sometimes something else causes it. Exactly the same code running on plain gnulinux is not a problem; the code is clearly correct. It's also not a problem in 2.4.10, but I can't currently (while writing this email) reproduce it. I'll keep you posted on this.

I had these problems first in virtual machines (Sun VBox), but I could reproduce them on the host too. Please tell me if I should refrain from testing in a VM guest.

Peter
Re: [Xenomai-core] select: native tasks with posix skin mqueues
On Thu, Nov 5, 2009 at 02:46, Gilles Chanteperdrix gilles.chanteperd...@xenomai.org wrote:
Peter Soetens wrote:

Hi, I'm creating my RT threads using the native API and I'm creating mqueues, wrapped to the pthread_rt library. I can read and write the mqueue (and it goes through Xenomai), but when I select() on a receiving mqd_t, the select() call returns that there is data available on the mq (it fills in the FD_SET), but keeps doing so even when it's empty (the select() is in a loop). Also, it's mode-switching like nuts.

I found out that __wrap_select is correctly called, but returns -EPERM. Kernel sources indicate that this is caused by pse51_current_thread(), alias thread2pthread(), returning null. Since EPERM is returned to userspace, __real_select is called from user space, causing the mode switches and bad behaviour. This is almost certainly what native + RTDM + select() is seeing too. My mqueues-only setup probably works because mq.c only uses pse51_current_thread() in the mq_notify function. I'm guessing that mq_notify would also not work in combination with the native skin.

I had two options for fixing this: add a xnselector to the native task struct or to the nucleus xnthread_t. I chose the latter, such that every skin can use select() + RTDM and migrate gradually to the RTDM and/or Posix skin. I needed to free the xnselector structure in xnpod_delete_thread(); I chose a spot, but it causes a segfault in my native thread (which did the select) during program cleanup. Any advice?

Also, maybe we should separate select() from the posix skin and put it in a separate place (in RTDM as rtdm_select()?), such that we can start building around it (posix just forwards to rtdm_select() then).

A second patch was necessary to return the timeout case properly to userspace (independent of the first patch). Tested with native + posix loaded and mq.
If you never quit your application, this works :-)

Hi, I have included a lightly modified version of this patch on head; I do not see any crash. However, I have some doubts about the current implementation: calling xnselector_destroy() opens opportunities for a rescheduling, which I am not sure is really what we want in the middle of xnpod_delete_thread(). Philippe, what do you think?

I'll test this patch this week too. It seems you forgot to apply the second patch, which can go straight into the 2.4 head, since it fixes a bug in the select() wrapping code when timeouts are used. See the parent's 0002-posix-Fix-__wrap_select-when-timeout-happens.patch.

Peter
Re: [Xenomai-core] [Xenomai-help] select: native tasks with posix skin mqueues
On Thu, Oct 1, 2009 at 17:34, Gilles Chanteperdrix gilles.chanteperd...@xenomai.org wrote:
Peter Soetens wrote:
On Thu, Oct 1, 2009 at 16:47, Gilles Chanteperdrix gilles.chanteperd...@xenomai.org wrote:
Peter Soetens wrote:

Hi, I'm creating my RT threads using the native API and I'm creating mqueues, wrapped to the pthread_rt library. I can read and write the mqueue (and it goes through Xenomai), but when I select() on a receiving mqd_t, the select() call returns that there is data available on the mq (it fills in the FD_SET), but keeps doing so even when it's empty (the select() is in a loop). Also, it's mode-switching like nuts.

I found out that __wrap_select is correctly called, but returns -EPERM. Kernel sources indicate that this is caused by pse51_current_thread(), alias thread2pthread(), returning null. Since EPERM is returned to userspace, __real_select is called from user space, causing the mode switches and bad behaviour. This is almost certainly what native + RTDM + select() is seeing too. My mqueues-only setup probably works because mq.c only uses pse51_current_thread() in the mq_notify function. I'm guessing that mq_notify would also not work in combination with the native skin.

I had two options for fixing this: add a xnselector to the native task struct or to the nucleus xnthread_t. I chose the latter, such that every skin can use select() + RTDM and migrate gradually to the RTDM and/or Posix skin. I needed to free the xnselector structure in xnpod_delete_thread(); I chose a spot, but it causes a segfault in my native thread (which did the select) during program cleanup. Any advice?

Also, maybe we should separate select() from the posix skin and put it in a separate place (in RTDM as rtdm_select()?), such that we can start building around it (posix just forwards to rtdm_select() then).

A second patch was necessary to return the timeout case properly to userspace (independent of the first patch). Tested with native + posix loaded and mq.
If you never quit your application, this works :-) (maybe we'd better discuss this further on xenomai-core)

Ok. Got it now. My idea was that the nucleus service xnselect could be used to implement select-like services which would have different semantics depending on the skins. So, the select service with posix semantics was reserved to posix skin threads.

Yes. The segfaults I'm seeing are not related to the cleanup of my xnselector struct in xnpod_delete_thread, because removing the cleanup code still leads to the segfault. Probably Posix does something special to let the thread leave select() earlier.

To know whether the bug comes from your code or from an unseen bug in the xnselect implementation (there is a suspicious access to the xnselector structure when waking up), could you try the same test with the original support, simply using posix skin threads?

Will do next week.

Other than that, the support is really tied to the posix skin: this version of select will only accept file descriptors which were returned by the posix skin or the rtdm skin. So, I am afraid making it a generic service is a bit hard.

I don't have as clear a view as you have, but clearly, being able to use select() on the rtdm user API while using any other skin than Posix looks like a big plus to me. I thought this was also in line with what Philippe was thinking of, i.e. to center our file descriptors around rtdm. Moving select() to rtdm, and having the Posix skin as the first 'compatible with rtdm' skin, changes the perspective for the better imho. I'm not looking to make select() generic for all skins, but to adapt the existing skins to opt in to using 'rtdm_select()', which is something we can merge in gradually. Do you think this is feasible/desired?

Peter
[Xenomai-core] select: native tasks with posix skin mqueues
Hi, I'm creating my RT threads using the native API and I'm creating mqueues, wrapped to the pthread_rt library. I can read and write the mqueue (and it goes through Xenomai), but when I select() on a receiving mqd_t, the select() call returns that there is data available on the mq (it fills in the FD_SET), but keeps doing so even when it's empty (the select() is in a loop). Also, it's mode-switching like nuts.

I found out that __wrap_select is correctly called, but returns -EPERM. Kernel sources indicate that this is caused by pse51_current_thread(), alias thread2pthread(), returning null. Since EPERM is returned to userspace, __real_select is called from user space, causing the mode switches and bad behaviour. This is almost certainly what native + RTDM + select() is seeing too. My mqueues-only setup probably works because mq.c only uses pse51_current_thread() in the mq_notify function. I'm guessing that mq_notify would also not work in combination with the native skin.

I had two options for fixing this: add a xnselector to the native task struct or to the nucleus xnthread_t. I chose the latter, such that every skin can use select() + RTDM and migrate gradually to the RTDM and/or Posix skin. I needed to free the xnselector structure in xnpod_delete_thread(); I chose a spot, but it causes a segfault in my native thread (which did the select) during program cleanup. Any advice?

Also, maybe we should separate select() from the posix skin and put it in a separate place (in RTDM as rtdm_select()?), such that we can start building around it (posix just forwards to rtdm_select() then).

A second patch was necessary to return the timeout case properly to userspace (independent of the first patch). Tested with native + posix loaded and mq.

If you never quit your application, this works :-) (maybe we'd better discuss this further on xenomai-core)

Thanks for the wonderful XUM-2009 experience btw !
Peter

From 0380298181f2926e6abe05e6b3d0b02389892a7c Mon Sep 17 00:00:00 2001
From: Peter Soetens <pe...@thesourceworks.com>
Date: Thu, 1 Oct 2009 15:57:54 +0200
Subject: [PATCH] Move posix selector in nucleus for every skin to use.

This patch makes the select implementation in syscall.c independent of
the posix skin.
---
 include/nucleus/thread.h   |    3 +++
 ksrc/nucleus/pod.c         |    6 ++++++
 ksrc/nucleus/thread.c      |    2 ++
 ksrc/skins/posix/syscall.c |    6 +++---
 ksrc/skins/posix/thread.c  |    7 -------
 ksrc/skins/posix/thread.h  |    4 ----
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/nucleus/thread.h b/include/nucleus/thread.h
index 3eefb53..88a14cc 100644
--- a/include/nucleus/thread.h
+++ b/include/nucleus/thread.h
@@ -142,6 +142,7 @@ struct xnthread;
 struct xnsched;
 struct xnsynch;
 struct xnrpi;
+struct xnselector;

 typedef struct xnthrops {
@@ -208,6 +209,8 @@ typedef struct xnthread {
 		xnstat_exectime_t lastperiod;	/* Interval marker for execution time reports */
 	} stat;

+	struct xnselector *selector;	/* For select. */
+
 	int errcode;		/* Local errno */

 	xnasr_t asr;		/* Asynchronous service routine */

diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c
index 9348ce1..2dd08ca 100644
--- a/ksrc/nucleus/pod.c
+++ b/ksrc/nucleus/pod.c
@@ -41,6 +41,7 @@
 #include <nucleus/registry.h>
 #include <nucleus/module.h>
 #include <nucleus/stat.h>
+#include <nucleus/select.h>
 #include <asm/xenomai/bits/pod.h>	/* debug support */
@@ -1204,6 +1205,11 @@ void xnpod_delete_thread(xnthread_t *thread)
 	xntimer_destroy(&thread->rtimer);
 	xntimer_destroy(&thread->ptimer);

+	if (thread->selector) {
+		xnselector_destroy(thread->selector);
+		thread->selector = NULL;
+	}
+
 	if (xnthread_test_state(thread, XNPEND))
 		xnsynch_forget_sleeper(thread);

diff --git a/ksrc/nucleus/thread.c b/ksrc/nucleus/thread.c
index 5fceaec..ed9722e 100644
--- a/ksrc/nucleus/thread.c
+++ b/ksrc/nucleus/thread.c
@@ -95,6 +95,8 @@ int xnthread_init(xnthread_t *thread,
 	thread->asr = XNTHREAD_INVALID_ASR;
 	thread->asrlevel = 0;

+	thread->selector = NULL;
+
 	thread->iprio = prio;
 	thread->bprio = prio;
 	thread->cprio = prio;

diff --git a/ksrc/skins/posix/syscall.c b/ksrc/skins/posix/syscall.c
index d936f21..f4afd29 100644
--- a/ksrc/skins/posix/syscall.c
+++ b/ksrc/skins/posix/syscall.c
@@ -2382,12 +2382,12 @@ static int __select(struct task_struct *curr, struct pt_regs *regs)
 	xntmode_t mode = XN_RELATIVE;
 	struct xnselector *selector;
 	struct timeval tv;
-	pthread_t thread;
+	xnthread_t* thread;
 	int i, err, nfds;
 	size_t fds_size;

-	thread = pse51_current_thread();
-	if (!thread)
+	thread = xnpod_current_thread();
+	if ( !thread )
 		return -EPERM;

 	if (__xn_reg_arg5(regs)) {

diff --git a/ksrc/skins/posix/thread.c b/ksrc/skins/posix/thread.c
index ad4aaa5..6e89625 100644
--- a/ksrc/skins/posix/thread.c
+++ b/ksrc/skins/posix/thread.c
@@ -78,12 +78,6 @@ static void thread_delete_hook(xnthread_t *xnthread)
 	pse51_mark_deleted(thread);
 	pse51_signal_cleanup_thread(thread
Re: [Xenomai-core] RFC: 2.5 todo list.
On Tue, Sep 29, 2009 at 19:31, Gilles Chanteperdrix gilles.chanteperd...@xenomai.org wrote:

Hi guys, full of energy after this tremendous first XUM, I would like to start a discussion about what people would like to see in the 2.5 branch.

So if we answer positively, we'll delay the release? I'd rather get 2.5 out, and develop any new stuff on 2.6. I would also expect that this list (or part of it) goes to xenomai-help too.

Here is a first list, please feel free to criticize it:

- signals in primary domain (something that we almost forgot)

I refrain from using signals in my apps. They only cause disaster when using 3rd party libraries. Only Ctrl-C (quit) and debugger signals are used, and a switch from primary to secondary is perfectly acceptable in these two cases.

- xnsynch_acquire using atomic_cmpxchg unconditionally (no #ifdefs)

This is too core for me.

- statistics of all mapped named heaps in /proc/xenomai/heap

We don't use heaps, since we do everything in user space (and I had the impression that the heap was for kernel-to-user transfers).

- unwrapped access to user-space posix skin methods

I wouldn't know why I need this. Do we then link with libpthread instead of libpthread_rt?

- fast semaphores in user-space

I don't know why I wouldn't need this.

- syscall-less select?

Since a syscall is not per se bad (?), I also don't see what there is to win here.

Actually, there are already a lot of things. So, what do you think?

My utmost concern is stability and, to a lesser extent, performance. I regard every feature change as changing those two criteria for the worse (unless it's a feature that fixes a bug). The kernel and libc are already moving targets which influence Xenomai, so we already have to cope with more changes than we want to.

Peter (non-authoritative, non-developer)
Re: [Xenomai-core] RFC: 2.5 todo list.
On Wed, Sep 30, 2009 at 16:27, Gilles Chanteperdrix gilles.chanteperd...@xenomai.org wrote:
Peter Soetens wrote:
On Tue, Sep 29, 2009 at 19:31, Gilles Chanteperdrix gilles.chanteperd...@xenomai.org wrote:

Hi guys, full of energy after this tremendous first XUM, I would like to start a discussion about what people would like to see in the 2.5 branch.

So if we answer positively, we'll delay the release? I'd rather get 2.5 out, and develop any new stuff on 2.6. I would also expect that this list (or part of it) goes to xenomai-help too.

The facts are:
- our release cycle is long;
- we want to keep the ABI stable for each branch.
So, anything that we want soon and that breaks the ABI should be done in the 2.5 branch, otherwise it will have to wait for 2.6.

Ok, but there is stuff in 2.5 I want soon too, which you would be delaying.

Here is a first list, please feel free to criticize it:

- signals in primary domain (something that we almost forgot)

I refrain from using signals in my apps. They only cause disaster when using 3rd party libraries. Only Ctrl-C (quit) and debugger signals are used, and a switch from primary to secondary is perfectly acceptable in these two cases.

Yes, "signals" is a bit of a misnomer; what we actually want is for the kernel to be able to cause the execution of an asynchronous callback in user-space. For the native skin, it would be for the implementation of some hooks. For the posix skin, it would be for the implementation of signals. The implementation of posix timers is based on signals (except for SIGEV_THREAD, but who uses SIGEV_THREAD in an rt app...), and having them cause a switch to secondary mode makes them unusable for practical purposes. So, with the current version of the Xenomai posix skin, you have to implement your own timer method, having for instance a thread which nanosleep()s until the next timer expiry and then executes the callback.

Ok, but I don't use posix timers for the reasons above.
I use clock_nanosleep instead, which offers the same functionality.

- xnsynch_acquire using atomic_cmpxchg unconditionally (no #ifdefs)

This is too core for me.

- statistics of all mapped named heaps in /proc/xenomai/heap

We don't use heaps, since we do everything in user space (and I had the impression that the heap was for kernel-to-user transfers).

- unwrapped access to user-space posix skin methods

I wouldn't know why I need this. Do we then link with libpthread instead of libpthread_rt?

Well, the wrap thing is a bit cumbersome. And having the calls be named with exactly the posix name is useful only if you intend to compile exactly the same code for xenomai and other posix systems. Otherwise, you could decide to use a little prefix or suffix for each posix skin service, and avoid the wrapping clumsiness.

So like we did in the RTAI days. Maybe we can use rt_ by (safe!) default and allow a #define in case the user wants to use the wrapping and is aware that he needs to use the wrapping during linking.

- fast semaphores in user-space

I don't know why I wouldn't need this.

- syscall-less select?

Since a syscall is not per se bad (?), I also don't see what there is to win here.

Syscalls are expensive (which is why we do syscall-less mutexes, for instance). The idea would be to put the bitfield with the ready file descriptors in a shared heap, to avoid going for the syscall if fds are already ready when entering select(). The scenario where we would gain is on a loaded system, which is exactly when we want to avoid useless syscalls.

Then I'm tempted to be in favour, although I'd like to confirm first that select() is not broken as it is now. Are syscalls expensive because I'm running Xenomai, or is this the case in vanilla Linux too? Do we try to be better than Linux (until they use a similar 'fix' in libc)?

Actually, there are already a lot of things. So, what do you think?

My utmost concern is stability and, to a lesser extent, performance. I regard every feature change as changing those two criteria for the worse (unless it's a feature that fixes a bug).

Well... I disagree. Even when fixing bugs we can introduce other bugs. What matters, if you aim for stability and performance, is improving the tests, not avoiding modifications.

You got me. But until the tests are improved, I beg you to be careful ;-)

Peter
Re: [Xenomai-core] Comedi drivers in Xenomai porting/integration status ?
On Tuesday 17 February 2009 00:00:00 Alexis Berlemont wrote:

Hi, Hello all! I would like to know what is the current status of the Comedi port to Xenomai. Should all the specific Comedi drivers (ni_pcimio, ni_mite) be available for testing (by me or someone with a supported DAQ card) and (if ok) for further integration?

I am still working on that port. It is a long work and I am wondering at each line whether I should rewrite any part of code which does not comply with common coding constraints. Unfortunately, I currently do not have a lot of spare time. Anyway, most of the ni subdevice drivers have been ported (mite, tio, mio, 8255). I am trying to finalize the global driver port.

By the way, in the middle of January, I noticed that the legacy Comedi branch found its way into the mainline (through the staging tree). I do not know what the future of such a package in mainstream will be. I assume the main goal is the definition of a global framework for acquisition boards, like V4L2 is for video cards.

I'm not sure I understand where this is going. We did a review of the Xenomai/Comedi code integration a few weeks ago. These are the facts we observed:

* The Xenomai/Comedi port breaks the complete Comedi API, user space *and* kernel space. (We thought/assumed that only the user space interface would go over RTDM and that once that was done, the kernel modules could be almost copy/pasted into the new framework.)
* The Xenomai/Comedi port is not supported by 'upstream' (what you call 'legacy'). It's not discussed on their ML, they don't send in patches or feedback.
* There aren't any (?) device drivers ported to the Xenomai/Comedi project (public trunk).

This is what we concluded:

* Xenomai/Comedi has no future as long as it ignores (or is ignored by) upstream. Even after a port of a device driver, pulling fixes from upstream will be hard due to the changed kernel API.
* As GKH puts it: all device drivers belong in the Linux kernel.
Upstream is doing this right now, which makes acceptance of Xenomai/Comedi unlikely, which makes its life expectancy uncertain.
* We're now actually considering Preempt/RT as the kernel to use in combination with the original Comedi. We might be stupid, but then again, it might just work.
* We believe the name Xenomai/Comedi is strongly misleading. It suggests a painless transition path, but it's a completely different software project, with different interfaces and different maintainer(s?).

Sorry for flaming, and please correct me where I'm wrong.

Peter
--
Peter Soetens -- FMTC -- http://www.fmtc.be
Re: [Xenomai-core] Comedi drivers in Xenomai porting/integration status ?
On Tuesday 17 February 2009 10:41:10 Jan Kiszka wrote:
Peter Soetens wrote:

These are the facts we observed:
* The Xenomai/Comedi port breaks the complete Comedi API, user space *and* kernel space. (We thought/assumed that only the user space interface would go over RTDM and that once that was done, the kernel modules could be almost copy/pasted into the new framework.)

Maybe you have a list of the major differences. Then please share it so that the motivation can be discussed here and maybe clarified (it's a blurred topic for me as well).

Damn. I should have posted back then :-) Our main lead was the Doxygen pages, from which we went on to see how things were done in code. Unfortunately (?), I'm not a Comedi developer; I twiddled only with one Comedi driver. Once. I can't really compare them to the bone. But what was clear immediately was that both user API and kernel API were different.

The user ('Library') API was cleaned up and streamlined; we could live with that for new applications. I'm sure there are still issues, but they'll only come up once people start using this branch. I like the separation between a low level 'instruction' API and a high level 'function' API, something upstream comedi mixes too much.

For the kernel ('Driver') API, a new data transfer mechanism is in place, which requires the 'porting' of all drivers. I genuinely can't estimate how drastically this changes existing drivers, but the API is quite large and works with the 'inversion of control' paradigm: each driver must implement a series of hooks, and the Comedi/RTDM framework will call these when necessary.

Another fact I shouldn't have omitted:
* The Xenomai/Comedi layer is very well documented and allows anyone to learn from it, even the upstream maintainers.
* Seen from my little device driver knowledge, the technical implementation looks ok for synchronous/asynchronous reading and writing. Memory mapped IO is not available, it seems, and the classical comedi_config, inp, outp, ... family isn't complete yet either.

* The Xenomai/Comedi port is not supported by 'upstream' (what you call 'legacy'). It's not discussed on their ML, they don't send in patches or feedback.
* There aren't any (?) device drivers ported to the Xenomai/Comedi project (public trunk).

This is what we concluded:
* Xenomai/Comedi has no future as long as it ignores (or is ignored by) upstream. Even after a port of a device driver, pulling fixes from upstream will be hard due to the changed kernel API.

IMHO, that heavily depends on the use cases both projects are able to cover. If there are major RT design issues upstream that this variant solves, then people may be willing to live with the differences (including a smaller driver set downstream). It wouldn't be the first time.

I see the short term profits as well. I'm fearing a maintenance nightmare in the long term.

* As GKH puts it: all device drivers belong in the Linux kernel. Upstream is doing this right now, which makes acceptance of Xenomai/Comedi unlikely, which makes its life expectancy uncertain.

I don't think anyone expects to see the RT drivers here and around in mainline Linux in the foreseeable future. That's not the primary goal. At the same time, it's still unclear how serious mainline is about RT redesigns of existing drivers or frameworks. And I recall from earlier threads on the comedi list that at least the current comedi maintainers consider RT use at best a niche and not an important scenario.

That's what I recall as well. But I was wondering how the RT issue was overtaken by reality: running plain comedi in a preemptible kernel.

* We're now actually considering Preempt/RT as the kernel to use in combination with the original Comedi. We might be stupid, but then again, it might just work.
* We believe the name Xenomai/Comedi is strongly misleading. It suggests a painless transition path, but it's a completely different software project, with different interfaces and different maintainer(s?).

I agree.
If reasons for significant differences in the user API remain, then a different name would be appropriate, too. Looks like this is not Socket-CAN vs. RT-Socket-CAN here?

Most functions and structs have been renamed and have modified arguments, although there could be a 1:1 mapping (1 old function - 1 new function). I believe Alexis can defend his design better than anyone else, and it's not the design I wanted to tackle. It's how he plans to maintain it.

Peter
--
Peter Soetens -- FMTC -- http://www.fmtc.be
Re: [Xenomai-core] [FIX] Summary: Xenomai 2.3.2 and 2.4 lock-ups and OOPSes
On Saturday 15 September 2007 20:52:59 Philippe Gerum wrote:

On Fri, 2007-09-07 at 11:27 +0200, Peter Soetens wrote:

Just in case you hooked off the long discussion about the issues we found from Xenomai 2.3.2 on:

o We are using the xeno_native skin, create Xeno tasks and semaphores, but have strong indications that the crashes are caused by the memory allocation scheme of Xenomai in combination with task creation/deletion.
o We found two ways to break Xenomai, one causing a 'Killed' (rt_task_delete) and one causing an OOPS (rt_task_join).
o They happen on 2.6.20 and 2.6.22 kernels.
o On the 2.3 branch, r2429 works, r2433 causes the faults. The patch is small, and in the ChangeLog:

Please try this patch against v2.3.x. A double free issue on a task TCB already scheduled for memory release was causing all sorts of trouble, basically trashing the system heap afterwards.

Thanks, we'll try, test and report ASAP (= somewhere this week).

Peter
--
Peter Soetens -- FMTC -- http://www.fmtc.be
[Xenomai-core] Summary: Xenomai 2.3.2 and 2.4 lock-ups and OOPSes
Just in case you hooked off the long discussion about the issues we found from Xenomai 2.3.2 on:

o We are using the xeno_native skin, create Xeno tasks and semaphores, but have strong indications that the crashes are caused by the memory allocation scheme of Xenomai in combination with task creation/deletion.
o We found two ways to break Xenomai, one causing a 'Killed' (rt_task_delete) and one causing an OOPS (rt_task_join).
o They happen on 2.6.20 and 2.6.22 kernels.
o On the 2.3 branch, r2429 works, r2433 causes the faults. The patch is small, and in the ChangeLog:

2007-05-11 Philippe Gerum [EMAIL PROTECTED]

	* include/nucleus/heap.h (xnfreesafe): Use xnpod_current_p() when checking for deferral.
	* include/nucleus/pod.h (xnpod_current_p): Give exec mode awareness to this predicate, checking for primary/secondary mode of shadows.

2007-05-11 Gilles Chanteperdrix [EMAIL PROTECTED]

	* ksrc/skins: Always defer thread memory release in deletion hook by calling xnheap_schedule_free() instead of xnfreesafe().

o We reverted this patch on HEAD of the 2.3 branch, but got -ENOMEM errors during Xenomai resource allocations, indicating that later changes depend on this patch. So we use a clean HEAD again further on to find the causes:
o A first test (in Orocos) creates one thread and two semaphores, lets it wait on them, and cleans up the thread.
o During rt_task_delete, our program gets 'Killed' (without a joinable thread), hence a user space problem. However, gdb is of no use; all thread info is lost.
o We made the thread joinable (T_JOINABLE), and then joined.
This bypassed the Kill on the first run but causes an OOPS the second time the same application is started:

Oops: [#1] PREEMPT
CPU:    0
EIP:    0060:[fef4a1f3]    Not tainted VLI
EFLAGS: 00010002   (2.6.20.9-ipipe-1.8-08 #2)
EIP is at get_free_range+0x56/0x160 [xeno_nucleus]
eax: f3a81d01   ebx: 0200   ecx: 0101   edx: fef62b00
esi: 0101   edi: 0200   ebp: f0f33ec4   esp: f0f33e98
ds: 007b   es: 007b   ss: 0068
Process NonPeriodicActi (pid: 3020, ti=f0f32000 task=f7ce61b0 task.ti=f0f32000)
Stack: 0600 fef62b80 f3a81b24 f3a8 fef62ba4 f3a80720 0101 0600
       f0f33f18 f7ce6360 f0f33ee4 fef4a948 fef62b80 f0f33f08 0400
       f0f33f18 f7ce6360 f0f33f50 ff13e1de 0282 0282 bfab6350
Call Trace:
 [c0103ffb] show_trace_log_lvl+0x1f/0x35
 [c01040bb] show_stack_log_lvl+0xaa/0xcf
 [c01042a9] show_registers+0x1c9/0x392
 [c0104588] die+0x116/0x245
 [c0110fca] do_page_fault+0x287/0x61d
 [c010ea35] __ipipe_handle_exception+0x63/0x136
 [c029466d] error_code+0x79/0x88
 [fef4a948] xnheap_alloc+0x15b/0x17d [xeno_nucleus]
 [ff13e1de] __rt_task_create+0xe0/0x171 [xeno_native]
 [fef5655f] losyscall_event+0xaf/0x170 [xeno_nucleus]
 [c0138804] __ipipe_dispatch_event+0xc0/0x1da
 [c010e90b] __ipipe_syscall_root+0x43/0x10a
 [c0102e79] system_call+0x29/0x41
===
Code: 74 61 85 c0 74 5d c7 45 e0 00 00 00 00 8b 4d e4 8b 49 10 89 4d ec 85 c9 74 38 8b 45 dc 8b 78 0c 89 4d f0 89 ce 89 fb eb 02 89 ce 8b 09 8d 04 3e 39 c1 0f 94 c2 3b 5d d8 0f 92 c0 01 fb 84 c2 75
EIP: [fef4a1f3] get_free_range+0x56/0x160 [xeno_nucleus] SS:ESP 0068:f0f33e98
[hard lockup]

o Our application is also mixing the original RT_TASK struct and the return value of the rt_task_self() function call when calling rt_ functions. Switching between the two influences the crashing behaviour as well; not further investigated.
o This was reproduced on two different systems (one with the SMI workaround working).

You have the patch that broke things; I hope this gives you a hint on what causes our crashes. Know that Orocos as-is has worked with Xenomai from Xenomai 2.0 on.
Peter
--
Peter Soetens -- FMTC -- http://www.fmtc.be