Leopold Palomo-Avellaneda wrote:
> On Thursday 03 May 2007 22:04, Jan Kiszka wrote:
>> Leopold Palomo Avellaneda wrote:
>>> On Thursday 03 May 2007 20:44, Jan Kiszka wrote:
>>>> [Oops, almost overlooked.]
>>>>
>>>> Leopold Palomo-Avellaneda wrote:
>>>>> Hi,
>>>>>
>>>>> some days ago I sent some messages about the rtnet examples
>>>>> using rtai. I compiled and installed rtnet with some problems, but I
>>>>> think that it is working.
>>>>>
>>>>> I modified the init script to have a notdma option because it's the
>>>>> only configuration that I need. I could never run the rtai or
>>>>> generic examples until yesterday. I modified the examples, created
>>>>> new examples, and they never worked. I have been more or less stuck
>>>>> on this for some days.
>>>>>
>>>>> Yesterday, in a desperate moment, I ran the rtai and generic
>>>>> examples and wow!!! it worked!!! However, I was a bit worried because
>>>>> I thought that I had been wasting my time and the rtnet list's with my
>>>>> stupid questions. Also, I modified the example to send 100 messages
>>>>> and it ran.
>>>>>
>>>>> Today I tried to run the examples and again it failed. I tried to do
>>>>> a rtnet stop and rtnet notdma again, but it failed to unload the
>>>>> rtipv4 module. So, I needed to reboot the box. After rebooting, I
>>>>> could run the examples again, but only once. The next time it failed
>>>>> again, and now I checked the /var/log/kern messages and I found this:
>>>>>
>>>>>
>>>>> May 3 10:45:03 ulises kernel: LXRT CHANGED MODE (SYSCALL), PID = 4304, SYSCALL = 4.
>>>>> May 3 10:48:30 ulises kernel: Assertion failed! /root/rtnet-0.9.8/stack/ipv4/udp.c:rt_udp_recvmsg:398 skb != NULL
>>>>> May 3 10:48:30 ulises kernel: LXRT releases PID 4304 (ID: simpleserver).
>>>>> May 3 10:48:50 ulises kernel: Assertion failed! /root/rtnet-0.9.8/stack/ipv4/udp.c:rt_udp_recvmsg:398 skb != NULL
>>>>> May 3 10:48:50 ulises kernel: LXRT releases PID 4307 (ID: simpleserver).
>>>> Hmm. What RTAI version? I vaguely recall some bug in RTAI's RTDM layer
>>>> (3.4?) that triggered similar messages. And it would also explain the
>>>> non-deterministic behaviour you observed (internal event signalling was
>>>> broken).
>>> rtai 3.4. ummmm ugly. Also I found that I can reproduce it if the program
>>> fails, or if I kill it before it finishes the rt commands....
>> Vanilla RTAI 3.4* is not usable with RTnet, see below. Pick something
>> more recent.
>>
>>>>> Also, I cannot do a rtnet stop because the script hangs while removing
>>>>> the rtipv4 module.
>>>>>
>>>>> So, is this normal? Is it something that I have done wrong? How can I
>>>>> know what is happening?
>>>> Core assertion failures are never normal and indicate bugs underneath.
>>>> Unless you are running an older RTAI version, we would have to dig
>>>> deeper.
>>> Ok, next week I can prepare some tests and an example to send you.
>> Bug hunting only makes sense if you update or patch your RTAI first.
>> Here is the related thread I recalled:
>>
>> http://thread.gmane.org/gmane.linux.real-time.rtnet.user/2105
>>
>
> Well, following your recommendations I have downloaded the latest stable
> version of rtai, 3.5. It compiled and installed without any significant
> problem. The problem persists.
>
> The scenario is the following:
>
> I run a program that opens an rt socket. For whatever reason, mainly
> because the programmer is a rookie (like me), the program crashes, or is
> simply killed by the user (Ctrl+C).
>
> Then I get a message:
>
> kernel: Assertion failed! /root/rtnet-0.9.8/stack/ipv4/udp.c:rt_udp_recvmsg:398 skb
> kernel: LXRT releases PID 12483 (ID: simpleserver).
Are you sure you are _actually_ running the updated RTAI binaries? I'm
asking for a good reason: I have spent hours debugging ghost issues that
were caused by inconsistent builds...

Otherwise, this assertion signals that we either have a spurious wakeup
of the receiver (a wakeup although no packet was queued) or that the queue
is corrupted. The former points at RTAI (but I'm fairly sure I remember
that issue being solved), the latter was once a problem with RTnet (0.9.3,
since then everyone is happy with it).

>
> I can run the program, but I need to change the port because, although the
> program thinks that it can use the port, it never receives any data.

After that assertion, anything can happen.

>
> Also, a problem is that I cannot unload the rtnet modules, because:
>
> RTcfg: unloaded
> removing loopback...
> RTnet: unregistered rtlo
> RTnet: unregistered rteth0
> RTDM: RTDM: device still in use - waiting for release...
>
> So, is this an rtnet bug, an rtai bug, or simply a user bug that corrupts
> the rtai/rtnet modules?
>

If it turns out to be a persistent issue with RTAI 3.5, I would suggest
capturing a trace of the events preceding the assertion: enable
CONFIG_IPIPE_TRACE (+TRACE_MCOUNT), add "if (!skb) ipipe_trace_freeze(0);"
before the assertion line, set a sufficiently large
/proc/ipipe/trace/back_trace_points, and then let it run. The result under
/proc/ipipe/trace/frozen would allow a first look back in history (kernel
function calls) and may then inspire further instrumentation ideas to
track the issue down.

[/me looking at RTAI patches]

Hmm, that may not work, they still use an outdated and broken (w.r.t. the
tracer) I-pipe patch variant... :(

Jan
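
PS: To illustrate the first failure mode named above (a receiver woken up although no packet was queued), here is a minimal user-space sketch. It is not RTnet or RTAI code: every name in it (rx_sock, rx_enqueue, rx_spurious_wakeup, rx_recv, the RX_* codes) is hypothetical, the "pending" counter merely stands in for the receiver's wakeup semaphore, and the "frozen" flag stands in for the effect of calling ipipe_trace_freeze() at the point where the invariant breaks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in RTnet or RTAI. */
enum { RX_WOULD_BLOCK = -1, RX_INVARIANT_BROKEN = -2 };

struct pkt {
    struct pkt *next;
    int id;             /* assumed non-negative so it cannot clash with RX_* */
};

struct rx_sock {
    struct pkt *head;   /* incoming packet queue */
    struct pkt *tail;
    int pending;        /* stand-in for the wakeup semaphore count */
    int frozen;         /* stand-in for "trace was frozen here" */
};

/* Normal path: queue one packet and signal exactly one wakeup. */
static void rx_enqueue(struct rx_sock *s, struct pkt *p)
{
    assert(p != NULL);
    p->next = NULL;
    if (s->tail)
        s->tail->next = p;
    else
        s->head = p;
    s->tail = p;
    s->pending++;
}

/* Faulty path: a wakeup is signalled although nothing was queued. */
static void rx_spurious_wakeup(struct rx_sock *s)
{
    s->pending++;
}

static struct pkt *rx_dequeue(struct rx_sock *s)
{
    struct pkt *p = s->head;

    if (p) {
        s->head = p->next;
        if (!s->head)
            s->tail = NULL;
    }
    return p;
}

/* Receiver: relies on the invariant "one wakeup == one queued packet". */
static int rx_recv(struct rx_sock *s)
{
    struct pkt *p;

    if (s->pending == 0)
        return RX_WOULD_BLOCK;      /* a real receiver would sleep here */
    s->pending--;

    p = rx_dequeue(s);
    if (!p) {
        s->frozen = 1;              /* where the trace freeze would go */
        return RX_INVARIANT_BROKEN; /* the "skb != NULL" situation */
    }
    return p->id;
}
```

Queuing one packet and receiving it returns that packet's id; calling rx_spurious_wakeup() afterwards makes the next rx_recv() report RX_INVARIANT_BROKEN and set the frozen flag, which is the point at which a frozen trace would show who issued the extra wakeup.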
_______________________________________________ RTnet-users mailing list RTnet-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rtnet-users