Leopold Palomo-Avellaneda wrote:
> On Thursday 03 May 2007 22:04, Jan Kiszka wrote:
>> Leopold Palomo Avellaneda wrote:
>>> On Thursday 03 May 2007 20:44, Jan Kiszka wrote:
>>>> [Oops, almost overlooked.]
>>>>
>>>> Leopold Palomo-Avellaneda wrote:
>>>>> Hi,
>>>>>
>>>>> some days ago I sent some messages about the rtnet examples
>>>>> using rtai. I compiled and installed rtnet with some problems, but I
>>>>> think it is working.
>>>>>
>>>>> I modified the init script to add a notdma option because it's the
>>>>> only configuration that I need. I could never run the rtai or
>>>>> generic examples until yesterday. I modified the examples, created
>>>>> new examples, and they never worked. I have been more or less stuck
>>>>> on this for some days.
>>>>>
>>>>> Yesterday, in a moment of desperation, I ran the rtai and generic
>>>>> examples and wow!!! it worked!!! However, I was a bit worried because
>>>>> I thought that I had been wasting my time and the rtnet list's with my
>>>>> stupid questions. Also, I modified the example to send 100 messages
>>>>> and it ran.
>>>>>
>>>>> Today I tried to run the examples and again it failed. I tried to do
>>>>> a rtnet stop and rtnet notdma again, but it failed to unload the
>>>>> rtipv4 module. So, I needed to reboot the box. After rebooting, I could
>>>>> run the examples again, but only once. The next time it failed again,
>>>>> and now I checked the /var/log/kern messages and found this:
>>>>>
>>>>>
>>>>> May  3 10:45:03 ulises kernel: LXRT CHANGED MODE (SYSCALL), PID = 4304, SYSCALL = 4.
>>>>> May  3 10:48:30 ulises kernel: Assertion failed! /root/rtnet-0.9.8/stack/ipv4/udp.c:rt_udp_recvmsg:398 skb != NULL
>>>>> May  3 10:48:30 ulises kernel: LXRT releases PID 4304 (ID: simpleserver).
>>>>> May  3 10:48:50 ulises kernel: Assertion failed! /root/rtnet-0.9.8/stack/ipv4/udp.c:rt_udp_recvmsg:398 skb != NULL
>>>>> May  3 10:48:50 ulises kernel: LXRT releases PID 4307 (ID: simpleserver).
>>>> Hmm. What RTAI version? I vaguely recall some bug in RTAI's RTDM layer
>>>> (3.4?) that triggered similar messages. It would also explain the
>>>> non-deterministic behaviour you observed (internal event signalling was
>>>> broken).
>>> rtai 3.4. ummmm ugly. Also, I found that I can reproduce it if the program
>>> fails, or if I kill it before it finishes the rt commands....
>> Vanilla RTAI 3.4* is not usable with RTnet, see below. Pick something
>> more recent.
>>
>>>>> Also, I cannot do a rtnet stop because the script hangs while removing
>>>>> the rtipv4 module.
>>>>>
>>>>> So, is this normal? Is it something that I have done wrong? How can I
>>>>> know what is happening?
>>>> Core assertion failures are never normal and indicate bugs underneath.
>>>> Unless you are running an older RTAI version, we would have to dig
>>>> deeper.
>>> Ok, next week I can prepare some tests and an example to send you.
>> Bug hunting only makes sense if you update or patch your RTAI first.
>> Here is the related thread I recalled:
>>
>> http://thread.gmane.org/gmane.linux.real-time.rtnet.user/2105
>>
> 
> Well, following your recommendations I have downloaded the latest stable
> version of rtai, 3.5. It compiled and installed without any significant
> problem. The problem persists.
> 
> The idea is as follows:
> 
> I run a program that opens an rt socket. For whatever reason, mainly because
> the programmer is a rookie (like me), the program crashes, or is simply
> killed by the user (ctrl + c).
> 
> Then I got a message:
> 
> kernel: Assertion failed! /root/rtnet-0.9.8/stack/ipv4/udp.c:rt_udp_recvmsg:398 skb
> kernel: LXRT releases PID 12483 (ID: simpleserver).

Are you sure you are _actually_ running the updated RTAI binaries? I'm
asking for a good reason: I have spent hours debugging ghost issues
caused by inconsistent builds...

Otherwise, this assertion signals that we either have a spurious wakeup
of the receiver (a wakeup although no packet was queued) or that the
queue is corrupted. The former points at RTAI (but I'm fairly sure that
issue was fixed); the latter was once a problem with RTnet (in 0.9.3,
and everyone has been happy with it since then).
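
To make the first case concrete, here is a minimal userspace sketch of the
invariant behind the "skb != NULL" check. All names in it (struct pkt,
struct rx_queue, rx_enqueue, rx_receive, rx_spurious_wakeup) are
hypothetical stand-ins, not RTnet's or RTAI's real code; it only assumes
that each queued packet produces exactly one wakeup of the receiver. It
does not model the corrupted-queue case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are not RTnet's real types. */
struct pkt {
    struct pkt *next;
    int id;
};

struct rx_queue {
    struct pkt *head, *tail;
    int wakeups;            /* models the receiver's pending wakeup count */
};

enum rx_status { RX_OK, RX_WOULD_BLOCK, RX_ASSERT_SKB_NULL };

/* Producer side: queue the packet first, then signal exactly one wakeup. */
static void rx_enqueue(struct rx_queue *q, struct pkt *p)
{
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    q->wakeups++;
}

/* Models broken event signalling: a wakeup with no packet behind it. */
static void rx_spurious_wakeup(struct rx_queue *q)
{
    q->wakeups++;
}

/* Receiver side: consume one wakeup, then expect a packet to be queued. */
static enum rx_status rx_receive(struct rx_queue *q, struct pkt **out)
{
    if (q->wakeups == 0)
        return RX_WOULD_BLOCK;      /* real code would sleep here */
    q->wakeups--;

    *out = q->head;
    if (*out == NULL)
        return RX_ASSERT_SKB_NULL;  /* the "skb != NULL" case */

    q->head = (*out)->next;
    if (q->head == NULL)
        q->tail = NULL;
    return RX_OK;
}

/* Usage: one real packet is delivered, one spurious wakeup trips the check. */
static void rx_demo(void)
{
    struct rx_queue q = { NULL, NULL, 0 };
    struct pkt a = { NULL, 1 };
    struct pkt *got = NULL;

    rx_enqueue(&q, &a);
    assert(rx_receive(&q, &got) == RX_OK && got == &a);
    assert(rx_receive(&q, &got) == RX_WOULD_BLOCK);

    rx_spurious_wakeup(&q);
    assert(rx_receive(&q, &got) == RX_ASSERT_SKB_NULL);
}
```

In this model the check can only fire when the wakeup count and the queue
contents disagree, which is why a failure points at the signalling layer
or at the queue itself rather than at the application.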

> 
> I can run the program again, but I need to change the port because, although
> the program thinks it can use the port, it never receives any data.

After that assertion, anything can happen.

> 
> Another problem is that I cannot unload the rtnet modules, because:
> 
> RTcfg: unloaded
> removing loopback...
> RTnet: unregistered rtlo
> RTnet: unregistered rteth0
> RTDM: RTDM: device  still in use - waiting for release...
> 
> So, is this a rtnet bug, a rtai bug, or simply a user bug that corrupts the
> rtai/rtnet modules?
> 

If it turns out to be a persistent issue with RTAI 3.5, I would suggest
capturing a trace of the events leading up to the assertion:

Enable CONFIG_IPIPE_TRACE (+TRACE_MCOUNT), add
"if (!skb) ipipe_trace_freeze(0);" before the assertion line, set
/proc/ipipe/trace/back_trace_points to a sufficiently large value, and
then let it run. The result under /proc/ipipe/trace/frozen would allow a
first look back in history (kernel function calls) and may then inspire
further instrumentation ideas to track the issue down.
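
The ordering of that instrumentation (freeze the trace first, then let the
existing assertion fire) can be sketched in plain userspace C. This is only
an illustration under stated assumptions: ipipe_trace_freeze() is replaced
by a stub that counts calls, and struct skb_stub and check_dequeued_skb()
are hypothetical names that do not exist in RTnet. In a real kernel build
the single "if (!skb) ipipe_trace_freeze(0);" line would go directly in
front of the assertion in udp.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stub, for illustration only. On an I-pipe kernel built with
 * CONFIG_IPIPE_TRACE the real ipipe_trace_freeze() is provided by the
 * tracer; this stand-in merely counts calls and stores the value.
 */
static int trace_freeze_calls;
static unsigned long trace_freeze_value;

static void ipipe_trace_freeze(unsigned long v)
{
    trace_freeze_calls++;
    trace_freeze_value = v;
}

/* Hypothetical stand-in for the socket buffer; only NULL-ness matters. */
struct skb_stub {
    int len;
};

/*
 * Mirrors the suggested patch: freeze the trace first, then run the
 * existing check. Returns 1 if the check passes, 0 if it would fire.
 */
static int check_dequeued_skb(const struct skb_stub *skb)
{
    if (!skb)
        ipipe_trace_freeze(0);

    if (skb == NULL) {
        fprintf(stderr, "Assertion failed! skb != NULL\n");
        return 0;
    }
    return 1;
}

/* Usage: a valid buffer leaves the trace running, NULL freezes it. */
static void freeze_demo(void)
{
    struct skb_stub pkt = { 64 };

    assert(check_dequeued_skb(&pkt) == 1 && trace_freeze_calls == 0);
    assert(check_dequeued_skb(NULL) == 0 && trace_freeze_calls == 1);
    assert(trace_freeze_value == 0);
}
```

Freezing before the assertion message is printed keeps the printing itself
out of the captured history, so the frozen trace ends at the code path that
led to the NULL buffer.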

[/me looking at RTAI patches] Hmm, that may not work, they still use an
outdated and broken (w.r.t. the tracer) I-pipe patch variant... :(

Jan
