Hi Bill,
 
This is very interesting.  It appears that you are running dual links
between the two nodes and that the links are going down.
 
Just a quick question: have you tried running your application with only
a single link between the two nodes?  Does the same problem with the
name table occur?
 
Do I understand correctly that on node <1.1.3> you are loading your
application over the serial connection and on this node the TIPC link
goes down?  That is very odd.
 
In Wireshark, what you might see are some probe messages between <1.1.3>
and <1.1.2>.  I would expect messages in only one direction for a time
before one side gives up on the other.  I'll have to dig up the format
of a probe message later (have to run to a meeting now).
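
One rough way to look for that one-directional stretch, sketched in Python.
This assumes the capture has been exported to a time-sorted list of
(timestamp, source, destination) tuples; the function name and the sample
timestamps are made up for illustration and are not taken from the actual
recordings.

```python
def longest_one_way_run(packets):
    """Return (duration, src, dst) of the longest stretch of consecutive
    packets that all travel in the same direction.

    packets: list of (timestamp_seconds, src, dst), sorted by time.
    """
    best = (0.0, None, None)
    run_dir = None
    run_start = None
    for ts, src, dst in packets:
        if (src, dst) != run_dir:
            # Direction changed: start a new run at this packet.
            run_dir = (src, dst)
            run_start = ts
        duration = ts - run_start
        if duration > best[0]:
            best = (duration, src, dst)
    return best


# Made-up sample, NOT real capture data.
sample = [
    (0.0, "1.1.3", "1.1.2"),
    (0.1, "1.1.2", "1.1.3"),
    (0.5, "1.1.3", "1.1.2"),
    (1.0, "1.1.3", "1.1.2"),
    (1.5, "1.1.3", "1.1.2"),
    (1.6, "1.1.2", "1.1.3"),
]
print(longest_one_way_run(sample))
```

A long run in one direction shortly before the reset would point at the
side that stopped answering.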
 
Elmer
 

________________________________

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
Kinahan, William P SIK
Sent: Friday, August 31, 2007 5:01 PM
To: [email protected]
Cc: Kinahan, William P SIK; Fimbers, Kevin SIK; Malo, Randy SIK
Subject: [tipc-discussion] Dropped Name Table during link reset.



I'm experiencing some odd behavior using TIPC on VxWorks 6.3 and I'm
hoping someone spots something familiar or can provide me with some
debug advice.

I've got an application that uses TIPC to keep a data set synchronized
across multiple processors in an embedded avionics application. In
general, I've found TIPC quite reliable and I'm happy with the
performance. I've recently encountered the following problem:

- Node 1.1.2 is fully operational with my application running.
- Node 1.1.3 is running the shell and is in the process of being loaded
with my application.
- At or near the end of the load, the links reset for some unknown reason.

It appears that TIPC partially recovers and reestablishes the links. The
name table however is not reestablished on 1.1.3.

The following shell output shows the configuration before and after the
reset. 

-> 
-> 
-> tipcConfig "-nt" 
Type       Lower      Upper      Port Identity              Publication
-------------------------------- -------------------------- ------------------
0          16781314   16781314   <1.1.2:1086734332>         1086734333
           16781315   16781315   <1.1.3:1086734332>         1086734333 zone
1          1          1          <1.1.3:1086734333>         1086734334 node
3505       0          0          <1.1.2:1086734327>         1086734328
3506       0          0          <1.1.2:1086734324>         1086734325
value = 0 = 0x0 
-> tipcConfig "-nt" 
Type       Lower      Upper      Port Identity              Publication
-------------------------------- -------------------------- ------------------
0          16781315   16781315   <1.1.3:1086734332>         1086734333 zone
1          1          1          <1.1.3:1086734333>         1086734334 node
value = 0 = 0x0 
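
The Lower/Upper values on the type 0 rows are node addresses written as a
single 32-bit number. As a minimal sketch, assuming the usual TIPC <Z.C.N>
layout of 8 bits zone, 12 bits cluster and 12 bits node, the Python below
decodes them and diffs the two tables above. The helper names are
illustrative only, and the rows were copied by hand from the shell output.

```python
def tipc_addr(value):
    """Split a 32-bit TIPC network address into (zone, cluster, node).

    Assumes the 8/12/12-bit <Z.C.N> layout.
    """
    zone = (value >> 24) & 0xFF
    cluster = (value >> 12) & 0xFFF
    node = value & 0xFFF
    return zone, cluster, node


def fmt_addr(value):
    """Format a 32-bit address the way the shell prints it, e.g. <1.1.2>."""
    return "<%d.%d.%d>" % tipc_addr(value)


# (type, lower, upper, port identity) rows from the two "-nt" dumps.
before = {
    (0, 16781314, 16781314, "<1.1.2:1086734332>"),
    (0, 16781315, 16781315, "<1.1.3:1086734332>"),
    (1, 1, 1, "<1.1.3:1086734333>"),
    (3505, 0, 0, "<1.1.2:1086734327>"),
    (3506, 0, 0, "<1.1.2:1086734324>"),
}
after = {
    (0, 16781315, 16781315, "<1.1.3:1086734332>"),
    (1, 1, 1, "<1.1.3:1086734333>"),
}

# Rows present before the reset but missing afterwards.
dropped = sorted(before - after)
for row in dropped:
    print("dropped:", row)
print(fmt_addr(16781314), fmt_addr(16781315))
```

Every dropped row was published by <1.1.2>; after the reset 1.1.3 holds
only its own publications.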


-> tipcConfig "-l -nt -n -log" 
Links: 
multicast-link: up 
1.1.3:wancom0-1.1.2:wancom0: up 
1.1.3:wancom1-1.1.2:wancom1: up 
Type       Lower      Upper      Port Identity              Publication
-------------------------------- -------------------------- ------------------
0          16781315   16781315   <1.1.3:1086734332>         1086734333 zone
1          1          1          <1.1.3:1086734333>         1086734334 node
Nodes known: 
<1.1.2>: up 
Log dump: 
very domain <1.1.0> 
TIPC info: Enabled bearer <eth:wancom1>, discovery domain <1.1.0> 
TIPC info: Established link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC info: Established link <1.1.3:wancom1-1.1.2:wancom1> on network plane B
TIPC info: Resetting link <1.1.3:wancom0-1.1.2:wancom0>, requested by peer
TIPC info: Lost link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC warning: Link changeover error, peer did not permit changeover
TIPC info: Established link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC info: Resetting link <1.1.3:wancom0-1.1.2:wancom0>, changeover initiated by peer
TIPC info: Lost link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC info: Resetting link <1.1.3:wancom1-1.1.2:wancom1>, requested by peer
TIPC info: Lost link <1.1.3:wancom1-1.1.2:wancom1> on network plane B
TIPC info: Lost contact with <1.1.2>
TIPC info: Established link <1.1.3:wancom1-1.1.2:wancom1> on network plane B
TIPC info: Established link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
value = 0 = 0x0 
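
To make the sequence in the log dump easier to follow, here is a small
sketch in Python that tallies the link events. The regular expression and
the summarize() helper are my own illustration, written against the exact
message wording shown above, and are not part of TIPC or tipcConfig.

```python
import re
from collections import Counter

LOG = """\
TIPC info: Established link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC info: Established link <1.1.3:wancom1-1.1.2:wancom1> on network plane B
TIPC info: Resetting link <1.1.3:wancom0-1.1.2:wancom0>, requested by peer
TIPC info: Lost link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC warning: Link changeover error, peer did not permit changeover
TIPC info: Established link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC info: Resetting link <1.1.3:wancom0-1.1.2:wancom0>, changeover initiated by peer
TIPC info: Lost link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC info: Resetting link <1.1.3:wancom1-1.1.2:wancom1>, requested by peer
TIPC info: Lost link <1.1.3:wancom1-1.1.2:wancom1> on network plane B
TIPC info: Lost contact with <1.1.2>
TIPC info: Established link <1.1.3:wancom1-1.1.2:wancom1> on network plane B
TIPC info: Established link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
"""

LINK_EVENT = re.compile(
    r"^TIPC info: (Established|Resetting|Lost) link <([^>]+)>"
)


def summarize(log_text):
    """Return (reset counts, final link states, log lines where all links are down)."""
    resets = Counter()
    state = {}
    all_down_at = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        match = LINK_EVENT.match(line)
        if not match:
            continue  # warnings and "Lost contact" lines are skipped
        action, link = match.groups()
        if action == "Resetting":
            resets[link] += 1
        elif action == "Established":
            state[link] = "up"
        else:  # "Lost"
            state[link] = "down"
            if all(s == "down" for s in state.values()):
                all_down_at.append(lineno)
    return resets, state, all_down_at


resets, state, all_down_at = summarize(LOG)
print(dict(resets))
print(state)
print(all_down_at)
```

On this log the wancom0 link is reset twice and the wancom1 link once, both
links end up established again, and the only point where both are down at
once is the line just before "Lost contact with <1.1.2>".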

Some other oddities: 
- The above output was captured from the shell using a serial port
connection. If I have the target console attached from Workbench, I
don't experience this problem. This is true even when the console window
is idle (I'm not typing or executing commands).

- I do not experience the problem if I disconnect one of the two
Ethernet connections (e.g. bearer wancom1).

The application I'm loading is rather large (22 MB). Is there any
obvious reason the link might want to be reset? Are there any timeout
parameters I might want to consider modifying?

I do have Wireshark recordings of both Ethernet interfaces at the time
of the failure if someone has suggestions regarding what to look for.

Thanks in advance for any assistance that may be provided. 


Bill Kinahan 
Chief Software Architect 
Sikorsky Aircraft 
(203)386-3551 
Fax (860)998-5575 

