Hi Bill,
This is very interesting. It appears that you are running redundant links
between the two nodes and that both links are going down.
Just a quick question: have you tried running your application with only
a single link between the two nodes? Does the same name table problem
still occur?
Do I understand correctly that on node <1.1.3> you are loading your
application over the serial connection and on this node the TIPC link
goes down? That is very odd.
In Wireshark, you might see some probe messages between <1.1.3> and
<1.1.2>. I would expect to see messages in only one direction for a time
before one side gives up on the other. I'll have to dig up the format
of a probe message later (I have to run to a meeting now).
Elmer
________________________________
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Kinahan, William P SIK
Sent: Friday, August 31, 2007 5:01 PM
To: [email protected]
Cc: Kinahan, William P SIK; Fimbers, Kevin SIK; Malo, Randy SIK
Subject: [tipc-discussion] Dropped Name Table during link reset.
I'm experiencing some odd behavior using TIPC on VxWorks 6.3 and I'm
hoping someone spots something familiar or can provide me with some
debug advice.
I've got an application that uses TIPC to keep a data set synchronized
across multiple processors in an embedded avionics application. In
general, I've found TIPC quite reliable and I'm happy with the
performance. I've recently encountered the following problem:
- Node 1.1.2 is fully operational with my application running.
- Node 1.1.3 is running the shell and is in the process of being loaded
with my application.
At or near the end of the load, the links reset for some unknown reason.
It appears that TIPC partially recovers and reestablishes the links. The
name table however is not reestablished on 1.1.3.
The following shell output shows the configuration before and after the
reset.
->
->
-> tipcConfig "-nt"
Type       Lower      Upper      Port Identity              Publication
-------------------------------- -------------------------- ------------------
0          16781314   16781314   <1.1.2:1086734332>         1086734333
           16781315   16781315   <1.1.3:1086734332>         1086734333  zone
1          1          1          <1.1.3:1086734333>         1086734334  node
3505       0          0          <1.1.2:1086734327>         1086734328
3506       0          0          <1.1.2:1086734324>         1086734325
value = 0 = 0x0
-> tipcConfig "-nt"
Type       Lower      Upper      Port Identity              Publication
-------------------------------- -------------------------- ------------------
0          16781315   16781315   <1.1.3:1086734332>         1086734333  zone
1          1          1          <1.1.3:1086734333>         1086734334  node
value = 0 = 0x0
-> tipcConfig "-l -nt -n -log"
Links:
multicast-link: up
1.1.3:wancom0-1.1.2:wancom0: up
1.1.3:wancom1-1.1.2:wancom1: up
Type       Lower      Upper      Port Identity              Publication
-------------------------------- -------------------------- ------------------
0          16781315   16781315   <1.1.3:1086734332>         1086734333  zone
1          1          1          <1.1.3:1086734333>         1086734334  node
Nodes known:
<1.1.2>: up
Log dump:
very domain <1.1.0>
TIPC info: Enabled bearer <eth:wancom1>, discovery domain <1.1.0>
TIPC info: Established link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC info: Established link <1.1.3:wancom1-1.1.2:wancom1> on network plane B
TIPC info: Resetting link <1.1.3:wancom0-1.1.2:wancom0>, requested by peer
TIPC info: Lost link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC warning: Link changeover error, peer did not permit changeover
TIPC info: Established link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC info: Resetting link <1.1.3:wancom0-1.1.2:wancom0>, changeover initiated by peer
TIPC info: Lost link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
TIPC info: Resetting link <1.1.3:wancom1-1.1.2:wancom1>, requested by peer
TIPC info: Lost link <1.1.3:wancom1-1.1.2:wancom1> on network plane B
TIPC info: Lost contact with <1.1.2>
TIPC info: Established link <1.1.3:wancom1-1.1.2:wancom1> on network plane B
TIPC info: Established link <1.1.3:wancom0-1.1.2:wancom0> on network plane A
value = 0 = 0x0
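For reading the dumps above: the Lower/Upper values under type 0 look like
TIPC node addresses packed into a 32-bit integer. The sketch below assumes
the usual <Z.C.N> layout (8-bit zone, 12-bit cluster, 12-bit node), under
which 16781314 is <1.1.2> and 16781315 is <1.1.3>. It also diffs the two
name table dumps to list what went missing after the reset. The function
and variable names are hypothetical helpers, not part of TIPC or tipcConfig.

```python
def decode_tipc_addr(addr: int) -> str:
    """Unpack a 32-bit TIPC network address into <zone.cluster.node>.

    Assumes the 8/12/12-bit zone/cluster/node layout.
    """
    zone = (addr >> 24) & 0xFF
    cluster = (addr >> 12) & 0xFFF
    node = addr & 0xFFF
    return f"<{zone}.{cluster}.{node}>"


# (type, lower, upper, port identity), copied from the first dump
before = {
    (0, 16781314, 16781314, "<1.1.2:1086734332>"),
    (0, 16781315, 16781315, "<1.1.3:1086734332>"),
    (1, 1, 1, "<1.1.3:1086734333>"),
    (3505, 0, 0, "<1.1.2:1086734327>"),
    (3506, 0, 0, "<1.1.2:1086734324>"),
}

# Same fields, copied from the dump taken after the link reset
after = {
    (0, 16781315, 16781315, "<1.1.3:1086734332>"),
    (1, 1, 1, "<1.1.3:1086734333>"),
}

# Publications present before the reset but absent afterwards
missing = sorted(before - after)

for name_type, lower, upper, port in missing:
    print(name_type, lower, upper, port)

print(decode_tipc_addr(16781314), decode_tipc_addr(16781315))
```

Every missing entry was published by <1.1.2>, which fits the picture of
1.1.3 withdrawing its peer's publications on "Lost contact with <1.1.2>"
and not receiving them again once the links came back.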
Some other oddities:
- The above output was captured from the shell using a serial port
connection. If I have the target console attached from Workbench, I
don't experience this problem. This is true even when the console window
is idle (I'm not typing or executing commands).
- I do not experience the problem if I disconnect one of the two
Ethernet connections (e.g. bearer wancom1).
The application I'm loading is rather large (22 MB). Is there any
obvious reason the link might want to be reset? Are there any timeout
parameters I might want to consider modifying?
I do have Wireshark recordings of both Ethernet interfaces at the time
of the failure if someone has suggestions regarding what to look for.
Thanks in advance for any assistance that may be provided.
Bill Kinahan
Chief Software Architect
Sikorsky Aircraft
(203)386-3551
Fax (860)998-5575