From the TIPC Programmer's Guide:

"If a link endpoint's transmit queue grows too large because the peer link 
endpoint falls behind in acknowledging the successful arrival of messages 
(typically around 50 messages), TIPC declares "link congestion" on that link. 
When a link becomes congested, the link only accepts a new message for 
transmission if it is important enough (i.e. the more important the message, 
the longer the queue is allowed to be)."

After digging in the 2.6.33 source (this is not documented), I can add how 
importance relates to the statement in parentheses above:

LOW imp => link window, default 50
MEDIUM imp => window * 4/3, ~133%
HIGH imp => window * 5/3, ~166%
CRITICAL imp => window * 6/3, 200%

OpenSAF currently uses the default LOW importance, which gives a window of 50 
unacked messages. Any burst longer than that will make sendmsg() return EAGAIN.

Increasing importance to HIGH gives a window of 83. Increasing the TIPC link 
window to 100 makes HIGH get a window of 166.
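
The arithmetic behind these numbers can be sketched as below. This is only an 
illustration of the ratios listed above, not the kernel code; the names 
queue_limit() and print_limits() are made up for the example, and it assumes 
the multiplication is done before the integer division (the kernel's own 
rounding may differ by a message or two).

```c
#include <stdio.h>

/* Importance levels, numbered 0..3 in the same order as the list above. */
enum importance { IMP_LOW, IMP_MEDIUM, IMP_HIGH, IMP_CRITICAL };

/* Congestion threshold from the ratios above: window * (3 + imp) / 3.
 * Assumption: multiply first, then truncate towards zero. */
static unsigned int queue_limit(unsigned int window, enum importance imp)
{
    return window * (3u + (unsigned int)imp) / 3u;
}

/* Print the threshold for every importance level of a given link window. */
static void print_limits(unsigned int window)
{
    static const char *const names[] = { "LOW", "MEDIUM", "HIGH", "CRITICAL" };

    for (int imp = IMP_LOW; imp <= IMP_CRITICAL; imp++)
        printf("%-8s %u\n", names[imp],
               queue_limit(window, (enum importance)imp));
}
```

Under that assumption, print_limits(50) lists 50, 66, 83 and 100, and 
print_limits(100) lists 100, 133, 166 and 200.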


"Whenever a message cannot be sent because of link congestion, TIPC checks the 
"source droppable" setting of the sending port. If the setting is enabled 
(indicating that the message is being sent in an unreliable manner) TIPC 
discards the message, but provides no indication of this to the sender. If the 
source droppable setting is disabled (which is the default case), TIPC will 
normally block the sending application until the congestion clears, and then 
resume the send operation; however, if the application has requested a non- 
blocking send, the application will not block when link congestion occurs and 
the send operation returns a failure indication. "

Comment: OpenSAF uses a non-blocking RDM socket, so sendmsg() will return EAGAIN.

Increasing the link window, raising the importance and adding send retries only 
reduce the likelihood of failed sends. In the end, system tuning is needed to 
stay on the safe side.



---

** [tickets:#641] MDS resend**

**Status:** unassigned
**Created:** Thu Nov 28, 2013 09:40 AM UTC by Hans Feldt
**Last Updated:** Thu Nov 28, 2013 02:28 PM UTC
**Owner:** nobody

Occasionally we see TIPC link congestion at the sending node. This can be seen 
with "tipc-config -ls" as in:

Link <1.1.1:eth0-1.1.2:eth0>
  ACTIVE  MTU:1500  Priority:10  Tolerance:1500 ms  Window:100 packets
  RX packets:1877 fragments:0/0 bundles:0/0
  TX packets:17511 fragments:0/0 bundles:0/0
  TX profile sample:489 packets  average:151 octets
  0-64:0% -256:97% -1024:0% -4096:3% -16354:0% -32768:0% -66000:0%
  RX states:12974 probes:5918 naks:0 defs:0 dups:0
  TX states:12148 probes:6230 naks:0 acks:0 dups:0
  Congestion bearer:0 link:12  Send queue max:81 avg:0

Above is from testing with tipc-pipe. With the default link window at 50 I get 
one send EAGAIN when sending 100 msgs. When I increase link window to 100 I can 
burst 100 msgs without link congestion. If I then burst 1000 msgs I see lots of 
link congestion.


If we transfer this to OpenSAF it means a failed MDS send. MDS does not do any 
resends, and neither does any service (IMM's FEVS not considered).

AMFd, for example, could lose a message to a node director, which is very bad.

What I suggest is that messages should be resent, typically a loop with 3 
retries with a 100ms sleep in between.
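
A rough sketch of such a loop, to make the suggestion concrete. It is not MDS 
code: send_with_retry() is a hypothetical helper, it uses plain send() with 
MSG_DONTWAIT on an already connected or addressed socket, and the retry count 
and delay are parameters so that 3 and 100 are just the values proposed above.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical helper: retry a non-blocking send while the transport
 * reports congestion (EAGAIN/EWOULDBLOCK). Any other error, or running
 * out of retries, returns -1 with errno left as set by send(). */
static ssize_t send_with_retry(int fd, const void *buf, size_t len,
                               int max_retries, long delay_ms)
{
    struct timespec delay = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };

    for (int attempt = 0; ; attempt++) {
        ssize_t n = send(fd, buf, len, MSG_DONTWAIT);

        if (n >= 0)
            return n;                      /* message accepted */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                     /* not congestion, give up */
        if (attempt >= max_retries)
            return -1;                     /* still congested after retries */
        nanosleep(&delay, NULL);           /* back off before the next try */
    }
}
```

With the proposed values the call would be send_with_retry(sd, msg, len, 3, 100), 
which blocks the caller for at most about 300 ms before reporting failure.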

There are some concerns, like:
* Can we put this in MDS or should it be done per service?
* What about MDS/TCP, same problem there?

