Hi!

On 9/19/07, Nayman Felix-QA5535 <[EMAIL PROTECTED]> wrote:
>
> I'm running TIPC 1.5.12 on a linux 2.6.9 kernel on two different nodes and
> I'm seeing a problem with link congestion after about 26 messages being
> sent. I'm running connectionless traffic (also with the socket set to
> non-blocking via fcntl), with the domain set to closest first, the
> destination droppable flag set to FALSE, the message importance set to HIGH,
> and the message size set to 2000 bytes.

Your MTU is probably 1500 B, so TIPC has to split each 2000-byte
message, thereby sending 2 packets. After the 25th user-space send you
have 50 items in the link queue (if you have a fast CPU and an ack
message from the remote TIPC hasn't arrived yet). Since your link window
is 50, the 26th send fails instead of blocking. Your retries keep getting
EAGAIN until the far TIPC node sends an ack, opening up room in the link
queue. Yes, it really takes that long! If you measure it, it will likely
be only ~100 us, so "long" is a relative term.

The only hitch is that you say you have set the message importance to
HIGH, so the link congestion limit should be 50/3*5, as can be seen here:
<http://lxr.linux.no/source/net/tipc/link.c?v=2.6.17.13#L1037>
and here:
<http://lxr.linux.no/source/net/tipc/link.c?v=2.6.17.13#L2710>
(tipc in 2.6.17.x is very close to tipc-1.5.12 and I don't have the
1.5.12 code handy)

>
> When I run the program, which is a modified version of the hello world
> program, with the server running on one node and the client running on a
> different node I'm getting back the error: Resource temporarily unavailable
> and an errno of 11(EAGAIN) after the 26th message. So, I've updated my code
> to retry if an errno of EAGAIN is returned, but I need to retry some value
> which was more than 2500 (I believe it was around 2650 or something like
> that) times before it successfully can send 27 messages over the link.

Once the acks start coming back you should get fewer EAGAINs in a row
because of the way the protocol works...
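
To make the arithmetic above concrete, here is a rough sketch in C. The
function names are made up for illustration and are not TIPC APIs. It
assumes a message is split into ceil(size / payload) packets (real TIPC
also subtracts header overhead from the MTU, which is ignored here), that
the HIGH-importance limit is (window / 3) * 5 with integer division as in
the 50/3*5 expression above, and that no acks arrive while sending.

```c
#include <assert.h>

/* Packets needed for one message. payload_per_pkt is a simplification:
 * header overhead is ignored (assumption, not TIPC's exact rule). */
static unsigned fragments_per_msg(unsigned msg_size, unsigned payload_per_pkt)
{
    assert(payload_per_pkt > 0);
    return (msg_size + payload_per_pkt - 1) / payload_per_pkt;
}

/* Queue limit for HIGH importance, following the 50/3*5 expression
 * quoted above; integer division is assumed. */
static unsigned high_importance_limit(unsigned window)
{
    return (window / 3) * 5;
}

/* Messages that fit in the link queue before a send is refused,
 * assuming no acks arrive in the meantime. */
static unsigned msgs_before_congestion(unsigned queue_limit, unsigned frags)
{
    assert(frags > 0);
    return queue_limit / frags;
}
```

With a window of 50 and 2 packets per message this gives 25 messages,
which matches the failure on the 26th send. With the HIGH limit of
(50 / 3) * 5 = 80 it would give 40, which is the hitch mentioned above.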

>
>
> tipc-config -ls shows the following indicating that link congestion is
> happening:
>
> Link <1.1.169:eth0-1.1.168:eth0>
>   ACTIVE MTU:1500 Priority:10 Tolerance:1500 ms Window:50 packets
>   RX packets:29 fragments:0/0 bundles:0/0
>   TX packets:3627 fragments:3620/1809 bundles:0/0
>   TX profile sample:105 packets average:1356 octets
>   0-64:7% -256:0% -1024:40% -4096:53% -16354:0% -32768:0% -66000:0%
>   RX states:2175142 probes:1087411 naks:0 defs:0 dups:0
>   TX states:2174945 probes:1087534 naks:0 acks:0 dups:0
>   Congestion bearer:0 link:36 Send queue max:36 avg:0
>
>
> Any ideas as to why I'm seeing link congestion?

Ummm, your code is sending too fast and you have asked not to block...
the network round trip time is longer than the time it takes to enqueue
25 messages (50 packets)... ;-)

> If you'd like I can attach some sample code.
>
>
> Just before sending this note, I tried commenting out the code that makes
> the socket non-blocking via fcntl and then I don't see link congestion
> anymore. So why does making the socket non-blocking lead to link
> congestion?

If blocking is allowed, the calling process is put to sleep briefly
until the link congestion has abated:
http://lxr.linux.no/source/net/tipc/socket.c?v=2.6.17.13#L524
With a non-blocking socket you have to keep trying to send, hopefully
also doing other useful work while you wait for the ack.

Assuming that this is correct, then I'd like to suggest that tipc could
notify userspace that the link is no longer congested... I've done
things like that before but all in userspace. I'll explain at length if
anyone is interested in the details...

Given that congestion abatement notification doesn't exist yet, you
have to:
0. live with it,
1. put in some flow control on the send side,
2. increase the link send window,
3. both 1 & 2.

Another question that comes to mind is: What is the maximum time that a
process can be suspended? I'd guess it would be the (default) link
timeout time of ~1.5 s. That would only occur if the far end failed.
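
For option 1, the crudest form of send-side flow control is to sleep
briefly on EAGAIN instead of spinning through thousands of retries. The
following is only a sketch under assumptions: send_with_backoff, its
parameters and the delay values are hypothetical, it uses plain send() on
an already-addressed non-blocking socket (a connectionless sender would
swap in sendto()), and a sensible delay has to be measured on the actual
link.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Send one message on a non-blocking socket. On EAGAIN/EWOULDBLOCK, sleep
 * delay_us microseconds and retry, at most max_retries times.
 * Returns the send() result, or -1 with errno set to EAGAIN when the
 * retries are exhausted. Other errors are returned immediately. */
static ssize_t send_with_backoff(int fd, const void *buf, size_t len,
                                 int max_retries, long delay_us)
{
    struct timespec ts;

    assert(max_retries >= 0 && delay_us >= 0);
    ts.tv_sec = delay_us / 1000000L;
    ts.tv_nsec = (delay_us % 1000000L) * 1000L;

    for (int attempt = 0; ; attempt++) {
        ssize_t n = send(fd, buf, len, 0);
        if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
            return n;               /* success, or a real error */
        if (attempt >= max_retries) {
            errno = EAGAIN;         /* still congested, give up */
            return -1;
        }
        nanosleep(&ts, NULL);       /* give the ack time to arrive */
    }
}
```

A caller might use something like send_with_backoff(fd, msg, 2000, 50, 100),
i.e. up to 50 retries at roughly the ~100 us guessed above. A process that
has other work to do would do that work instead of sleeping.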

--
// Randy
