> I've seen those hangs with 1.3.84, 1.3.85, 1.4.0rc1 and rc5 clients on
> Linux (Kernel 2.6). 1.3.80 and 1.3.82 work fine, so I expect that some
> change between 1.3.82 and 1.3.84 causes the problems.
I have looked at the diff between 82 and 84, and there are major changes in rx which are a bit too big for me to get hold of (lots of queues here and there). I have not found a way to get a grip on all the queues and connection flags that are used in rx.

> The fileserver is
> from transarc:
> # rxdebug c-hoernchen -version
> Trying 137.208.3.48 (port 7000):
> AFS version: Base configuration afs3.4 5.77

That is not very - uhm - recent.

(c-hoernchen: I was not aware that there were other related chipmunks beyond Chip and Dale [A-Hörnchen und B-Hörnchen] [piff och puff] :-)

> To track down the problem, I've captured the network traffic between
> client and server while creating 10 files with 100k each.

Was the capture done on the client or the server? (See also the note on capture filters at the end of this mail.)

I've looked and looked and now my eyes are crossed. I have found some things:

1. In openafs-1.3.84-slow.pcap, frame 4 has a fetch data with Length 999999999.

2. In openafs-1.3.82-fast, frames 75-78 are a store data for file f_5. This seems to be part of call 5474, spanning 2 IP packets with 4 fragments each. This is how it should look. In openafs-1.3.84-slow, call 256 (frames 84-85) is the corresponding one. But where are fragments 3 and 4? They should be in the following frames, within milliseconds. Instead call 256 stalls completely for a long time until it is finished in frame 96.

I suspect major fishiness in the code that assembles and resends rx packets. I'd like to hear more about the changes to rx that were made between 82 and 84: what was the intended outcome?

> I've also noticed that in versions 1.3.80 and 1.3.82 (those that do not
> show the delays) each store-data UDP-packet is 5700 bytes and is
> splitted in four UDP fragments. However, this is also true for 1.3.84,
> which already shows the problems. In 1.4.0rcX, the store-data packets
> seem to be smaller, the UDP packet is only 2896 bytes and comes in two
> fragments. Is there any specific reason why all those packets are larger
> than the MTU?

I don't know anything about the change to 1.4.0rcX, but the 4 fragments are a "feature" of rx. Has something changed in how rx fragments are handled in 1.4.0?

There are 2 ways in which rx tries to reduce overhead; they may or may not be effective:

1. It puts more than one rx packet into an IP packet. I think that's called a jumboframe. That feature is, I think, negotiated between client and server, and as all my servers run with -nojumbo I don't get such packets.

2. It generates IP packets up to 4 times the MTU, according to RX_MAX_FRAG in src/rx/rx_globals.h. I usually (when I don't forget it) patch that to 1 (see the sketch at the end of this mail). I think this comes from the times of the Sun SS10 or earlier, when it was faster to send ONE IP packet with FOUR fragments instead of FOUR unfragmented IP packets. IMHO (*) this is bull today, as your throughput is devastated if you combine this scheme with packet loss. A packet loss of, say, 10% is multiplied to at least 40% because of all the resends and resends of resends (see the rough numbers at the end of this mail). Today's computers are way faster at making IP packets than an SS10.

> Any help would be greatly appreciated!

Sorry I can't help more.

Harald.

(*) Not necessarily so humble at all times ;-)
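
Note on capture filters: if the traces were taken with a capture filter on the AFS port, remember that only the first fragment of an IP datagram carries the UDP header, so a filter like 'udp port 7000' silently drops fragments 2-4 from the trace, which by itself would make fragments look "missing". To be sure everything ends up in the file, I would capture roughly like this (interface and file name are of course just placeholders):

    tcpdump -i eth0 -s 0 -w afs-store.pcap \
        'udp port 7000 or (ip[6:2] & 0x1fff != 0)'

The second half of the filter matches any IP fragment with a non-zero fragment offset, so it also picks up unrelated fragmented traffic, but that is easy to ignore when reading the trace.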
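
Sketch of the RX_MAX_FRAG patch: the local change I mentioned is nothing more than dropping that constant to 1, roughly like this (sketched from memory, so the exact symbol name, its current value and the surrounding context may differ in your tree; please check before applying anything):

    --- src/rx/rx_globals.h.orig
    +++ src/rx/rx_globals.h
    @@
    -#define RX_MAX_FRAG 4
    +#define RX_MAX_FRAG 1

With that, each rx data packet should go out as a single, unfragmented IP packet, at the price of somewhat more per-packet overhead.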
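
Rough numbers behind the "10% becomes at least 40%" claim, assuming fragment losses are independent: an rx send that is split into 4 IP fragments is lost as soon as any single fragment is lost, and then all 4 have to be resent. A small back-of-envelope check:

    /* Back-of-envelope: effective datagram loss when one rx send is
     * carried in 4 IP fragments and each fragment is lost independently. */
    #include <stdio.h>

    int main(void)
    {
        double p = 0.10;        /* assumed per-fragment loss rate */
        int nfrags = 4;         /* fragments per datagram, as in the traces */
        double survive = 1.0;
        int i;

        for (i = 0; i < nfrags; i++)
            survive *= (1.0 - p);          /* all fragments must arrive */

        printf("per-fragment loss:       %.1f%%\n", 100.0 * p);
        printf("effective datagram loss: %.1f%%\n", 100.0 * (1.0 - survive));
        return 0;
    }

This already prints about 34.4%, and the resends are fragmented again and can be lost again, so the wasted traffic only grows from there.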
