> <---------------------------- SA GetTableResp
>
> RMPP flags 0x05 (Data, Last)
> SegmentNumber 4
> PayloadLength 0x34
> TID 8
>SA GetTable ---------------------------->
>RMPP flags 0x02 (ACK)
>SegmentNumber 1
>NewWindowLast 6
>TID 8

This segment number is off, and I'm not sure why. It could indicate that
segment 2 was lost, or that it was processed after a later segment. The
RMPP code always sets the SegmentNumber in an ACK based on the last
received packet that arrived in order. This ACK should be dropped by the
SA side as a duplicate. The SA would then rely on a timeout to resend.
Did you ever see an ACK for segment 4? Regardless of what went wrong on
the SA side, the client needs to be able to deal with it.

> <---------------------------- SA GetTableResp
> RMPP flags 0x01 (Data)
> SegmentNumber 5
> PayloadLength 0x34
> TID 8

This should not occur. The maximum segment number sent should have stayed
at 4. One area to check is whether the PayloadLength in the original MAD
is set correctly. I do not know what would happen if it were set
incorrectly. There could also be an error in how RMPP calculates the
number of segments that will be sent. This segment should have been
dropped by the client as an invalid segment number.

>I presume the reACKing is used as a keep alive so a response timeout
>(Resp) does not occur. The SA client is using RRespTime of 0xE. The
>OpenIB side sets this field to 0 (not sure if this affects the SA client
>side).

RMPP is using hard-coded timeouts at this point. It would require path
record information to calculate a more accurate one.

>Some questions on the RMPP sender side (SA):
>
>1. I wouldn't think that reACKing the same segment (1) by the receiver
>(SA client end) would cause the sender side to send segment 5.

Correct - nothing should cause the SA to send segment 5...

>2. In the resend, the header (everything up to SA data) appears to be
>good but the data appears to be garbage.

This leads me to think that the problem is in how the PayloadLength or
sge size is set, or in the calculations based on them.
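
To make the checks discussed above concrete, here is a rough sketch in C.
It is not the actual RMPP code: every function name is hypothetical, the
per-segment data size is a caller-supplied parameter, and the duplicate-ACK
rule is a simplification that ignores the window. It only shows the three
conditions that have to agree: how many segments a payload needs, which
segment carries the Last flag, and which incoming segments or ACKs should
be dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of segments needed to carry total_len bytes
 * when each segment holds seg_data bytes. An empty payload is assumed to
 * still take one segment.
 */
static uint32_t seg_count(uint32_t total_len, uint32_t seg_data)
{
    assert(seg_data != 0);
    if (total_len == 0)
        return 1;
    /* Round up without risking overflow in total_len + seg_data - 1. */
    return total_len / seg_data + (total_len % seg_data != 0 ? 1 : 0);
}

/*
 * Sender side: the Last flag must be set on exactly the final segment,
 * using the same count that bounds how many segments are sent.
 */
static bool is_last_seg(uint32_t seg_num, uint32_t total_len,
                        uint32_t seg_data)
{
    return seg_num == seg_count(total_len, seg_data);
}

/*
 * Receiver side: segment numbers start at 1. Once a segment with Last set
 * has been seen (last_seg != 0), anything numbered beyond it is invalid
 * and should be dropped.
 */
static bool seg_is_valid(uint32_t seg_num, uint32_t last_seg)
{
    if (seg_num == 0)
        return false;
    return last_seg == 0 || seg_num <= last_seg;
}

/*
 * Sender side (simplified assumption): an ACK that does not advance past
 * the highest segment already acknowledged is treated as a duplicate.
 */
static bool ack_is_duplicate(uint32_t ack_seg, uint32_t last_ack)
{
    return ack_seg <= last_ack;
}
```

With illustrative numbers (652 bytes of payload and an assumed 200 bytes per
segment), seg_count() gives 4, is_last_seg() is true only for segment 4, and
seg_is_valid(5, 4) is false, which is the case where the client should have
dropped segment 5. A sender that computes the segment count one way and the
Last flag another way could mark segment 4 as Last and still go on to send
segment 5.
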
I need to see if the code that calculates the total number of segments to
send matches up with the code that determines whether a segment should
have the last bit set.

>Have you tested RMPP retransmission ? Have you tested simultaneous
>transactions in progress ?

Both of these have been tested. I've forced failures at multiple points
in the retransmissions and tested a few thousand simultaneous
transactions, and I've never noticed a problem like what you're seeing.
But that could very well be because of limitations in my test program.

I will spend a little time tomorrow afternoon looking at this...

- Sean
_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
