David Miller wrote:
Ilpo, just trying to keep an old conversation from dying off.

Did you happen to read a recent blog posting of mine?

        http://vger.kernel.org/~davem/cgi-bin/blog.cgi/2007/12/31#tcp_overhead

I've been thinking more and more and I think we might be able
to get away with enforcing that SACKs are always increasing in
coverage.

I doubt there are any real systems out there that drop out of order
packets that are properly formed and are in window, even though the
SACK specification (foolishly, in my opinion) allows this.

If we could free packets as SACK blocks cover them, all the problems
go away.
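A minimal standalone sketch of that idea, freeing retransmit-queue segments as soon as a "hard" SACK block covers them. `struct seg` and the helpers here are illustrative stand-ins for the kernel's sk_buff chain, not real kernel API:

```c
#include <assert.h>
#include <stdlib.h>

struct seg {
    unsigned int start, end;    /* [start, end) sequence range */
    struct seg *next;
};

/* append a segment at the tail of a queue */
static void push(struct seg **q, unsigned int start, unsigned int end)
{
    struct seg **pp = q;
    struct seg *s = malloc(sizeof(*s));
    s->start = start;
    s->end = end;
    s->next = NULL;
    while (*pp)
        pp = &(*pp)->next;
    *pp = s;
}

static int count(const struct seg *q)
{
    int n = 0;
    for (; q; q = q->next)
        n++;
    return n;
}

/* Free every queued segment wholly covered by the SACK block [s, e).
 * If SACK coverage is guaranteed never to shrink, the peer can never
 * renege, so no copy of this data need be retained for go-back-n. */
static void free_sacked(struct seg **q, unsigned int s, unsigned int e)
{
    struct seg **pp = q;
    while (*pp) {
        struct seg *cur = *pp;
        if (cur->start >= s && cur->end <= e) {
            *pp = cur->next;
            free(cur);
        } else {
            pp = &cur->next;
        }
    }
}
```

The point of the sketch is that each incoming SACK block releases memory immediately, instead of everything waiting on the cumulative ACK.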

For one thing, this will allow the retransmit queue liberation during
loss recovery to be spread out over the event, instead of batched up
like crazy to the point where the cumulative ACK finally moves and
releases an entire window's worth of data.

Next, it would simplify all of this scanning code trying to figure out
which holes to fill during recovery.

And for SACK scoreboard marking, the RB trie would become very nearly
unnecessary as far as I can tell.

I would not even entertain this kind of crazy idea unless I thought
the fundamental complexity simplification payback was enormous.  And
in this case I think it is.

What we could do is put an experimental hack in there for developers
to start playing with, which would enforce that SACKs always increase
in coverage.  If the invariant is violated, the connection is reset
and a verbose message is logged so we can analyze any cases that occur.
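The enforcement check itself might look something like the sketch below: treat the coverage implied by an ACK (the cumulative ACK plus its SACK blocks, as sorted non-overlapping intervals) as a set, and require that the previous coverage is a subset of the new one. All names here are hypothetical, not existing kernel symbols:

```c
#include <assert.h>

struct blk { unsigned int start, end; };    /* [start, end) */

/* Return 1 iff every previously covered block lies inside some block
 * of the new coverage.  A return of 0 means coverage shrank, which is
 * where the experimental patch would reset the connection and log the
 * offending ACK.  Folding the cumulative ACK into the interval list
 * keeps a snd_una advance over SACKed data from counting as a
 * violation. */
static int coverage_grew(const struct blk *cur, int nc,
                         const struct blk *prev, int np)
{
    for (int i = 0; i < np; i++) {
        int ok = 0;
        for (int j = 0; j < nc; j++) {
            if (cur[j].start <= prev[i].start &&
                prev[i].end <= cur[j].end) {
                ok = 1;
                break;
            }
        }
        if (!ok)
            return 0;
    }
    return 1;
}
```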

Sounds crazy, but maybe has potential.  What do you think?


Linux itself has a code path where this can happen: under memory over-commit, tcp_prune_queue() can drop already-queued out-of-order data, which amounts to reneging on SACKs it has sent. Also, I think one of the motivations for making SACK strictly advisory was concern about buggy SACK implementations. Keeping data in your retransmit queue lets you fall back to a timeout and go-back-n if things completely fall apart. For better or worse, we have to deal with the spec the way it is.

Even if you made this assumption of "hard" SACKs, you still have to worry about large ACKs if SACK is disabled, though I guess you could say people running with large windows without SACK deserve what they get. :)


I haven't thought about this too hard, but could we approximate this by moving SACKed data onto a separate sacked queue, then merging it back into the retransmit queue if something bad happens? The code would have to deal with non-contiguous data in the retransmit queue; I'm not sure offhand whether that violates any assumptions. You still have a single expensive ACK at the end of recovery, though I wonder how much that really hurts. If you wanted to ameliorate it, you could save the sacked queue to be batch-processed later, in application context for instance.
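A rough standalone model of that two-queue approach, with sorted singly linked lists standing in for the kernel's queues (none of these names are real kernel API):

```c
#include <assert.h>
#include <stdlib.h>

struct seg {
    unsigned int start, end;    /* [start, end) sequence range */
    struct seg *next;
};

static void push(struct seg **q, unsigned int start, unsigned int end)
{
    struct seg **pp = q;
    struct seg *s = malloc(sizeof(*s));
    s->start = start;
    s->end = end;
    s->next = NULL;
    while (*pp)
        pp = &(*pp)->next;
    *pp = s;
}

static int count(const struct seg *q)
{
    int n = 0;
    for (; q; q = q->next)
        n++;
    return n;
}

/* splice a segment into a queue, keeping sequence order */
static void insert_sorted(struct seg **q, struct seg *s)
{
    struct seg **pp = q;
    while (*pp && (*pp)->start < s->start)
        pp = &(*pp)->next;
    s->next = *pp;
    *pp = s;
}

/* Move every segment wholly covered by the SACK block [s, e) from the
 * retransmit queue to the sacked queue.  Afterward the retransmit
 * queue may hold non-contiguous sequence ranges. */
static void move_sacked(struct seg **rtx, struct seg **sacked,
                        unsigned int s, unsigned int e)
{
    struct seg **pp = rtx;
    while (*pp) {
        struct seg *cur = *pp;
        if (cur->start >= s && cur->end <= e) {
            *pp = cur->next;
            insert_sorted(sacked, cur);
        } else {
            pp = &cur->next;
        }
    }
}

/* On reneging (or deferred to application context), merge the sacked
 * queue back so a timeout can fall back to plain go-back-n. */
static void merge_back(struct seg **rtx, struct seg **sacked)
{
    while (*sacked) {
        struct seg *cur = *sacked;
        *sacked = cur->next;
        insert_sorted(rtx, cur);
    }
}
```

Because both queues stay sorted by sequence number, the merge-back is a straightforward splice rather than a full re-sort.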

  -John


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
