Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-24 Thread Jerahmy Pocott


On 24/06/2008, at 2:42 AM, Matthew Dillon wrote:


> It should be noted that Nagle can cause high latencies even when
> delayed acks are turned off.  Nagle's delay is not timed... in its
> simplest description it prevents packets from being transmitted
> for new data coming from userland if the data already in the
> sockbuf (and presumably already transmitted) has not yet been
> acknowledged.


That assumes a full data packet can't be constructed in the time it
takes for the acknowledgement to arrive. If you CAN construct a whole
packet in that time, then either Nagle is doing a good job or you're
sending large amounts of data.

Perhaps Nagle a) needs a time-out, though I don't really think that
would help, or b) should use a dynamic 'in-flight' count, where it tries
to maintain x packets in flight and only holds packets up when that
value is reached. The idea is that the ack for your first packet
arrives at about the same time as the host should be getting your
second packet.

That way you still get to concatenate lots of small packets being
generated in a short space of time, but don't hold up sending data
because of the ack latency. It should also be possible to detect if
the remote host is using delayed acks and compensate for that?
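
A minimal sketch of option b) next to the classic rule, as a pure decision function rather than kernel code. The function names, the max_in_flight parameter, and the MSS of 1460 bytes are assumptions made for illustration; nothing here is an existing FreeBSD interface.

```python
def classic_nagle_allows_send(pending_bytes, unacked_segments, mss=1460):
    """Classic rule: a full segment may always go out; a short one is
    held while any previously sent data is still unacknowledged."""
    return pending_bytes >= mss or unacked_segments == 0


def inflight_nagle_allows_send(pending_bytes, unacked_segments,
                               max_in_flight=2, mss=1460):
    """Hypothetical variant: a short segment is held only once
    max_in_flight segments are already unacknowledged."""
    return pending_bytes >= mss or unacked_segments < max_in_flight
```

With max_in_flight=1 the variant reduces to the classic rule; larger values let a second small packet leave before the first ack returns, while still coalescing writes once the in-flight limit is reached.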

Though I've not considered it in much detail.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread Matthew Dillon

:Hi,
:
:I'm wondering if anything exists to set this. When you create an INET
:socket without the 'TCP_NODELAY' flag, the network layer does 'Nagling'
:on your transmitted data. Sometimes with hosts that use delayed ACK
:(net.inet.tcp.delayed_ack) it creates a dead-lock where the host will
:not ACK until it gets another packet and the client will not send
:another packet until it gets an ACK.
:
:The dead-lock gets broken by a time-out, which I think is around 200ms?
:
:But I would like to change that time-out if possible to something
:lower, yet I can't really see any sysctl knobs that have a name that
:suggests they do that.
:
:So does anyone know IF this can be tuned and if so by what?
:
:Cheers,
:Jerahmy.
:
:(And yes you could solve it by setting the TCP_NODELAY flag on the
:socket, but not everything has programmed in options to set it and you
:don't always have access to the source, besides setting a sysctl value
:would be much simpler than recompiling stuff)

There is a sysctl which adjusts the delayed-ack timing; it's
called net.inet.tcp.delacktime.  The default is 1/10 of a second
(100 == 100 ms = 1/10 of a second).

BUT, it shouldn't be possible for Nagle to deadlock against delayed acks
unless the TCP implementation is broken somehow.  A delayed ack is
simply that... the ack is delayed 100 ms in order to improve its
chances of being piggy-backed on return data.  The ack is not blocked
completely, just delayed, and certain events (such as the receiving
end turning around and sending data back, which is typical for an
interactive connection) will cause the delay to be aborted and the
ack to be sent immediately with the return data.

Can it break down and cause excessive lag?  Yes, it can.  Interactive
games almost universally have to disable Nagle because the lag is
actually due to the data relay: client 1 sends to the server, which
then relays the interactive event to client 2.  Without an immediate
interactive response to client 1, the ack gets delayed and the next
event from client 1 hits Nagle and stops dead in the water until the
first event reaches client 2 and client 2 reacts to it.  The reaction
then travels client 2 -> server (which aborts the delayed ack and
sends) -> client 1, and client 1's Nagle now allows the second event
to be transmitted.  That isn't a deadlock, just really poor
interactive performance in that particular situation.

Delayed acks also have a safety valve.  The spec says that an ack
cannot be delayed more than two packets.  In a batch link, when the
second (unacked) packet is received, the delayed ack is aborted and
an ack is immediately returned to the sender.  This is to prevent
congestion control (which is based on acks) from getting completely
out of whack and also to prevent the TCP window from getting exhausted.
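
The receiver-side behaviour described above (delay timer, piggy-backing on return data, and the two-packet safety valve) can be sketched as a small state machine. This is a simplified model under stated assumptions, not the FreeBSD implementation: the class and method names are made up, time is passed in explicitly in milliseconds, and the 100 ms default mirrors net.inet.tcp.delacktime.

```python
class DelayedAckReceiver:
    """Toy model of a receiver that delays ACKs (names are hypothetical)."""

    def __init__(self, delack_ms=100):
        self.delack_ms = delack_ms
        self.unacked_segments = 0   # segments received but not yet ACKed
        self.ack_deadline = None    # time at which the delay timer fires

    def _reset(self):
        self.unacked_segments = 0
        self.ack_deadline = None

    def on_segment(self, now_ms):
        """Data arrived. Return True if an ACK must be sent immediately."""
        self.unacked_segments += 1
        if self.unacked_segments >= 2:
            # Safety valve: never hold an ACK across two segments.
            self._reset()
            return True
        self.ack_deadline = now_ms + self.delack_ms
        return False

    def on_outgoing_data(self):
        """We are sending data back. Return True if a pending ACK rides along."""
        had_pending = self.unacked_segments > 0
        self._reset()
        return had_pending

    def on_timer(self, now_ms):
        """Return True if the delay expired and an ACK-only packet goes out."""
        if self.ack_deadline is not None and now_ms >= self.ack_deadline:
            self._reset()
            return True
        return False
```

In this model a lone small segment is acknowledged either when return data is sent or when the timer fires, which is the wait that an interactive sender using Nagle ends up paying for.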

In any case, the usual solution is to disable Nagle rather than mess
with delayed acks.  What we need is a new Nagle that understands the
new reality for interactive connections... something that doesn't break
performance in the 'server in the middle' data relaying case.
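
For reference, disabling Nagle is a per-socket option. A minimal sketch using Python's standard socket module follows; the helper name make_nodelay_socket is made up for illustration, and the same option is set from C with setsockopt(2) using IPPROTO_TCP and TCP_NODELAY.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero value for TCP_NODELAY turns Nagle off for this socket only.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Reading the option back returns a non-zero value when it is set.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

The option has to be set by the application itself, which is why it does not help with programs whose source or configuration cannot be changed.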

-Matt



Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread David Malone
On Mon, Jun 23, 2008 at 05:25:49PM +1000, Jerahmy Pocott wrote:
> So does anyone know IF this can be tuned and if so by what?

You can tune it with net.inet.tcp.delacktime - it should be in ms.

David.


Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread Jerahmy Pocott


On 23/06/2008, at 6:27 PM, Matthew Dillon wrote:

> Can it break down and cause excessive lag?  Yes, it can.  Interactive
> games almost universally have to disable Nagle because the lag is
> actually due to the data relay from client 1 - server then relaying
> the interactive event to client 2.  Without an immediate interactive
> response to client 1 the ack gets delayed and the next event from
> client 1 hits Nagle and stops dead in the water until the first event
> reaches client 2 and client 2 reacts to it (then client 2 - server -
> (abort delayed ack and send) - client 1 (client 1's nagle now allows
> the second event to be transmitted).  That isn't a deadlock, just
> really poor interactive performance in that particular situation.


Yeah, that's what I'm talking about.

True, it's not really a dead-lock, but it's terribly slow! The
interaction can cause a 200ms delay on a LAN, as can be seen with
Samba if you disable tcp_nodelay.



> In any case, the usual solution is to disable Nagle rather than mess
> with delayed acks.  What we need is a new Nagle that understands the
> new reality for interactive connections... something that doesn't
> break performance in the 'server in the middle' data relaying case.



Exactly, there is nothing really wrong with delayed acks. But with
sysctl I CAN disable and mess with the delayed acks, but I can't seem
to do anything to Nagle.

That's why I was thinking if I could change the Nagle time-out to 0ms it
would effectively disable it..

Cheers.
J.


Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread Jerahmy Pocott


On 23/06/2008, at 7:00 PM, David Malone wrote:


> On Mon, Jun 23, 2008 at 05:25:49PM +1000, Jerahmy Pocott wrote:
> > So does anyone know IF this can be tuned and if so by what?
>
> You can tune it with net.inet.tcp.delacktime - it should be in ms.


Yeah I saw that one. But that only changes the delayed ack...

The default value of 100ms seems fairly reasonable unless you're
talking about a LAN..

I guess what I really want to do is disable Nagle in the TCP stack, but
since you do that with a setsockopt() call on a per-socket basis, I'm
guessing there isn't any system-wide tunable for it.

Thanks,
Jerahmy.


Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread Stefan Eßer

Matthew Dillon wrote:

> In any case, the usual solution is to disable Nagle rather than mess
> with delayed acks.  What we need is a new Nagle that understands the
> new reality for interactive connections... something that doesn't break
> performance in the 'server in the middle' data relaying case.


One possibility I see is a statistic about DelACKs per TCP connection,
counting those that were rightfully delayed (with hindsight). I.e.,
if an ACK is delayed, but there was no chance to piggy-back it or to
combine it with another ACK, it could have been sent without delay.
Only those delayed ACKs that reduce load are good, all others cause
additional state to be maintained and may increase latencies for no
good reason.

Therefore, I thought about starting with Nagle enabled, but giving up
on delaying ACKs when doing so is found to be ineffective.

The only problem with this approach is that once TCP_NODELAY is
implicitly set due to measured behavior of the communication, a
situation that would benefit from delayed ACKs can no longer be
detected. (Well, you could measure the delay between an ACK and
the next data sent to the same destination, and disable TCP_NODELAY
if ACKs could have been piggy-backed on data packets without too
much delay. Maybe we could really have TCP auto-tune with respect
to use of delayed ACKs ...)

I had suggested this years back, when the issue was discussed, but
the consensus was that you should just set TCP_NODELAY. But automatic
adjustment could also (implicitly) take RTT and window size into
consideration. And to me, automatic setting of TCP_NODELAY seems
more useful than automatic clearing (after delayed ACKs had been
found to be of no use for a window of say 8 or 16 ACKs).

The implementation would be quite simple: Whenever a delayed ACK
is sent, check whether it is sent on its own (bad) or whether it
could be piggy-backed (good). If, say, 7 of 8 delayed ACKs had to
be sent as ACK-only packets, anyway, set TCP_NODELAY and do not
bother to keep on deciding whether delayed ACKs had become useful
in a different phase of the communication. If you want to be able
to automatically disable TCP_NODELAY, then just set a time-stamp
whenever an ACK is sent and when the next data is sent through
this same socket, check whether delaying the ACK would have allowed
it to be sent with that data packet (i.e. the delay was less than the
maximum hold time of the delayed ACK). If it would have been beneficial
to delay ACKs (say 3 out of a window of 4), then clear TCP_NODELAY.
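
A rough sketch of the first half of this bookkeeping (giving up once 7 of the last 8 delayed ACKs went out as ACK-only packets), assuming per-connection state kept in a small sliding window. The class name, the method name, and the delaying_enabled flag are hypothetical; the text above expresses the outcome as setting TCP_NODELAY, while the sketch only records a boolean.

```python
from collections import deque


class DelAckUsefulness:
    """Tracks whether recent delayed ACKs were piggy-backed (hypothetical)."""

    def __init__(self, window=8, give_up_threshold=7):
        self.history = deque(maxlen=window)   # True = piggy-backed on data
        self.give_up_threshold = give_up_threshold
        self.delaying_enabled = True

    def record_delayed_ack(self, piggy_backed):
        """Call once per delayed ACK actually sent on this connection."""
        if not self.delaying_enabled:
            return  # no re-enabling in this sketch
        self.history.append(piggy_backed)
        window_full = len(self.history) == self.history.maxlen
        wasted = sum(1 for useful in self.history if not useful)
        if window_full and wasted >= self.give_up_threshold:
            # Delaying bought nothing for most of the window: stop doing it.
            self.delaying_enabled = False
```

The automatic re-enabling step (time-stamping each ACK and comparing against the next outgoing data) would need a second counter along the same lines and is left out here.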

I have no idea whether SMP locking would be problematic, but I
guess the checks and counter updates could be put in sections
that are appropriately locked, anyway.

Regards, STefan


Re: Sysctl knob(s) to set TCP 'nagle' time-out?

2008-06-23 Thread Matthew Dillon

:One possibility I see is a statistic about DelACKs per TCP connection,
:counting those that were rightfully delayed (with hindsight). I.e.,
:if an ACK is delayed, but there was no chance to piggy-back it or to
:combine it with another ACK, it could have been sent without delay.
:Only those delayed ACKs that reduce load are good, all others cause
:additional state to be maintained and may increase latencies for no
:good reason.
:
:...
:consideration. And to me, automatic setting of TCP_NODELAY seems
:more useful than automatic clearing (after delayed ACKs had been
:found to be of no use for a window of say 8 or 16 ACKs).
:
:The implementation would be quite simple: Whenever a delayed ACK
:is sent, check whether it is sent on its own (bad) or whether it
:could be piggy-backed (good). If, say, 7 of 8 delayed ACKs had to
:be sent as ACK-only packets, anyway, set TCP_NODELAY and do not
:bother to keep on deciding whether delayed ACKs had become useful
:in a different phase of the communication. If you want to be able
:to automatically disable TCP_NODELAY, then just set a time-stamp
:...
:Regards, STefan

That's an interesting approach.  I think it would catch some
of the cases, but not enough of them.  If the round-trip in
the server-relaying case is less than the delayed-ack, the acks
will still wind up piggy-backed on return traffic but the latency
will also still remain horrible.

It should be noted that Nagle can cause high latencies even when
delayed acks are turned off.  Nagle's delay is not timed... in its
simplest description it prevents packets from being transmitted
for new data coming from userland if the data already in the
sockbuf (and presumably already transmitted) has not yet been
acknowledged.
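
That simplest description can be sketched as a toy sender. This is a model under assumptions, not the kernel's tcp_output logic: the class name is made up, windows and retransmission are ignored, an ACK is taken to cover everything outstanding, and the MSS of 1460 bytes is just a typical Ethernet value.

```python
class NagleSender:
    """Toy sender applying Nagle's rule to userland writes (hypothetical)."""

    def __init__(self, mss=1460):
        self.mss = mss
        self.buffered = 0          # bytes written but not yet transmitted
        self.unacked = 0           # bytes transmitted but not yet ACKed
        self.sent_segments = []    # sizes of segments put on the wire

    def _send(self, nbytes):
        self.buffered -= nbytes
        self.unacked += nbytes
        self.sent_segments.append(nbytes)

    def _flush(self):
        # Full-sized segments are never held back.
        while self.buffered >= self.mss:
            self._send(self.mss)
        # A short segment goes out only when nothing is outstanding.
        if self.buffered > 0 and self.unacked == 0:
            self._send(self.buffered)

    def write(self, nbytes):
        """Userland writes nbytes into the socket buffer."""
        self.buffered += nbytes
        self._flush()

    def on_ack_all(self):
        """The peer acknowledged everything outstanding."""
        self.unacked = 0
        self._flush()
```

Three small writes in a row produce one small segment immediately, and the other two are held and coalesced until the ACK arrives; no timer is involved at any point.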

For interactive traffic this means that Nagle is putting the screws
on the packet stream even if the acks aren't delayed, simply from the
ack latency.  With delayed acks turned off the latency is lower, but
not 0, so interactive traffic is still being held up by Nagle.  The
effect is noticeable even on a LAN.  Jerahmy brought up Samba... that
is an excellent example.  NFS-over-TCP would be another good example.
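
To put rough numbers on this, here is a small timeline model. It assumes every event is a small write (well below a full segment), that the ACK for a segment arrives a fixed ack_latency_ms after it was sent (round trip plus any delayed-ack hold time), and that the function name and figures are purely illustrative.

```python
def nagle_send_times(event_times_ms, ack_latency_ms):
    """Return the time each small write actually leaves the host under Nagle."""
    send_times = []
    for t in sorted(event_times_ms):
        if not send_times:
            # Nothing outstanding yet: the first write goes out at once.
            send_at = t
        elif t <= send_times[-1]:
            # A held segment is already waiting: join it.
            send_at = send_times[-1]
        else:
            # Wait for the ACK of the previous segment if it is still pending.
            send_at = max(t, send_times[-1] + ack_latency_ms)
        send_times.append(send_at)
    return send_times


# Events every 10 ms with a 50 ms ack latency (no delayed acks).
without_delack = nagle_send_times([0, 10, 20, 30], 50)
# Same events when the ack is additionally held for 100 ms.
with_delack = nagle_send_times([0, 10, 20, 30], 150)
```

In the first case the second event waits 40 ms before leaving the host; in the second it waits 140 ms. Turning delayed acks off shrinks the stall but does not remove it.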

Any protocol which multiplexes multiple commands from different
sources over the same connection gets really messed up (slowed down)
by Nagle.

On the flip side, Nagle can't just be turned off by default because
it would cause streaming connections from user programs which do tiny
writes to generate a lot of unnecessarily tiny packets.  This can become
apparent when using SSH over a slow link.  Numerous programs run from
a shell generate fairly inefficient packets which could have easily
been batched when operating over SSH.  The result can be sludgy
performance for output which ought to be batched up by TCP but isn't
because SSH turns off Nagle unconditionally.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]