At $WORK, we're working on adding support for high-precision RTT
calculations in TCP.  The goal is to reduce the retransmission timeout
significantly to help mitigate the impact of TCP incast.  This means that
the retransmit callout for TCP sockets gets scheduled significantly more
often with a shorter timeout period, but in the normal case it is expected
to be canceled or rescheduled before it times out.

What I have noticed is that when the retransmit callout is canceled or
rescheduled, the callout subsystem will not reschedule its currently
pending interrupt.  The result is that my system takes a significant number
of "spurious" timer interrupts where there are no callouts to service,
which is having a significant performance impact.
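
To make the effect concrete, here is a toy model in C.  It is only a
sketch and not the kernel code: struct toy_timer, toy_schedule(),
toy_cancel(), toy_advance() and toy_run() are all made-up names, and the
model assumes a single callout and a single hardware timer.  Each round
schedules a retransmit callout 5 ticks out and cancels it 2 ticks later, as
if an ACK had arrived.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not FreeBSD code): one callout, one timer. */
struct toy_timer {
    bool callout_pending;   /* is the callout waiting to run? */
    long callout_deadline;  /* tick at which the callout should run */
    bool irq_armed;         /* is a hardware interrupt programmed? */
    long irq_deadline;      /* tick at which that interrupt fires */
    long spurious;          /* interrupts that found nothing to run */
    long serviced;          /* interrupts that ran the callout */
};

/* Schedule the callout and arm the interrupt if that is needed. */
void toy_schedule(struct toy_timer *t, long deadline)
{
    assert(deadline >= 0);
    t->callout_pending = true;
    t->callout_deadline = deadline;
    if (!t->irq_armed || deadline < t->irq_deadline) {
        t->irq_armed = true;
        t->irq_deadline = deadline;
    }
}

/* Cancel the callout; optionally cancel the pending interrupt as well. */
void toy_cancel(struct toy_timer *t, bool reprogram)
{
    t->callout_pending = false;
    if (reprogram)
        t->irq_armed = false;
}

/* Advance time to 'now' and deliver the interrupt if it is due. */
void toy_advance(struct toy_timer *t, long now)
{
    if (!t->irq_armed || now < t->irq_deadline)
        return;
    t->irq_armed = false;
    if (t->callout_pending && t->callout_deadline <= now) {
        t->callout_pending = false;
        t->serviced++;
    } else {
        t->spurious++;          /* woke up with nothing to service */
    }
    if (t->callout_pending) {   /* re-arm for a callout still pending */
        t->irq_armed = true;
        t->irq_deadline = t->callout_deadline;
    }
}

/* Run 'rounds' schedule/cancel cycles; return the spurious count. */
long toy_run(long rounds, bool reprogram)
{
    struct toy_timer t = {0};

    for (long i = 0; i < rounds; i++) {
        long now = i * 10;
        toy_schedule(&t, now + 5);   /* arm the retransmit timer */
        toy_cancel(&t, reprogram);   /* ACK arrives before it expires */
        toy_advance(&t, now + 10);   /* time passes */
    }
    return t.spurious;
}
```

In this model every canceled callout costs one interrupt when the pending
interrupt is left alone, and none when it is canceled along with the
callout.  The real system has many callouts per CPU, so the ratio will
differ, but the mechanism is the same.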

Unfortunately, neither the callout subsystem nor the eventtimers subsystem
really seems to be designed for canceling interrupts.  It's not easy to find
the "next" event in the callout wheel and the current code doesn't even try
when handling an interrupt; the next interrupt is scheduled at a seemingly
arbitrary point in the future.
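
The asymmetry I mean can be shown with a minimal hashed wheel.  Again this
is a sketch under my own assumptions, not the layout used in the kernel:
struct toy_wheel, struct toy_callout, WHEEL_SIZE and the wheel_*()
functions are hypothetical names.  Insertion and removal only touch
neighboring list entries, but finding the earliest deadline has to walk
every bucket.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WHEEL_SIZE 8    /* number of buckets; arbitrary for this sketch */

/* Hypothetical callout entry, linked into one bucket of the wheel. */
struct toy_callout {
    long deadline;
    struct toy_callout *prev, *next;
};

struct toy_wheel {
    struct toy_callout *bucket[WHEEL_SIZE];
};

/* O(1): hash the deadline to a bucket and push onto its list head. */
void wheel_insert(struct toy_wheel *w, struct toy_callout *c, long deadline)
{
    assert(deadline >= 0);
    size_t i = (size_t)(deadline % WHEEL_SIZE);

    c->deadline = deadline;
    c->prev = NULL;
    c->next = w->bucket[i];
    if (c->next != NULL)
        c->next->prev = c;
    w->bucket[i] = c;
}

/* O(1): unlink the entry from its doubly linked bucket list. */
void wheel_remove(struct toy_wheel *w, struct toy_callout *c)
{
    if (c->prev != NULL)
        c->prev->next = c->next;
    else
        w->bucket[c->deadline % WHEEL_SIZE] = c->next;
    if (c->next != NULL)
        c->next->prev = c->prev;
    c->prev = c->next = NULL;
}

/*
 * Not O(1): the earliest deadline can be in any bucket, so every bucket
 * and every entry is visited.  Returns LONG_MAX if the wheel is empty.
 */
long wheel_next_deadline(const struct toy_wheel *w)
{
    long best = LONG_MAX;

    for (size_t i = 0; i < WHEEL_SIZE; i++)
        for (const struct toy_callout *c = w->bucket[i]; c != NULL; c = c->next)
            if (c->deadline < best)
                best = c->deadline;
    return best;
}
```

Reprogramming the interrupt after a cancel would need something like
wheel_next_deadline() on every removal, which is exactly the cost the
wheel was chosen to avoid.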

I know that when the callout system was reworked the callout wheel data
structure was maintained to keep insertion and deletion O(1).  However, I
question whether that was the right decision given the fact that if
callouts are frequently deleted, as in my case, we incur the significant
overhead of a spurious timer interrupt.  Does anybody know if actual
performance measurements were taken to justify this decision?