Rong,
Some comments inline. And one remaining question at the end...
On 28/03/17 02:04, Rong Pan (ropan) wrote:
Bob,
Sorry for the late reply. I have been traveling. Please see inline...
Rong
On 3/23/17, 5:01 PM, "aqm on behalf of Bob Briscoe" <[email protected]
on behalf of [email protected]> wrote:
Rong, Preethi, Greg, Fred, and others involved in PIE,
You may recall that when we wrote PI2 we didn't include any of PIE's
heuristics. Mostly because PI2 solved the issues they addressed
intrinsically. But we left some until we had checked their benefit,
which is what I'm doing now...
My first question is about this heuristic in PIE:
    //Safeguard PIE to be work conserving
    if ( (PIE->qdelay_old_ < QDELAY_REF/2 && PIE->drop_prob_ < 0.2)
         || (queue_.byte_length() <= 2 * MEAN_PKTSIZE) ) {
        return ENQUE;
    }
If it tests true, this block doesn't stop drop_prob_ from evolving, but
it prevents drop_prob_ from leading to any random packet drop.
I can understand why you want to disable packet drop when the queue is
no more than 2 packets.
My question is about the first half of the logical OR. The drop_prob_ <
20% test will be true under normal non-overloaded conditions. So I have
just realized that the qdelay_old_ < QDELAY_REF/2 test will turn off
random drop very often. I would expect this to radically impact the
behaviour of PIE. It seems to be overriding the PI controller as if you
are thinking "actually we don't really trust the PI controller to leave
it to do its thing, so we've overridden it a lot of the time." For
instance, whenever a single long-running TCP flow with RTT about the
same as the target delay is saw-toothing, this test will disable random
drop completely during the lower half of every saw-tooth in the queue.
Maybe that's OK, but...
Without this test, the PI controller should reduce drop probability as
the queue sawtooths down anyway. If another flow causes the queue to
rise rapidly while it is under half the target, the PI controller is
designed to detect such an increase and translate it into drop. But this
heuristic suppresses any drop until the queue has exceeded half the
target.
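To make the effect of the first half of the OR concrete, here is a minimal
Python sketch of the quoted condition, evaluated over an idealized sawtooth.
The values of QDELAY_REF_US and MEAN_PKTSIZE, the linear ramp from zero up
to the target, and the helper names are all assumptions for illustration;
none of them come from the PIE code or the RFC.

```python
# Assumed, illustrative constants (not taken from the thread or the RFC).
QDELAY_REF_US = 15000   # target queuing delay, in microseconds
MEAN_PKTSIZE = 1500     # mean packet size, in bytes


def safeguard_suppresses_drop(qdelay_old_us, drop_prob, backlog_bytes):
    """Mirror of the quoted test: True means the packet is enqueued
    without being exposed to random drop."""
    low_delay = qdelay_old_us < QDELAY_REF_US / 2 and drop_prob < 0.2
    tiny_queue = backlog_bytes <= 2 * MEAN_PKTSIZE
    return low_delay or tiny_queue


def suppressed_fraction(delays_us, drop_prob, backlog_bytes):
    """Fraction of delay samples for which random drop is suppressed."""
    hits = sum(
        safeguard_suppresses_drop(d, drop_prob, backlog_bytes)
        for d in delays_us
    )
    return hits / len(delays_us)


# Idealized sawtooth: queuing delay ramps linearly from 0 towards the target.
sawtooth = [i * QDELAY_REF_US // 100 for i in range(100)]

if __name__ == "__main__":
    # Backlog is kept well above two packets so only the first half
    # of the OR is exercised.
    print(suppressed_fraction(sawtooth, drop_prob=0.05, backlog_bytes=30000))
    print(suppressed_fraction(sawtooth, drop_prob=0.30, backlog_bytes=30000))
```

Under these assumptions the check suppresses random drop for the lower half
of the ramp while drop_prob_ is below 20%, and for none of it once drop_prob_
is above 20% and the backlog exceeds two packets.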
So my questions are:
Q1. What were the reasons for introducing such a frequent suppression of
the PI algorithm (the RFC just says what this code does, not why)?
Being work conserving and avoiding any unnecessary drops are the main
reasons behind it.
Cisco previously had a not-so-successful algorithm that was not work
conserving, so we are extra cautious about being work conserving...
[BB] There is only a work-conservation problem if drop_early() is
applied at enqueue. That's because, at enqueue, you don't yet know
whether another packet will arrive to take the place of the packet you
are deciding to drop.
We're shifting drop_early() to dequeue {Note 1}. So to be
work-conserving we can rely solely on the test on the other side of the
logical OR above that suppresses any drop if "backlog < 2 MTU". That's
the only heuristic that we are keeping so far, although I'm undecided
about the "< QDELAY_REF/2" test, which (as you say) might be beneficial
for other reasons than work conservation. But we have no tests that show
that yet.
{Note 1}: We're using sojourn time to measure the queue, so if
we were still dropping at enqueue, each congestion signal would be
delayed twice by the queue.
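As a rough illustration of the dequeue-side arrangement described above,
the following Python sketch keeps only the backlog test as the
work-conservation safeguard. The MTU value, the function name
drop_at_dequeue and the injectable rng parameter are hypothetical, chosen
only for this example. It uses the strict "<" from the prose above, whereas
the quoted PIE code uses "<=" against MEAN_PKTSIZE.

```python
import random

MTU = 1500  # bytes; assumed value for illustration


def drop_at_dequeue(drop_prob, backlog_bytes, rng=random.random):
    """Decide at dequeue whether to drop the packet at the head of the queue.

    The only safeguard kept is the backlog test: with fewer than two MTUs
    queued, never drop, so the link is not left idle by an AQM drop.
    """
    if backlog_bytes < 2 * MTU:
        return False
    # Otherwise apply the random drop with the controller's probability.
    return rng() < drop_prob
```

Because the decision is taken at dequeue, the backlog seen by the test is
the backlog actually available to keep the link busy at that moment.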
Q2. Why use qdelay_old_ in the test? This seems to drive suppression of
drop using stale state.
qdelay_old_ is the latency state currently stored. This is for
implementation considerations, as we don't want to calculate qdelay_
on a per-packet basis.
[BB] Understood.
We're using sojourn time per packet for the shifted FIFO scheduler
anyway, so no extra cost.
Q3. Although it looks as if this heuristic will significantly
alter PIE's behaviour, in tests under a very wide range of traffic
conditions, link rates, mixed RTTs, traffic models etc., we have found
that removing the heuristics makes no measurable difference to PIE's
performance. So if you added this heuristic for a specific scenario,
please describe it, so we can test for it.
Again, being work conserving and avoiding drops are our goals. I don't
think it would hurt to add those safeguards.
[BB]: So that begs just one remaining question:
Q: Do you have tests showing any benefit, specifically comparing with
and without this "< QDELAY_REF/2" heuristic?
Given the point of a (non-ECN) AQM is to introduce the right level of
random drops, it seems strange to suppress some of them with an
additional arbitrary rule.
Thanks for your replies so far tho. They have helped me realize more
reasons why PIE needs these heuristics, but PI2 might not.
Bob
Cheers
Bob
--
________________________________________________________________
Bob Briscoe http://bobbriscoe.net/
_______________________________________________
aqm mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/aqm