On 13/04/18 16:03, David Miller wrote:
> Whilst you may not be able to program the filter into the hardware
> synchronously, you should be able to allocate the ID and get all of
> the software state setup.
That's what we were doing before commit 3af0f34290f6 ("sfc: replace
asynchronous filter operations"), and (as mentioned in that commit)
that leads to (or at least the scheme we used had) race conditions
which I could not see a way to fix. If the hardware resets (and
thus forgets all its filters) after we've done the software state
setup but before we reach the point of finalising the software state
after the hardware operation, we don't know what operations we need
to do to re-apply the software state to the hardware, because we
don't know whether the reset happened before or after the hardware
Actually, it's not the insertion itself that this can happen to - if
the reset happens first then the insert will fail because other
hardware state we reference isn't set up yet. However, inserting
one filter can necessitate removing some others (lower-priority
multicast filters for the same flow); the code before 3af0f34290f6
was actually capable of getting so confused that it could double-
free a pointer.
This all gets sufficiently complicated that even if I can find a
locking scheme that in theory gets it right, there's pretty much a
100% chance that the actual implementation will contain bugs,
probably very subtle ones that can't reliably be reproduced etc.
All my instincts say to get away from that if at all possible.
> People really aren't going to be happy if their performance goes into
> the tank because their filter update rate, for whatever reason, hits
> this magic backlog limit.
Well, the alternative, even if the software state setup part _could_
be made synchronous, is to allow a potentially unbounded queue for
the hardware update part (I think there are even still cases in
which the exponential growth pathology is possible), causing the
filter insertions to be delayed an arbitrarily long time. Either
the flow is still going by that time (in which case the backlog
limit approach will get a new ndo_rx_flow_steer request and insert
the filter too) or it isn't, in which case getting round to it
eventually is no better than dropping it immediately. In fact it's
worse because now you waste time inserting a useless filter which
delays new requests even more.
Besides, I'm fairly confident that the only cases in which you'll
even come close to hitting the limit are ones where ARFS wouldn't
do you much good anyway, such as:
* Misconfigured interrupt affinities where ARFS is entirely pointless
* Many short-lived flows (which greatly diminish the utility of ARFS)
So for multiple reasons, hitting the limit won't actually make
performance worse, although it will often be a sign that performance
will be bad for other reasons.