Tue, Nov 01, 2016 at 04:36:50PM CET, da...@davemloft.net wrote:
>From: Roopa Prabhu <ro...@cumulusnetworks.com>
>Date: Tue, 01 Nov 2016 08:14:14 -0700
>
>> On 11/1/16, 7:19 AM, Eric Dumazet wrote:
>>> On Tue, 2016-11-01 at 00:57 +0200, Ido Schimmel wrote:
>>>> On Mon, Oct 31, 2016 at 02:24:06PM -0700, Eric Dumazet wrote:
>>>>> How well will this work for large FIB tables ?
>>>>>
>>>>> Holding rtnl while sending thousands of skb will prevent consumers to
>>>>> make progress ?
>>>> Can you please clarify what do you mean by "while sending thousands of
>>>> skb"? This patch doesn't generate notifications to user space, but
>>>> instead invokes notification routines inside the kernel. I probably
>>>> misunderstood you.
>>>>
>>>> Are you suggesting this be done using RCU instead? Well, there are a
>>>> couple of reasons why I took RTNL here:
>>>>
>>> No, I do not believe RCU is wanted here, in control path where we might
>>> sleep anyway.
>>>
>>>> 1) The FIB notification chain is blocking, so listeners are expected to
>>>> be able to sleep. This isn't possible if we use RCU. Note that this
>>>> chain is mainly useful for drivers that reflect the FIB table into a
>>>> capable device and hardware operations usually involve sleeping.
>>>>
>>>> 2) The insertion of a single route is done with RTNL held. I didn't want
>>>> to differentiate between both cases. This property is really useful for
>>>> listeners, as they don't need to worry about locking in writer-side.
>>>> Access to data structs is serialized by RTNL.
>>> My concern was that for large iterations, you might hold RTNL and/or
>>> current cpu for hundred of ms or even seconds...
>>>
>> I have the same concern as Eric here.
>> 
>> I understand why you need it, but can the driver request for an initial dump 
>> and that
>> dump be made more efficient somehow ie not hold rtnl for the whole dump ?.
>> instead of making the fib notifier registration code doing it.
>> 
>> these routing table sizes can be huge and an analogy for this in user-space:
>> We do request a netlink dump of  routing tables at initialization (on driver 
>> starts or resets)...
>> but, existing netlink routing table dumps for that scale don't hold rtnl for 
>> the whole dump.
>> The dump is split into multiple responses to the user and hence it does not 
>> starve other rtnl users.
>> 
>> In-fact I don't think netlink routing table dumps from user-space hold 
>> rtnl_lock for the whole dump.
>> IIRC this was done to allow route add/dels to be allowed in parallel for 
>> performance reasons.
>> (I will need to double check to confirm this).
>
>I've always had some reservations about using notifiers for getting
>the FIB entries down to the offloaded device.

Yeah, me too. But there is no really other way to do it. I thought about
it quite long. But maybe I missed something.


>
>And this problem is just another symptom that it is the wrong
>mechanism for propagating this information.
>
>As suggested by Roopa here, perhaps we're looking at the problem from
>the wrong direction.  We tend to design NDO ops and notifiers, to
>"push" things to the driver, but maybe something more like a push+pull
>model is better.

How do you imagine this mode should looks like? Could you draw me some
example?

Thanks!

Reply via email to