On 9/6/24 03:25, Jakub Kicinski wrote:
For the driver -- let me flip the question around -- what do you expect
the locking scheme to be in case of channel count change? Alternatively
we could just expect the driver to take netdev->lock around the
appropriate section of code and we'd do:

void net_shaper_set_real_num_tx_queues(struct net_device *dev, ...)
{
        ...
        if (!READ_ONCE(dev->net_shaper_hierarchy))
                return;

        lockdep_assert_held(&dev->lock);
        ...
}

In the IAVF case that will be problematic: AFAICS the channel
reconfiguration is done by 2 consecutive async tasks. The first one -
iavf_reset_task - changes the actual number of channels,
freeing/allocating the driver resources, and the 2nd one -
iavf_finish_config - notifies the stack by issuing
netif_set_real_num_tx_queues(). iavf_reset_task can't easily wait for
iavf_finish_config due to locking order.

I had a look at iavf, and there is no relevant locking around the queue
count check at all, so that doesn't help.

Yep, that is racy.

Acquiring dev->lock around set_channels() will not be enough: some
drivers change the channel count on their own, e.g. when enabling XDP.

Indeed, trying to lock before calling the driver would be both a huge
job and destined to fail.

I think/fear we need to replace the dev->lock with the rtnl lock to
solve the race for good.

Maybe :( I think we need *an* answer for:
  - how we expect the driver to protect itself (assuming that the racy
    check in iavf_verify_handle() actually serves some purpose, which
    may not be true);
  - how we ensure consistency of core state (no shapers for queues which
    don't exist, assuming we agree having shapers for queues which
    don't exist is counterproductive).

I agree we must delete shapers on removed/deleted queues. The
driver/firmware could reuse the same H/W resources for a different VF,
and such queues must start in the new VF with a default (no shaping)
config.

Reverting to rtnl_lock for everything would be sad; could the scheme
of expecting the driver to take netdev->lock work?
It's the model we effectively settled on in devlink:
core->driver callbacks are always locked by the core;
for driver->core calls the driver should explicitly take the lock
(some wrappers for lock+op+unlock are provided).
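The devlink-style split described above can be sketched in userspace
with a fake lock standing in for netdev->lock. This is only an
illustration of the locking discipline, not real kernel code; all
names here (fake_dev, core_call_driver_op,
drv_set_real_num_tx_queues_locked) are hypothetical:

```c
#include <assert.h>

/* Toy stand-in for a netdev; @locked mimics dev->lock so the
 * lock/unlock pairing can be checked with plain asserts. */
struct fake_dev {
	int locked;
	int real_num_tx_queues;
};

static void fake_lock(struct fake_dev *dev)
{
	assert(!dev->locked);	/* would deadlock in the kernel */
	dev->locked = 1;
}

static void fake_unlock(struct fake_dev *dev)
{
	assert(dev->locked);
	dev->locked = 0;
}

/* Core -> driver path: the core takes the lock before invoking the
 * driver op, so the op always runs with the lock held. */
static void core_call_driver_op(struct fake_dev *dev,
				void (*op)(struct fake_dev *))
{
	fake_lock(dev);
	op(dev);
	fake_unlock(dev);
}

/* Driver -> core path: an explicit lock+op+unlock wrapper, in the
 * spirit of the helpers devlink provides. */
static void drv_set_real_num_tx_queues_locked(struct fake_dev *dev,
					      int n)
{
	fake_lock(dev);
	dev->real_num_tx_queues = n;
	fake_unlock(dev);
}
```

The point of the wrapper is that the driver never touches the
core-owned state (real_num_tx_queues here) without holding the lock,
while core-originated callbacks can assume it is already held.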

I think/guess/hope the following could work:

- the driver wraps the h/w resource reconfiguration and
netif_set_real_num_tx_queues() with dev->lock. In the iavf case, that
means 2 separate critical sections: one in iavf_reset_task() and one
in iavf_finish_config().

- the core, under dev->lock, checks the queue id against
real_num_tx_queues and calls the shaper ops

- the iavf shaper callbacks would still need to check the queue id
against the number of currently allocated h/w resources, as the shaper
ops could run in between the 2 critical sections mentioned above. The
iavf driver could still act consistently with the core:

  - if real_num_tx_queues < qid < current_allocated_hw_resources
    set the shaper,
  - if current_allocated_hw_resources < qid < real_num_tx_queues do
    nothing and return success

In both the above scenarios, real_num_tx_queues will soon be set to
current_allocated_hw_resources by the upcoming iavf_finish_config()
run, the core will update the hierarchy accordingly, and the state
will be consistent.
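The per-queue decision sketched above boils down to comparing the qid
against the two limits. A minimal userspace model of that decision
(hypothetical names, strict bounds assumed for 0-based qids; the real
iavf callbacks would of course look different):

```c
/* Possible outcomes of a shaper op on queue @qid, mirroring the
 * scenarios above. Illustrative only. */
enum qid_action {
	QID_APPLY,	/* h/w resource exists: program the shaper */
	QID_NOOP,	/* h/w resource gone, but the core still sees
			 * the queue: return success and let the
			 * upcoming finish_config shrink
			 * real_num_tx_queues */
	QID_REJECT,	/* beyond both limits: invalid queue */
};

static enum qid_action shaper_qid_action(int qid,
					 int real_num_tx_queues,
					 int allocated_hw_resources)
{
	if (qid < allocated_hw_resources)
		return QID_APPLY;	/* covers the normal case and
					 * real_num < qid < allocated */
	if (qid < real_num_tx_queues)
		return QID_NOOP;	/* allocated < qid < real_num */
	return QID_REJECT;
}
```

Either way the driver's answer matches what the core will see once
real_num_tx_queues catches up with the allocated h/w resources.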

I think the actual code will make this clearer, let me try to share it
ASAP (or please block me soon ;)

Thanks,

Paolo
