On Tue, Jan 16, 2024 at 09:11:32PM +0800, Heng Qi wrote:
> Currently, each time the driver attempts to update the coalescing
> parameters for a vq, it needs to kick the device.
> The following path is observed:
>   1. Driver kicks the device;
>   2. After the device receives the kick, CPU scheduling occurs and
>      multiple buffers are DMAed multiple times;
>   3. The device completes processing and replies with a response.
> 
> When large-queue devices issue multiple requests and kick the device
> frequently, this often interrupts the work of the device-side CPU.
> In addition, each vq request is processed separately, causing more
> delays for the CPU to wait for the DMA request to complete.
> 
> These interruptions and overhead will strain the CPU responsible for
> controlling the path of the DPU, especially in multi-device and
> large-queue scenarios.
> 
> To solve the above problems, we internally tried batching requests,
> which merges requests from multiple queues and sends them at once.
> We conservatively tested 8 queue commands and sent them together.
> The DPU processing efficiency can be improved by 8 times, which
> greatly eases the DPU's support for multi-device and multi-queue DIM.
> 
> Suggested-by: Xiaoming Zhao <zxm377...@alibaba-inc.com>
> Signed-off-by: Heng Qi <hen...@linux.alibaba.com>

...

> @@ -3546,16 +3552,32 @@ static void virtnet_rx_dim_work(struct work_struct *work)
>               update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
>               if (update_moder.usec != rq->intr_coal.max_usecs ||
>                   update_moder.pkts != rq->intr_coal.max_packets) {
> -                     err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum,
> -                                                            update_moder.usec,
> -                                                            update_moder.pkts);
> -                     if (err)
> -                             pr_debug("%s: Failed to send dim parameters on rxq%d\n",
> -                                      dev->name, qnum);
> -                     dim->state = DIM_START_MEASURE;
> +                     coal->coal_vqs[j].vqn = cpu_to_le16(rxq2vq(i));
> +                     coal->coal_vqs[j].coal.max_usecs = cpu_to_le32(update_moder.usec);
> +                     coal->coal_vqs[j].coal.max_packets = cpu_to_le32(update_moder.pkts);
> +                     rq->intr_coal.max_usecs = update_moder.usec;
> +                     rq->intr_coal.max_packets = update_moder.pkts;
> +                     j++;
>               }
>       }
>  
> +     if (!j)
> +             goto ret;
> +
> +     coal->num_entries = cpu_to_le32(j);
> +     sg_init_one(&sgs, coal, sizeof(struct virtnet_batch_coal) +
> +                 j * sizeof(struct virtio_net_ctrl_coal_vq));
> +     if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
> +                               VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET,
> +                               &sgs))
> +             dev_warn(&vi->vdev->dev, "Failed to add dim command\n.");
> +
> +     for (i = 0; i < j; i++) {
> +             rq = &vi->rq[(coal->coal_vqs[i].vqn) / 2];

Hi Heng Qi,

The type of .vqn is __le16, but here it is used as an
integer in host byte order. Perhaps this should be (completely untested!):

                rq = &vi->rq[le16_to_cpu(coal->coal_vqs[i].vqn) / 2];
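
To illustrate why the conversion matters, here is a standalone userspace
sketch (not kernel code, and equally untested against the driver). The
type wire_le16 and the helpers host_to_wire16(), wire_to_host16() and
misread_as_big_endian() are hypothetical names of mine, standing in for
__le16, cpu_to_le16(), le16_to_cpu() and an unconverted read on a
big-endian host respectively:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for __le16: two bytes, least significant first. */
typedef struct { uint8_t b[2]; } wire_le16;

/* Analogue of cpu_to_le16(): store a host value in little-endian order. */
static wire_le16 host_to_wire16(uint16_t v)
{
	wire_le16 w = { { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) } };
	return w;
}

/* Analogue of le16_to_cpu(): decode correctly on any host byte order. */
static uint16_t wire_to_host16(wire_le16 w)
{
	return (uint16_t)(w.b[0] | ((uint16_t)w.b[1] << 8));
}

/* What a big-endian host sees if it reads the bytes without converting. */
static uint16_t misread_as_big_endian(wire_le16 w)
{
	return (uint16_t)(((uint16_t)w.b[0] << 8) | w.b[1]);
}

/* rx queue index from a vq number, mirroring the "vqn / 2" in the loop. */
static unsigned int vq_to_rxq(uint16_t vqn)
{
	return vqn / 2u;
}

/* Round trip: converting to wire order and back must preserve the value. */
static void round_trip_check(void)
{
	assert(wire_to_host16(host_to_wire16(6)) == 6);
}
```

With vqn 6, the converted read gives rx queue 3, while the unconverted
big-endian read sees 0x0600 and would index rq[768]. On little-endian
hosts both reads agree, which is why this kind of bug tends to go
unnoticed there.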

> +             rq->dim.state = DIM_START_MEASURE;
> +     }
> +
> +ret:
>       rtnl_unlock();
>  }
>  
