From the device's perspective, during a single read of the descriptor 
list-and that list spans across cache lines.
the retrieved data will show `desc[head].flags` as valid, 
and `desc[i].flags` as valid as well; however, 
the `desc[i].addr` and `len` fields may be invalid.

I suspect this occurs because a single DMA operation is internally 
split into multiple parallel read transactions, 
aligned to cache line boundaries.

I apologize that I currently lack the necessary environment to verify 
whether this modification definitively resolves the issue or 
merely reduces the probability of its occurrence; 
therefore, this patch can be discarded.

yangjiale



At 2026-06-02 14:04:13, "Eugenio Perez Martin" <[email protected]> wrote:
>On Tue, Jun 2, 2026 at 6:34 AM yangjiale <[email protected]> wrote:
>>
>> When a descriptor list spans across cache lines,
>> updating the flag first can lead to a scenario where the device side
>> perceives the flag as valid, yet the corresponding address and length
>> fields remain unupdated—resulting in invalid values.
>> Therefore, the flag field must be updated last.
>>
>> Signed-off-by: yangjiale <[email protected]>
>> ---
>>  drivers/virtio/virtio_ring.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>> index fbca7ce1c6bf..036b4f90d30f 100644
>> --- a/drivers/virtio/virtio_ring.c
>> +++ b/drivers/virtio/virtio_ring.c
>> @@ -1688,6 +1688,10 @@ static inline int virtqueue_add_packed(struct 
>> vring_virtqueue *vq,
>>                                              &addr, &len, premapped, attr))
>>                                 goto unmap_release;
>>
>> +                       desc[i].addr = cpu_to_le64(addr);
>> +                       desc[i].len = cpu_to_le32(len);
>> +                       desc[i].id = cpu_to_le16(id);
>> +
>>                         flags = cpu_to_le16(vq->packed.avail_used_flags |
>>                                     (++c == total_sg ? 0 : 
>> VRING_DESC_F_NEXT) |
>>                                     (n < out_sgs ? 0 : VRING_DESC_F_WRITE));
>> @@ -1696,10 +1700,6 @@ static inline int virtqueue_add_packed(struct 
>> vring_virtqueue *vq,
>>                         else
>>                                 desc[i].flags = flags;
>>
>> -                       desc[i].addr = cpu_to_le64(addr);
>> -                       desc[i].len = cpu_to_le32(len);
>> -                       desc[i].id = cpu_to_le16(id);
>> -
>>                         if (unlikely(vq->use_map_api)) {
>>                                 vq->packed.desc_extra[curr].addr = premapped 
>> ?
>>                                         DMA_MAPPING_ERROR : addr;
>
>These flags are updated before the flags of the head descriptor at the
>end of the function, at "vq->packed.vring.desc[head].flags =
>head_flags", so the device should not see these. Because of that, the
>relative order between the rest of the fields of the same descriptor
>or other descriptors' fields, except for the head descriptor's flags,
>should not matter. There is a write memory barrier just before
>updating the head's flags.
>
>Also, I don't get why the cache line matters here. Can you expand? Am
>I missing something?

Reply via email to