On 19/09/2016 19:15, Michael S. Tsirkin wrote: > On Mon, Sep 19, 2016 at 02:01:57PM +0200, Paolo Bonzini wrote: >> The main issue with VRING_DESC_F_INDIRECT, at least as it is implemented >> in Linux, is the cost of allocating and freeing the indirect descriptor. >> But if you know that the number of descriptor is fixed (e.g. 2 for >> request header + data) you can pre-allocate the descriptors with a 1:1 >> relationship with the entries of the ring buffer. This might be useful >> for the virtio-net tx ring. > > Yes, I've been thinking the same thing. Is there any advantage > to have the indirect addresses pre-programmed as > opposed to stored in the descriptors? > > In theory > if (desc.flags & indirect) > use(*address[i]) > > might be speculated as compared to > if (desc.flags & indirect) > use(*desc.addr) > which can't. > But OTOH this might add to cache pressure as desc is already in cache > anyway.
Yes, I think this would be the worst of both worlds. :) If the driver knows it has 2 descriptors per request, it could allocate a big block in advance and do for example struct { // pre-allocated and pre-initialized // to point to ->hdr struct virtio_desc req_desc; // only flags prefilled struct virtio_desc data_desc; struct req_header hdr; } *indirect_data; i = index++ & (size - 1); indirect_addr = &indirect_data[i]; indirect_data[i]->data_desc.addr = this_data_addr; indirect_data[i]->data_desc.len = this_data_len; // ... fill in indirect_data[i]->hdr ... desc[i].addr = indirect_addr; smp_wmb(); desc[i].index = i | DESC_HW; where there is no need for an array of descriptor addresses, &indirect_data[i]. As long as the indirect_data size is a power of 2, indirect_data needn't even be physically contiguous. But I prefer all these tricks to be hidden within the driver. It seems a good idea in the beginning to rig the device to second-guess what a driver could do, but then it makes things awkward. (This is also why I'd rather get rid of VRING_DESC_F_NEXT). > If all descriptors are known ahead of the time to be indirect we could > pack more of them per cache line by dropping addr (and maybe len?). > This might work well for virtio blk. Nope, virtio-blk doesn't know at all the size of the sg list (for small-I/O workloads it is always 2, but potentially unlimited). For the above ideas I was thinking of the virtio tx header (but only for MTU=1500, as Stephen pointed out). Paolo --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org