Rusty Russell wrote:
> On Mon, 2007-06-18 at 12:09 +0300, Avi Kivity wrote:
>   
>> Rusty Russell wrote:
>>     
>>> On Sun, 2007-06-17 at 17:25 +0300, Avi Kivity wrote:
>>>   
>>>> Rusty Russell wrote:
>>>>     
>>>>> + /* Set up for reply. */
>>>>> + vblk->sg[0].page = virt_to_page(&vbr->in_hdr);
>>>>> + vblk->sg[0].offset = offset_in_page(&vbr->in_hdr);
>>>>> + vblk->sg[0].length = sizeof(vbr->in_hdr);
>>>>> + num = blk_rq_map_sg(q, vbr->req, vblk->sg+1);
>>>>> + vbr->out_hdr.id = vblk->vdev->ops->add_inbuf(vblk->vdev, vblk->sg,
>>>>> +                                              1+num, vbr);
>>>>> + if (IS_ERR_VALUE(vbr->out_hdr.id))
>>>>> +         goto full;
>>>>> +
>>>>> + vblk->sg[0].page = virt_to_page(&vbr->out_hdr);
>>>>> + vblk->sg[0].offset = offset_in_page(&vbr->out_hdr);
>>>>> + vblk->sg[0].length = sizeof(vbr->out_hdr);
>>>>> +
>>>>> + vbr->out_id = vblk->vdev->ops->add_outbuf(vblk->vdev, vblk->sg, 1,
>>>>> +                                           vbr);
>>>>>       
>>>> This strikes me as wasteful.  Why not set up a single descriptor which 
>>>> contains both placement and the data itself?
>>>>     
>>> We could actually do this for write, but not for read (where the length
>>> & sector need to be sent to other end, and the data comes from other
>>> end).
>>>       
>> So you map the first sg entry for output, and the rest for input.
>>
>> Less pretty, but more efficient.
>>     
>
> At the moment there are two kinds of sgs: input and output.  You're
> proposing instead that we have a single kind, and a flag on each segment
> which indicates direction.  That may make block a little more efficient,
> depending on the particular implementation of virtio.
>
> Yet network really wants in and out separate.  I'm not even sure how
> we'd implement a mixed-sg scheme efficiently for that case.  At the
> moment it's very straightforward.
>   

I think the differences are deeper than that.  Network wants two queues, 
disk wants one queue.  The directionality of the DMA transfer is 
orthogonal to that.

For a disk, the initiator is always the guest; you issue a request and 
expect a completion within a few milliseconds.  Read and write requests 
are similar in nature; you want them in a single queue because you want 
the elevator to run on both read and write requests.  When idling, the 
queue is empty.

For a network interface, the guest initiates transmits, but the host 
initiates receives.  Yes, the guest preloads the queue with rx 
descriptors, but it can't expect a response in any given time, hence the 
need for two queues.  When idling, the tx queue is empty, the rx queue 
is full.
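
To make the asymmetry concrete, here is a rough sketch of what each
device would own under this scheme (virtio_queue is the type proposed
below; the other names are my invention, not from any posted patch):

	/* Disk: one queue.  Reads and writes are mixed, so the
	 * elevator sees both. */
	struct virtblk_info {
		struct virtio_queue *vq;
	};

	/* Network: two queues.  The rx queue is kept stocked with
	 * empty buffers; the tx queue drains to empty when idle. */
	struct virtnet_info {
		struct virtio_queue *rx_vq;
		struct virtio_queue *tx_vq;
	};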

So I'd suggest having a struct virtio_queue to represent a queue (so a 
network interface would have two of them), and having the DMA direction 
specified separately:

+       unsigned long (*add_buf)(struct virtio_queue *q,
+                                const struct scatterlist out_sg[], int out_nr,
+                                const struct scatterlist in_sg[], int in_nr,
+                                void *data);


I think this fits better with the Xen model (which has one ring for 
block, two each for network and console, if memory serves).
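
For illustration, the block driver's submission path quoted above would
collapse to a single call.  This is a sketch only, assuming the op hangs
off the queue through an ops table and fails by returning an
error-encoded id the way add_inbuf does today; MAX_SG is a made-up
bound:

	struct scatterlist out_sg[1 + MAX_SG], in_sg[MAX_SG + 1];
	int out_nr = 1, in_nr = 0, num;

	/* The request header always travels guest -> host. */
	out_sg[0].page = virt_to_page(&vbr->out_hdr);
	out_sg[0].offset = offset_in_page(&vbr->out_hdr);
	out_sg[0].length = sizeof(vbr->out_hdr);

	if (rq_data_dir(vbr->req) == WRITE) {
		/* Write: the data goes out alongside the header. */
		num = blk_rq_map_sg(q, vbr->req, out_sg + 1);
		out_nr += num;
	} else {
		/* Read: the data comes back from the host. */
		num = blk_rq_map_sg(q, vbr->req, in_sg);
		in_nr += num;
	}

	/* The completion status always travels host -> guest. */
	in_sg[in_nr].page = virt_to_page(&vbr->in_hdr);
	in_sg[in_nr].offset = offset_in_page(&vbr->in_hdr);
	in_sg[in_nr].length = sizeof(vbr->in_hdr);
	in_nr++;

	vbr->out_hdr.id = vblk->vq->ops->add_buf(vblk->vq, out_sg, out_nr,
						 in_sg, in_nr, vbr);
	if (IS_ERR_VALUE(vbr->out_hdr.id))
		goto full;

One queue operation per request instead of two, and the host sees the
whole transaction at once.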

> Of course, I'm carrying two implementations and two drivers, so it's
> natural for me to try to resist changes 8)

Life is rough on the frontier.  But all the glory will be yours!

-- 
error compiling committee.c: too many arguments to function

