On Wed, 12 Dec 2018 at 15:47, Jason Wang <jasow...@redhat.com> wrote:
>
>
> On 2018/12/12 2:41 PM, Yongji Xie wrote:
> > On Wed, 12 Dec 2018 at 12:07, Jason Wang <jasow...@redhat.com> wrote:
> >>
> >> On 2018/12/12 11:21 AM, Yongji Xie wrote:
> >>> On Wed, 12 Dec 2018 at 11:00, Jason Wang <jasow...@redhat.com> wrote:
> >>>> On 2018/12/12 10:48 AM, Yongji Xie wrote:
> >>>>> On Mon, 10 Dec 2018 at 17:32, Jason Wang <jasow...@redhat.com> wrote:
> >>>>>> On 2018/12/6 9:59 PM, Michael S. Tsirkin wrote:
> >>>>>>> On Thu, Dec 06, 2018 at 09:57:22PM +0800, Jason Wang wrote:
> >>>>>>>> On 2018/12/6 2:35 PM, elohi...@gmail.com wrote:
> >>>>>>>>> From: Xie Yongji <xieyon...@baidu.com>
> >>>>>>>>>
> >>>>>>>>> This patchset aims to support qemu reconnecting to the
> >>>>>>>>> vhost-user-blk backend after the backend crashes or
> >>>>>>>>> restarts.
> >>>>>>>>>
> >>>>>>>>> Patch 1 tries to implement the sync connection for
> >>>>>>>>> "reconnect socket".
> >>>>>>>>>
> >>>>>>>>> Patch 2 introduces a new message VHOST_USER_SET_VRING_INFLIGHT
> >>>>>>>>> to support offering shared memory to the backend to record
> >>>>>>>>> its inflight I/O.
> >>>>>>>>>
> >>>>>>>>> Patches 3 and 4 are the corresponding libvhost-user patches for
> >>>>>>>>> patch 2, making libvhost-user support VHOST_USER_SET_VRING_INFLIGHT.
> >>>>>>>>>
> >>>>>>>>> Patch 5 supports vhost-user-blk reconnecting to the backend when
> >>>>>>>>> the connection is closed.
> >>>>>>>>>
> >>>>>>>>> Patch 6 tells qemu that we support reconnecting now.
> >>>>>>>>>
> >>>>>>>>> To use it, we could start qemu with:
> >>>>>>>>>
> >>>>>>>>> qemu-system-x86_64 \
> >>>>>>>>>     -chardev socket,id=char0,path=/path/vhost.socket,reconnect=1,wait \
> >>>>>>>>>     -device vhost-user-blk-pci,chardev=char0 \
> >>>>>>>>>
> >>>>>>>>> and start the vhost-user-blk backend with:
> >>>>>>>>>
> >>>>>>>>> vhost-user-blk -b /path/file -s /path/vhost.socket
> >>>>>>>>>
> >>>>>>>>> Then we can restart vhost-user-blk at any time while the VM is
> >>>>>>>>> running.
> >>>>>>>> I wonder whether or not it's better to handle this at the level of
> >>>>>>>> the virtio protocol itself instead of the vhost-user level. E.g.
> >>>>>>>> exposing last_avail_idx to the driver might be sufficient?
> >>>>>>>>
> >>>>>>>> Another possible issue: it looks like you need to deal with
> >>>>>>>> different kinds of ring layouts, e.g. packed virtqueues.
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>> I'm not sure I understand your comments here.
> >>>>>>> All these would be guest-visible extensions.
> >>>>>> Looks not, it only introduces a shared memory region between qemu and
> >>>>>> the vhost-user backend?
> >>>>>>
> >>>>>>
> >>>>>>> Possible for sure but how is this related to
> >>>>>>> a patch supporting transparent reconnects?
> >>>>>> I might miss something. My understanding is that we support
> >>>>>> transparent reconnects, but we can't deduce an accurate
> >>>>>> last_avail_idx, and that is the capability this series tries to add.
> >>>>>> To me, this series is functionally equivalent to exposing
> >>>>>> last_avail_idx (or avail_idx_cons) in the available ring. So the
> >>>>>> information is inside guest memory, and the vhost-user backend can
> >>>>>> access it and update it directly. I believe some modern NICs do this
> >>>>>> as well (but the index is in an MMIO area of course).
> >>>>>>
> >>>>> Hi Jason,
> >>>>>
> >>>>> If my understanding is correct, it might not be enough to only expose
> >>>>> last_avail_idx.
> >>>>> Because we would not always process descriptors in the same order in
> >>>>> which they were made available. If so, we can't get correct inflight
> >>>>> I/O information from the available ring.
> >>>> You can get this with the help of both the used ring and
> >>>> last_avail_idx, I believe. Or maybe you can give us an example?
> >>>>
> >>> A simple example, assuming the ring size is 8:
> >>>
> >>> 1. The guest fills the avail ring:
> >>>
> >>> avail ring: 0 1 2 3 4 5 6 7
> >>> used ring:
> >>>
> >>> 2. The vhost-user backend completes 4, 5, 6, 7 and fills the used ring:
> >>>
> >>> avail ring: 0 1 2 3 4 5 6 7
> >>> used ring: 4 5 6 7
> >>>
> >>> 3. The guest fills the avail ring again:
> >>>
> >>> avail ring: 4 5 6 7 4 5 6 7
> >>> used ring: 4 5 6 7
> >>>
> >>> 4. The vhost-user backend crashes.
> >>>
> >>> The inflight descriptors 0, 1, 2, 3 are lost.
> >>>
> >>> Thanks,
> >>> Yongji
> >>
> >> Ok, then we can simply forbid increasing the avail_idx in this case?
> >>
> >> Basically, it's a question of whether it's better to do this at the
> >> level of virtio instead of vhost. I'm pretty sure that if we expose
> >> sufficient information, it could be done without touching vhost-user.
> >> And we won't have to deal with e.g. migration and other cases.
> >>
> > OK, I get your point. That's indeed an alternative way. But this feature
> > seems to be only useful to the vhost-user backend.
>
>
> I admit I could not think of a use case other than vhost-user.
>
>
> > I'm not sure whether it makes sense to
> > touch the virtio protocol for this feature.
>
>
> Some possible advantages:
>
> - The feature could be determined and noticed by the user or management
> layer.
>
> - There's no need to invent a ring-layout-specific protocol to record
> inflight descriptors. E.g. if my understanding is correct, for this
> series and for the example above, it still cannot work for packed
> virtqueues, since the descriptor id is not sufficient (a descriptor
> could be overwritten by a used one). You probably need to have a
> (partial) copy of the descriptor ring for this.
>
> - No need to deal with migration; all the information is in guest memory.
>
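For illustration, that last caveat can be made concrete: in a packed
virtqueue the used element overwrites the descriptor slot in place, so
recording only ids cannot reconstruct a lost request. A minimal sketch
follows, in which the wrapper types are hypothetical and only the
mirrored fields come from struct vring_packed_desc in the virtio 1.1
spec:

#include <stdint.h>

#define VQ_SIZE 8

/*
 * Hypothetical inflight entry for a packed virtqueue: a (partial)
 * copy of the descriptor taken before processing, since the slot in
 * the ring itself may later be overwritten by a used element.
 */
typedef struct InflightPackedDesc {
    uint64_t addr;     /* buffer address, mirrored from the descriptor */
    uint32_t len;      /* buffer length */
    uint16_t id;       /* descriptor id */
    uint8_t  inflight; /* 1 while the request is outstanding */
} InflightPackedDesc;

typedef struct VringPackedInflight {
    InflightPackedDesc desc[VQ_SIZE];
} VringPackedInflight;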
Yes, we have those advantages. But it seems that handling this at the
vhost-user level would be easier to maintain in a production
environment. We can support old guests, and bug fixes will not depend
on guest kernel updates.

Thanks,
Yongji
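For reference, a minimal sketch of how a shared inflight region could
cover the split-ring example above; the layout and names here are
hypothetical, not the actual format proposed by patch 2:

#include <stdint.h>

#define VQ_SIZE 8

/* One byte per descriptor id, kept in memory shared with qemu. */
typedef struct VringInflight {
    uint8_t inflight[VQ_SIZE];
} VringInflight;

/* Mark a head right after it is read from the avail ring. */
void mark_inflight(VringInflight *r, uint16_t head)
{
    r->inflight[head] = 1;
}

/* Clear the mark only after the used element for this head is visible. */
void mark_done(VringInflight *r, uint16_t head)
{
    r->inflight[head] = 0;
}

/*
 * After reconnect, the restarted backend walks the region and
 * resubmits everything still marked. In the example above it would
 * recover descriptors 0, 1, 2 and 3 even though the guest has already
 * refilled the avail ring with 4, 5, 6 and 7.
 */
void resubmit_inflight(VringInflight *r, void (*submit)(uint16_t head))
{
    for (uint16_t i = 0; i < VQ_SIZE; i++) {
        if (r->inflight[i]) {
            submit(i);
        }
    }
}

The ordering matters: the flag must be cleared only after the used ring
update has reached guest-visible memory, otherwise a crash between the
two steps could replay a completed request or lose one.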