On Wed, 12 Dec 2018 at 15:47, Jason Wang <jasow...@redhat.com> wrote:
>
>
> On 2018/12/12 2:41 PM, Yongji Xie wrote:
> > On Wed, 12 Dec 2018 at 12:07, Jason Wang <jasow...@redhat.com> wrote:
> >>
> >> On 2018/12/12 11:21 AM, Yongji Xie wrote:
> >>> On Wed, 12 Dec 2018 at 11:00, Jason Wang <jasow...@redhat.com> wrote:
> >>>> On 2018/12/12 10:48 AM, Yongji Xie wrote:
> >>>>> On Mon, 10 Dec 2018 at 17:32, Jason Wang <jasow...@redhat.com> wrote:
> >>>>>> On 2018/12/6 9:59 PM, Michael S. Tsirkin wrote:
> >>>>>>> On Thu, Dec 06, 2018 at 09:57:22PM +0800, Jason Wang wrote:
> >>>>>>>> On 2018/12/6 2:35 PM, elohi...@gmail.com wrote:
> >>>>>>>>> From: Xie Yongji <xieyon...@baidu.com>
> >>>>>>>>>
> >>>>>>>>> This patchset aims to support qemu reconnecting to the
> >>>>>>>>> vhost-user-blk backend after the backend crashes or
> >>>>>>>>> restarts.
> >>>>>>>>>
> >>>>>>>>> Patch 1 tries to implement the sync connection for
> >>>>>>>>> "reconnect socket".
> >>>>>>>>>
> >>>>>>>>> Patch 2 introduces a new message VHOST_USER_SET_VRING_INFLIGHT
> >>>>>>>>> to support offering shared memory to the backend to record
> >>>>>>>>> its inflight I/O.
> >>>>>>>>>
> >>>>>>>>> Patches 3 and 4 are the corresponding libvhost-user patches for
> >>>>>>>>> patch 2, making libvhost-user support VHOST_USER_SET_VRING_INFLIGHT.
> >>>>>>>>>
> >>>>>>>>> Patch 5 supports vhost-user-blk reconnecting to the backend when
> >>>>>>>>> the connection is closed.
> >>>>>>>>>
> >>>>>>>>> Patch 6 tells qemu that we support reconnecting now.
> >>>>>>>>>
> >>>>>>>>> To use it, we could start qemu with:
> >>>>>>>>>
> >>>>>>>>> qemu-system-x86_64 \
> >>>>>>>>>     -chardev socket,id=char0,path=/path/vhost.socket,reconnect=1,wait \
> >>>>>>>>>     -device vhost-user-blk-pci,chardev=char0 \
> >>>>>>>>>
> >>>>>>>>> and start the vhost-user-blk backend with:
> >>>>>>>>>
> >>>>>>>>> vhost-user-blk -b /path/file -s /path/vhost.socket
> >>>>>>>>>
> >>>>>>>>> Then we can restart vhost-user-blk at any time while the VM is
> >>>>>>>>> running.
> >>>>>>>> I wonder whether or not it's better to handle this at the level of
> >>>>>>>> the virtio protocol itself instead of the vhost-user level. E.g.
> >>>>>>>> exposing last_avail_idx to the driver might be sufficient?
> >>>>>>>>
> >>>>>>>> Another possible issue: it looks like you need to deal with
> >>>>>>>> different kinds of ring layouts, e.g. packed virtqueues.
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>> I'm not sure I understand your comments here.
> >>>>>>> All these would be guest-visible extensions.
> >>>>>> Looks not, it only introduces a shared memory region between qemu and
> >>>>>> the vhost-user backend?
> >>>>>>
> >>>>>>
> >>>>>>> Possible for sure but how is this related to
> >>>>>>> a patch supporting transparent reconnects?
> >>>>>> I might miss something. My understanding is that we support
> >>>>>> transparent reconnects, but we can't deduce an accurate
> >>>>>> last_avail_idx, and that is the capability this series tries to add.
> >>>>>> To me, this series is functionally equivalent to exposing
> >>>>>> last_avail_idx (or avail_idx_cons) in the available ring. So the
> >>>>>> information is inside guest memory, and the vhost-user backend can
> >>>>>> access it and update it directly. I believe some modern NICs do this
> >>>>>> as well (but the index is in an MMIO area of course).
> >>>>>>
> >>>>> Hi Jason,
> >>>>>
> >>>>> If my understanding is correct, it might not be enough to only expose
> >>>>> last_avail_idx.
> >>>>> Because we would not always process descriptors in the same order in
> >>>>> which they were made available. If so, we can't get correct inflight
> >>>>> I/O information from the available ring.
> >>>> You can get this with the help of both the used ring and
> >>>> last_avail_idx, I believe. Or maybe you can give us an example?
> >>>>
> >>> A simple example, assuming the ring size is 8:
> >>>
> >>> 1. The guest fills the avail ring:
> >>>
> >>> avail ring: 0 1 2 3 4 5 6 7
> >>> used ring:
> >>>
> >>> 2. The vhost-user backend completes 4, 5, 6, 7 and fills the used ring:
> >>>
> >>> avail ring: 0 1 2 3 4 5 6 7
> >>> used ring: 4 5 6 7
> >>>
> >>> 3. The guest fills the avail ring again:
> >>>
> >>> avail ring: 4 5 6 7 4 5 6 7
> >>> used ring: 4 5 6 7
> >>>
> >>> 4. The vhost-user backend crashes.
> >>>
> >>> The inflight descriptors 0, 1, 2, 3 are lost.
> >>>
> >>> Thanks,
> >>> Yongji
> >>
> >> Ok, then we can simply forbid increasing the avail_idx in this case?
> >>
> >> Basically, it's a question of whether it's better to do this at the
> >> level of virtio instead of vhost. I'm pretty sure that if we expose
> >> sufficient information, it could be done without touching vhost-user.
> >> And we won't have to deal with e.g. migration and other cases.
> >>
> > OK, I get your point. That's indeed an alternative way. But this feature
> > seems to be only useful to the vhost-user backend.
>
>
> I admit I could not think of a use case other than vhost-user.
>
>
> > I'm not sure whether it makes sense to
> > touch the virtio protocol for this feature.
>
>
> Some possible advantages:
>
> - The feature could be determined and noticed by the user or management
> layer.
>
> - There's no need to invent a ring-layout-specific protocol to record
> inflight descriptors. E.g. if my understanding is correct, for this
> series and for the example above, it still cannot work for packed
> virtqueues, since the descriptor id is not sufficient (a descriptor
> could be overwritten by a used one). You probably need to have a
> (partial) copy of the descriptor ring for this.
>
> - No need to deal with migration; all the information is in guest memory.
>
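For illustration, that last caveat can be made concrete: in a packed
virtqueue the used element overwrites the descriptor slot in place, so
recording only ids cannot reconstruct a lost request. A minimal sketch
follows, in which the wrapper types are hypothetical and only the
mirrored fields come from struct vring_packed_desc in the virtio 1.1
spec:

#include <stdint.h>

#define VQ_SIZE 8

/*
 * Hypothetical inflight entry for a packed virtqueue: a (partial)
 * copy of the descriptor taken before processing, since the slot in
 * the ring itself may later be overwritten by a used element.
 */
typedef struct InflightPackedDesc {
    uint64_t addr;     /* buffer address, mirrored from the descriptor */
    uint32_t len;      /* buffer length */
    uint16_t id;       /* descriptor id */
    uint8_t  inflight; /* 1 while the request is outstanding */
} InflightPackedDesc;

typedef struct VringPackedInflight {
    InflightPackedDesc desc[VQ_SIZE];
} VringPackedInflight;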
Yes, we have those advantages. But it seems that handling this at the
vhost-user level would be easier to maintain in a production
environment. We can support old guests, and bug fixes will not depend
on guest kernel updates.

Thanks,
Yongji
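For reference, a minimal sketch of how a shared inflight region could
cover the split-ring example above; the layout and names here are
hypothetical, not the actual format proposed by patch 2:

#include <stdint.h>

#define VQ_SIZE 8

/* One byte per descriptor id, kept in memory shared with qemu. */
typedef struct VringInflight {
    uint8_t inflight[VQ_SIZE];
} VringInflight;

/* Mark a head right after it is read from the avail ring. */
void mark_inflight(VringInflight *r, uint16_t head)
{
    r->inflight[head] = 1;
}

/* Clear the mark only after the used element for this head is visible. */
void mark_done(VringInflight *r, uint16_t head)
{
    r->inflight[head] = 0;
}

/*
 * After reconnect, the restarted backend walks the region and
 * resubmits everything still marked. In the example above it would
 * recover descriptors 0, 1, 2 and 3 even though the guest has already
 * refilled the avail ring with 4, 5, 6 and 7.
 */
void resubmit_inflight(VringInflight *r, void (*submit)(uint16_t head))
{
    for (uint16_t i = 0; i < VQ_SIZE; i++) {
        if (r->inflight[i]) {
            submit(i);
        }
    }
}

The ordering matters: the flag must be cleared only after the used ring
update has reached guest-visible memory, otherwise a crash between the
two steps could replay a completed request or lose one.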