On 2017/11/16 16:54, Jason Wang wrote:
>
>
> On 2017/11/16 13:53, Longpeng (Mike) wrote:
>> On 2017/11/15 23:54, Longpeng(Mike) wrote:
>>> 2017-11-15 23:05 GMT+08:00 Jason Wang<jasow...@redhat.com>:
>>>> On 2017/11/15 22:55, Longpeng(Mike) wrote:
>>>>> Hi guys,
>>>>>
>>>>> We got a BUG report from our testers yesterday. The testing scenario was
>>>>> migrating a VM (Windows guest, *4 vcpus*, 4GB, vhost-user net: *7 queues*).
>>>>>
>>>>> We found the root cause, and we'll report the BUG or send a fix patch
>>>>> to upstream if necessary (we haven't tested upstream yet, sorry...).
>>>> Could you explain this a little bit more?
>>>>
>>>>> We want to know why vhost_net_start() must start the *total queues* (in our
>>>>> VM there are 7 queues) rather than *the queues currently in use* (in our VM,
>>>>> the guest only uses the first 4 queues because it's limited by the number
>>>>> of vcpus)?
>>>>>
>>>>> Looking forward to your help, thx :)
>>>> The code has been there for years and works well for the kernel
>>>> datapath, so you should really explain what's wrong.
>>>>
>>> OK. :)
>>>
>>> In our scenario, the Windows virtio-net driver only uses the first 4 queues
>>> and it *only sets the desc/avail/used tables for the first 4 queues*, so in
>>> QEMU the desc/avail/used addresses of the last 3 queues are ZERO, but
>>> unfortunately...
>>> '''
>>> vhost_net_start
>>>   for (i = 0; i < total_queues; i++)
>>>     vhost_net_start_one
>>>       vhost_dev_start
>>>         vhost_virtqueue_start
>>> '''
>>> In vhost_virtqueue_start(), it will calculate the HVA of the desc/avail/used
>>> tables, so for the last 3 queues, it will use ZERO as the GPA to calculate
>>> the HVA, and then send the results to the user-mode backend (we use
>>> *vhost-user*) by vhost_virtqueue_set_addr().
>>>
>>> When the EVS gets these addresses, it will update an *idx* which will be
>>> treated as the vq's last_avail_idx when virtio-net stops (pls see
>>> vhost_virtqueue_stop()).
>>>
>>> So we get the following result after virtio-net stops:
>>> the desc/avail/used addresses of the last 3 queues' vqs are all ZERO, but
>>> these vqs' last_avail_idx is NOT ZERO.
>>>
>>> At last, virtio_load() reports an error:
>>> '''
>>>     if (!vdev->vq[i].vring.desc && vdev->vq[i].last_avail_idx) { // <-- will be TRUE
>>>         error_report("VQ %d address 0x0 "
>>>                      "inconsistent with Host index 0x%x",
>>>                      i, vdev->vq[i].last_avail_idx);
>>>         return -1;
>>>     }
>>> '''
>>>
>>> BTW, the problem won't appear with a Linux guest, because the Linux
>>> virtio-net driver sets the desc/avail/used tables of all 7 queues. And the
>>> problem won't appear if the VM uses vhost-net, because vhost-net won't
>>> update *idx* in the SET_ADDR ioctl.
>
> Just to make sure I understand here, I thought Windows guest + vhost_net hit
> this issue?
>

Windows guest + vhost-user hits the issue.
Windows guest + vhost-net is fine.

'''
In vhost_virtqueue_start(), it will calculate the HVA of the desc/avail/used
tables, so for the last 3 queues, it will use ZERO as the GPA to calculate the
HVA, and then send the results to the user-mode backend (we use *vhost-user*)
by vhost_virtqueue_set_addr().
'''
I think this is the root cause. It is strange, right?

> Thanks
>
>>>
>>> Sorry for my poor English. Maybe I could describe the problem in Chinese
>>> for you in private if necessary.
>>>
>>>
>>>> Thanks
>>
>> --
>> Regards,
>> Longpeng(Mike)
>
>
> .
>

--
Regards,
Longpeng(Mike)