Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list

2017-12-18 Thread Dr. David Alan Gilbert
* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Thu, Dec 14, 2017 at 04:27:31PM +0100, Igor Mammedov wrote:
> > Also it seems that we have a race in current code where
> > region_del() unrefs memory region first and then by the
> > commit time memory region could be gone since old flatview
> > is unreffed before commit callback is called, but guest still
> > uses old memory map until vhost_set_mem_table() is complete.
> > We probably should unref deleted(old) sections after
> > guest gets new memmap.
> 
> Care trying to post a patch for stable? Might be a good idea
> to merge before this rework, for the sake of downstreams.

I think the 1st patch of my v5 might be suitable for that; please
have a look.

Dave

> > 
> > >  }
> > >  
> > >  static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list

2017-12-15 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> On 15/12/2017 14:30, Dr. David Alan Gilbert wrote:
> >> Also it seems that we have a race in current code where
> >> region_del() unrefs memory region first and then by the
> >> commit time memory region could be gone since old flatview
> >> is unreffed before commit callback is called, but guest still
> >> uses old memory map until vhost_set_mem_table() is complete.
> >> We probably should unref deleted(old) sections after
> >> guest gets new memmap.
> >
> > Will they really get cleaned up before the commit() returns?
> > There's no rcu like thing guarding it?
> 
> The memory subsystem only keeps them alive until before commmit() is
> invoked.

Hmm ok; I guess then we do need to keep the temporary list of
MemoryRegionSections and unref all the old ones after the end of the callback.

I'll rework it (again).

Dave


> Paolo
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list

2017-12-15 Thread Paolo Bonzini
On 15/12/2017 14:30, Dr. David Alan Gilbert wrote:
>> Also it seems that we have a race in current code where
>> region_del() unrefs memory region first and then by the
>> commit time memory region could be gone since old flatview
>> is unreffed before commit callback is called, but guest still
>> uses old memory map until vhost_set_mem_table() is complete.
>> We probably should unref deleted(old) sections after
>> guest gets new memmap.
>
> Will they really get cleaned up before the commit() returns?
> There's no rcu like thing guarding it?

The memory subsystem only keeps them alive until before commmit() is
invoked.

Paolo



Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list

2017-12-15 Thread Dr. David Alan Gilbert
* Igor Mammedov (imamm...@redhat.com) wrote:
> On Wed, 13 Dec 2017 18:08:05 +
> "Dr. David Alan Gilbert (git)"  wrote:
> 
> > From: "Dr. David Alan Gilbert" 
> > 
> > As regions are reported by the listener to the _nop and _add
> > methods, add them to our new temporary list.
> > Regions that abut can be merged if the backend allows.
> > 
> > Signed-off-by: Dr. David Alan Gilbert 
> > ---
> >  hw/virtio/trace-events |  2 ++
> >  hw/virtio/vhost.c  | 70 
> > ++
> >  2 files changed, 72 insertions(+)
> > 
> > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> > index 4a493bcd46..7de0663652 100644
> > --- a/hw/virtio/trace-events
> > +++ b/hw/virtio/trace-events
> > @@ -1,6 +1,8 @@
> >  # See docs/devel/tracing.txt for syntax documentation.
> >  
> >  # hw/virtio/vhost.c
> > +vhost_region_add_tmp(const char *name, uint64_t gpa, uint64_t size, 
> > uint64_t host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64
> > +vhost_region_add_tmp_abut(const char *name, uint64_t new_size) "%s: 
> > 0x%"PRIx64
> >  vhost_section(const char *name, int r) "%s:%d"
> >  
> >  # hw/virtio/virtio.c
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 4523f45587..2084888aa7 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -694,6 +694,67 @@ static void vhost_commit(MemoryListener *listener)
> >  dev->memory_changed = false;
> >  }
> >  
> > +/* Adds the section data to the tmp_mem structure.
> > + * It relies on the listener calling us in memory address order
> > + * and for each region (via the _add and _nop methods).
> > + */
> > +static void vhost_region_add_tmp(struct vhost_dev *dev,
> > + MemoryRegionSection *section)
> > +{
> > +bool need_add = true;
> > +uint64_t mrs_size = int128_get64(section->size);
> > +uint64_t mrs_gpa = section->offset_within_address_space;
> > +uintptr_t mrs_host = (uintptr_t)memory_region_get_ram_ptr(section->mr) 
> > +
> > + section->offset_within_region;
> > +
> > +trace_vhost_region_add_tmp(section->mr->name, mrs_gpa, mrs_size, 
> > mrs_host);
> > +
> > +if (dev->tmp_mem->nregions) {
> > +/* Since we already have at least one region, lets see if
> > + * this extends it; since we're scanning in order, we only
> > + * have to look at the last one, and the FlatView that calls
> > + * us shouldn't have overlaps.
> > + */
> > +struct vhost_memory_region *prev_vmr = dev->tmp_mem->regions +
> > +   (dev->tmp_mem->nregions - 
> > 1);
> > +uint64_t prev_gpa_start = prev_vmr->guest_phys_addr;
> > +uint64_t prev_gpa_end   = range_get_last(prev_gpa_start,
> > + prev_vmr->memory_size);
> > +uint64_t prev_host_start = prev_vmr->userspace_addr;
> > +uint64_t prev_host_end   = range_get_last(prev_host_start,
> > +  prev_vmr->memory_size);
> > +
> > +if (prev_gpa_end + 1 == mrs_gpa &&
> > +prev_host_end + 1 == mrs_host &&
> > +(!dev->vhost_ops->vhost_backend_can_merge ||
> > +dev->vhost_ops->vhost_backend_can_merge(dev,
> > +mrs_host, mrs_size,
> > +prev_host_start, prev_vmr->memory_size))) {
> > +/* The two regions abut */
> > +need_add = false;
> > +mrs_size = mrs_size + prev_vmr->memory_size;
> > +prev_vmr->memory_size = mrs_size;
> > +trace_vhost_region_add_tmp_abut(section->mr->name, mrs_size);
> > +}
> > +}
> > +
> > +if (need_add) {
> > +uint32_t nregions = dev->tmp_mem->nregions;
> > +size_t s = offsetof(struct vhost_memory, regions) +
> > +(nregions + 1) * sizeof 
> > dev->tmp_mem->regions[0];
> > +dev->tmp_mem = g_realloc(dev->tmp_mem, s);
> > +dev->tmp_mem->nregions++;
> > +struct vhost_memory_region *cur_vmr = 
> > >tmp_mem->regions[nregions];
> > +
> > +cur_vmr->guest_phys_addr = mrs_gpa;
> > +cur_vmr->memory_size = mrs_size;
> > +cur_vmr->userspace_addr  = mrs_host;
> > +cur_vmr->flags_padding = 0;
> > +}
> > +
> > +
> > +}
> > +
> >  static void vhost_region_add(MemoryListener *listener,
> >   MemoryRegionSection *section)
> >  {
> > @@ -703,6 +764,7 @@ static void vhost_region_add(MemoryListener *listener,
> >  if (!vhost_section(section)) {
> >  return;
> >  }
> > +vhost_region_add_tmp(dev, section);
> >  
> >  ++dev->n_mem_sections;
> >  dev->mem_sections = g_renew(MemoryRegionSection, dev->mem_sections,
> > @@ -800,9 +862,17 @@ static void vhost_iommu_region_del(MemoryListener 
> > *listener,
> >  }
> >  }
> >  
> 

Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list

2017-12-14 Thread Michael S. Tsirkin
On Thu, Dec 14, 2017 at 04:27:31PM +0100, Igor Mammedov wrote:
> Also it seems that we have a race in current code where
> region_del() unrefs memory region first and then by the
> commit time memory region could be gone since old flatview
> is unreffed before commit callback is called, but guest still
> uses old memory map until vhost_set_mem_table() is complete.
> We probably should unref deleted(old) sections after
> guest gets new memmap.

Care trying to post a patch for stable? Might be a good idea
to merge before this rework, for the sake of downstreams.

> 
> >  }
> >  
> >  static int vhost_virtqueue_set_addr(struct vhost_dev *dev,



Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list

2017-12-14 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> On 13/12/2017 19:08, Dr. David Alan Gilbert (git) wrote:
> > +if (dev->tmp_mem->nregions) {
> > +/* Since we already have at least one region, lets see if
> > + * this extends it; since we're scanning in order, we only
> > + * have to look at the last one, and the FlatView that calls
> > + * us shouldn't have overlaps.
> > + */
> > +struct vhost_memory_region *prev_vmr = dev->tmp_mem->regions +
> > +   (dev->tmp_mem->nregions - 
> > 1);
> > +uint64_t prev_gpa_start = prev_vmr->guest_phys_addr;
> > +uint64_t prev_gpa_end   = range_get_last(prev_gpa_start,
> > + prev_vmr->memory_size);
> > +uint64_t prev_host_start = prev_vmr->userspace_addr;
> > +uint64_t prev_host_end   = range_get_last(prev_host_start,
> > +  prev_vmr->memory_size);
> > +
> > +if (prev_gpa_end + 1 == mrs_gpa &&
> > +prev_host_end + 1 == mrs_host &&
> > +(!dev->vhost_ops->vhost_backend_can_merge ||
> > +dev->vhost_ops->vhost_backend_can_merge(dev,
> > +mrs_host, mrs_size,
> > +prev_host_start, prev_vmr->memory_size))) {
> > +/* The two regions abut */
> > +need_add = false;
> > +mrs_size = mrs_size + prev_vmr->memory_size;
> > +prev_vmr->memory_size = mrs_size;
> > +trace_vhost_region_add_tmp_abut(section->mr->name, mrs_size);
> > +}
> > +}
> 
> Interesting, in which cases does this actually trigger?

:vhost_section vga-lowmem:0
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0x0+0xa @ 0x7ff1f5a08000
:vhost_section vga.vram:0
:vhost_section vga-lowmem:0
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0xc+0xa000 @ 0x7ff1f5ac8000
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0xca000+0x3000 @ 0x7ff1f5ad2000
>vhost_region_add_tmp_abut /objects/mem: 0xd000
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0xcd000+0x1f000 @ 0x7ff1f5ad5000
>vhost_region_add_tmp_abut /objects/mem: 0x2c000
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0xec000+0x4000 @ 0x7ff1f5af4000
>vhost_region_add_tmp_abut /objects/mem: 0x3
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0xf+0x1 @ 0x7ff1f5af8000
>vhost_region_add_tmp_abut /objects/mem: 0x4
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0x10+0x3ff0 @ 0x7ff1f5b08000
>vhost_region_add_tmp_abut /objects/mem: 0x3ff4
:vhost_section vga.vram:0
:vhost_section vga ioports remapped:0
:vhost_section bochs dispi interface:0
:vhost_section qemu extended regs:0
:vhost_section msix-table:0
:vhost_section msix-pba:0
:vhost_section kvm-ioapic:0
:vhost_section hpet:0
:vhost_section kvm-apic-msi:0
:vhost_section pc.bios:0
:vhost_commit Started: 1 Changed: 0

So it's not unusual for us to piece them back together into one
chunk.

Dave

> Paolo
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list

2017-12-14 Thread Igor Mammedov
On Wed, 13 Dec 2017 18:08:05 +
"Dr. David Alan Gilbert (git)"  wrote:

> From: "Dr. David Alan Gilbert" 
> 
> As regions are reported by the listener to the _nop and _add
> methods, add them to our new temporary list.
> Regions that abut can be merged if the backend allows.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> ---
>  hw/virtio/trace-events |  2 ++
>  hw/virtio/vhost.c  | 70 
> ++
>  2 files changed, 72 insertions(+)
> 
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index 4a493bcd46..7de0663652 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -1,6 +1,8 @@
>  # See docs/devel/tracing.txt for syntax documentation.
>  
>  # hw/virtio/vhost.c
> +vhost_region_add_tmp(const char *name, uint64_t gpa, uint64_t size, uint64_t 
> host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64
> +vhost_region_add_tmp_abut(const char *name, uint64_t new_size) "%s: 
> 0x%"PRIx64
>  vhost_section(const char *name, int r) "%s:%d"
>  
>  # hw/virtio/virtio.c
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 4523f45587..2084888aa7 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -694,6 +694,67 @@ static void vhost_commit(MemoryListener *listener)
>  dev->memory_changed = false;
>  }
>  
> +/* Adds the section data to the tmp_mem structure.
> + * It relies on the listener calling us in memory address order
> + * and for each region (via the _add and _nop methods).
> + */
> +static void vhost_region_add_tmp(struct vhost_dev *dev,
> + MemoryRegionSection *section)
> +{
> +bool need_add = true;
> +uint64_t mrs_size = int128_get64(section->size);
> +uint64_t mrs_gpa = section->offset_within_address_space;
> +uintptr_t mrs_host = (uintptr_t)memory_region_get_ram_ptr(section->mr) +
> + section->offset_within_region;
> +
> +trace_vhost_region_add_tmp(section->mr->name, mrs_gpa, mrs_size, 
> mrs_host);
> +
> +if (dev->tmp_mem->nregions) {
> +/* Since we already have at least one region, lets see if
> + * this extends it; since we're scanning in order, we only
> + * have to look at the last one, and the FlatView that calls
> + * us shouldn't have overlaps.
> + */
> +struct vhost_memory_region *prev_vmr = dev->tmp_mem->regions +
> +   (dev->tmp_mem->nregions - 1);
> +uint64_t prev_gpa_start = prev_vmr->guest_phys_addr;
> +uint64_t prev_gpa_end   = range_get_last(prev_gpa_start,
> + prev_vmr->memory_size);
> +uint64_t prev_host_start = prev_vmr->userspace_addr;
> +uint64_t prev_host_end   = range_get_last(prev_host_start,
> +  prev_vmr->memory_size);
> +
> +if (prev_gpa_end + 1 == mrs_gpa &&
> +prev_host_end + 1 == mrs_host &&
> +(!dev->vhost_ops->vhost_backend_can_merge ||
> +dev->vhost_ops->vhost_backend_can_merge(dev,
> +mrs_host, mrs_size,
> +prev_host_start, prev_vmr->memory_size))) {
> +/* The two regions abut */
> +need_add = false;
> +mrs_size = mrs_size + prev_vmr->memory_size;
> +prev_vmr->memory_size = mrs_size;
> +trace_vhost_region_add_tmp_abut(section->mr->name, mrs_size);
> +}
> +}
> +
> +if (need_add) {
> +uint32_t nregions = dev->tmp_mem->nregions;
> +size_t s = offsetof(struct vhost_memory, regions) +
> +(nregions + 1) * sizeof dev->tmp_mem->regions[0];
> +dev->tmp_mem = g_realloc(dev->tmp_mem, s);
> +dev->tmp_mem->nregions++;
> +struct vhost_memory_region *cur_vmr = 
> >tmp_mem->regions[nregions];
> +
> +cur_vmr->guest_phys_addr = mrs_gpa;
> +cur_vmr->memory_size = mrs_size;
> +cur_vmr->userspace_addr  = mrs_host;
> +cur_vmr->flags_padding = 0;
> +}
> +
> +
> +}
> +
>  static void vhost_region_add(MemoryListener *listener,
>   MemoryRegionSection *section)
>  {
> @@ -703,6 +764,7 @@ static void vhost_region_add(MemoryListener *listener,
>  if (!vhost_section(section)) {
>  return;
>  }
> +vhost_region_add_tmp(dev, section);
>  
>  ++dev->n_mem_sections;
>  dev->mem_sections = g_renew(MemoryRegionSection, dev->mem_sections,
> @@ -800,9 +862,17 @@ static void vhost_iommu_region_del(MemoryListener 
> *listener,
>  }
>  }
>  
> +/* Called on regions that have not changed */
>  static void vhost_region_nop(MemoryListener *listener,
>   MemoryRegionSection *section)
>  {
move it close to add/del callbacks

> +struct vhost_dev *dev = container_of(listener, struct vhost_dev,
> +  

Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list

2017-12-13 Thread Paolo Bonzini
On 13/12/2017 19:08, Dr. David Alan Gilbert (git) wrote:
> +if (dev->tmp_mem->nregions) {
> +/* Since we already have at least one region, lets see if
> + * this extends it; since we're scanning in order, we only
> + * have to look at the last one, and the FlatView that calls
> + * us shouldn't have overlaps.
> + */
> +struct vhost_memory_region *prev_vmr = dev->tmp_mem->regions +
> +   (dev->tmp_mem->nregions - 1);
> +uint64_t prev_gpa_start = prev_vmr->guest_phys_addr;
> +uint64_t prev_gpa_end   = range_get_last(prev_gpa_start,
> + prev_vmr->memory_size);
> +uint64_t prev_host_start = prev_vmr->userspace_addr;
> +uint64_t prev_host_end   = range_get_last(prev_host_start,
> +  prev_vmr->memory_size);
> +
> +if (prev_gpa_end + 1 == mrs_gpa &&
> +prev_host_end + 1 == mrs_host &&
> +(!dev->vhost_ops->vhost_backend_can_merge ||
> +dev->vhost_ops->vhost_backend_can_merge(dev,
> +mrs_host, mrs_size,
> +prev_host_start, prev_vmr->memory_size))) {
> +/* The two regions abut */
> +need_add = false;
> +mrs_size = mrs_size + prev_vmr->memory_size;
> +prev_vmr->memory_size = mrs_size;
> +trace_vhost_region_add_tmp_abut(section->mr->name, mrs_size);
> +}
> +}

Interesting, in which cases does this actually trigger?

Paolo