RE: [PATCH v2 3/3] Drivers: hv: Never allocate anything besides framebuffer from framebuffer memory region

2022-08-27 Thread Vitaly Kuznetsov
"Michael Kelley (LINUX)"  writes:

> From: Vitaly Kuznetsov  Sent: Thursday, August 25, 2022 2:00 AM
>> 
>> Passed-through PCI devices sometimes misbehave on Gen1 VMs when the
>> Hyper-V DRM driver is also loaded. Looking at IOMEM assignment, we can
>> see, e.g.:
>> 
>> $ cat /proc/iomem
>> ...
>> f8000000-fffbffff : PCI Bus 0000:00
>>   f8000000-fbffffff : 0000:00:08.0
>>     f8000000-f8001fff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
>> ...
>> fe0000000-fffffffff : PCI Bus 0000:00
>>   fe0000000-fe07fffff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
>>     fe0000000-fe07fffff : 2ba2:00:02.0
>>       fe0000000-fe07fffff : mlx4_core
>> 
>> the interesting part is the 'f8000000' region as it is actually the
>> VM's framebuffer:
>> 
>> $ lspci -v
>> ...
>> 0000:00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA (prog-if 00 [VGA controller])
>>  Flags: bus master, fast devsel, latency 0, IRQ 11
>>  Memory at f8000000 (32-bit, non-prefetchable) [size=64M]
>> ...
>> 
>>  hv_vmbus: registering driver hyperv_drm
>>  hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Synthvid Version major 3, minor 5
>>  hyperv_drm 0000:00:08.0: vgaarb: deactivate vga console
>>  hyperv_drm 0000:00:08.0: BAR 0: can't reserve [mem 0xf8000000-0xfbffffff]
>>  hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Cannot request framebuffer, boot fb still active?
>> 
>> Note: "Cannot request framebuffer" is not a fatal error in
>> hyperv_setup_gen1() as the code assumes there's some other framebuffer
>> device there but we actually have some other PCI device (mlx4 in this
>> case) config space there!
>
> My apologies for not getting around to commenting on the previous
> version of this patch.  The function hyperv_setup_gen1() and the
> "Cannot request framebuffer" message have gone away as of
> commit a0ab5abced55.
>

True, will fix!

>> 
>> The problem appears to be that vmbus_allocate_mmio() is allowed to
>> allocate from the reserved framebuffer region when 'fb_overlap_ok' is
>> set; however, if a request to allocate MMIO comes from some other device
>> before the framebuffer region is claimed, it can happily hand the
>> framebuffer region out for it.
>
> Interesting. I had never looked at the details of vmbus_allocate_mmio().
> The semantics one might assume of a parameter named "fb_overlap_ok"
> aren't implemented because !fb_overlap_ok essentially has no effect.   The
> existing semantics are really "prefer_fb_overlap".  This patch implements
> the expected and needed semantics, which is to not allocate from the frame
> buffer space when !fb_overlap_ok.
>
> If that's an accurate high level summary, maybe this commit message
> could describe it that way?  The other details you provide about what can
> go wrong should still be included as well.

That's actually a very good summary! Let me update the commit message,
I'll be sending out v3 shortly.
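
For illustration, the check the patch adds reduces to a plain interval test
against fb_mmio. Below is a minimal standalone sketch of that logic in plain
C (the helper name and types are hypothetical, not kernel code):

#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch of the test added by the patch: a candidate [start, end] range
 * is rejected when overlapping the framebuffer region is not allowed and
 * either endpoint falls inside [fb_start, fb_end].
 */
static bool candidate_range_ok(uint64_t start, uint64_t end,
			       uint64_t fb_start, uint64_t fb_end,
			       bool fb_overlap_ok)
{
	if (fb_overlap_ok)
		return true;

	return !((start >= fb_start && start <= fb_end) ||
		 (end >= fb_start && end <= fb_end));
}

With fb_overlap_ok == false, vmbus_allocate_mmio() simply skips any candidate
whose start or end lands inside fb_mmio and keeps scanning for the next
aligned range.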

>
>> Note, Gen2 VMs are usually unaffected by the issue because the
>> framebuffer region is already taken by the EFI fb (in case the kernel
>> supports it), but Gen1 VMs may have this region unclaimed by the time
>> the Hyper-V PCI pass-through driver tries allocating MMIO space if the
>> Hyper-V DRM/FB drivers load after it. Devices can be brought up in any
>> sequence, so let's resolve the issue by always ignoring the 'fb_mmio'
>> region for non-FB requests, even if the region is unclaimed.
>> 
>> Signed-off-by: Vitaly Kuznetsov 
>> ---
>>  drivers/hv/vmbus_drv.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
>> index 536f68e563c6..3c833ea60db6 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -2331,7 +2331,7 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
>>  			bool fb_overlap_ok)
>>  {
>>  	struct resource *iter, *shadow;
>> -	resource_size_t range_min, range_max, start;
>> +	resource_size_t range_min, range_max, start, end;
>>  	const char *dev_n = dev_name(&device_obj->device);
>>  	int retval;
>> 
>> @@ -2366,6 +2366,14 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
>>  		range_max = iter->end;
>>  		start = (range_min + align - 1) & ~(align - 1);
>>  		for (; start + size - 1 <= range_max; start += align) {
>> +			end = start + size - 1;
>> +
>> +			/* Skip the whole fb_mmio region if not fb_overlap_ok */
>> +			if (!fb_overlap_ok && fb_mmio &&
>> +			    (((start >= fb_mmio->start) && (start <= fb_mmio->end)) ||
>> +			     ((end >= fb_mmio->start) && (end <= fb_mmio->end))))
>> +				continue;
>> +
>>  			shadow = __request_region(iter, start, size, NULL,
>>  						  IORESOURCE_BUSY);
>>  			if (!shadow)
>> --
>> 2.37.1
>
> Other than my musings on the commit message,
>
> Reviewed-by: Michael Kelley
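
For context, the helper whose semantics are being discussed is declared in
include/linux/hyperv.h as:

int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
			resource_size_t min, resource_size_t max,
			resource_size_t size, resource_size_t align,
			bool fb_overlap_ok);

A hedged sketch of a non-framebuffer caller follows (the function name, the
hv_dev variable, and the size/alignment values are illustrative only, not
taken from an existing driver):

#include <linux/hyperv.h>
#include <linux/ioport.h>

/*
 * Illustrative only: a non-framebuffer consumer (e.g. PCI pass-through)
 * passes fb_overlap_ok == false.  With the patch applied, the returned
 * resource is guaranteed not to intersect fb_mmio, whether or not the
 * framebuffer driver has claimed that region yet.
 */
static int example_claim_mmio(struct hv_device *hv_dev, struct resource **res)
{
	return vmbus_allocate_mmio(res, hv_dev, 0, -1, 0x100000, 0x1000,
				   false /* fb_overlap_ok */);
}

Framebuffer consumers, by contrast, are the ones expected to pass
fb_overlap_ok == true.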

RE: [PATCH v2 3/3] Drivers: hv: Never allocate anything besides framebuffer from framebuffer memory region

2022-08-25 Thread Michael Kelley (LINUX)
From: Vitaly Kuznetsov  Sent: Thursday, August 25, 2022 2:00 AM
> 
> Passed-through PCI devices sometimes misbehave on Gen1 VMs when the
> Hyper-V DRM driver is also loaded. Looking at IOMEM assignment, we can
> see, e.g.:
> 
> $ cat /proc/iomem
> ...
> f8000000-fffbffff : PCI Bus 0000:00
>   f8000000-fbffffff : 0000:00:08.0
>     f8000000-f8001fff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
> ...
> fe0000000-fffffffff : PCI Bus 0000:00
>   fe0000000-fe07fffff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
>     fe0000000-fe07fffff : 2ba2:00:02.0
>       fe0000000-fe07fffff : mlx4_core
> 
> the interesting part is the 'f8000000' region as it is actually the
> VM's framebuffer:
> 
> $ lspci -v
> ...
> 0000:00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA (prog-if 00 [VGA controller])
>   Flags: bus master, fast devsel, latency 0, IRQ 11
>   Memory at f8000000 (32-bit, non-prefetchable) [size=64M]
> ...
> 
>  hv_vmbus: registering driver hyperv_drm
>  hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Synthvid Version major 3, minor 5
>  hyperv_drm 0000:00:08.0: vgaarb: deactivate vga console
>  hyperv_drm 0000:00:08.0: BAR 0: can't reserve [mem 0xf8000000-0xfbffffff]
>  hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Cannot request framebuffer, boot fb still active?
> 
> Note: "Cannot request framebuffer" is not a fatal error in
> hyperv_setup_gen1() as the code assumes there's some other framebuffer
> device there but we actually have some other PCI device (mlx4 in this
> case) config space there!

My apologies for not getting around to commenting on the previous
version of this patch.  The function hyperv_setup_gen1() and the
"Cannot request framebuffer" message have gone away as of
commit a0ab5abced55.

> 
> The problem appears to be that vmbus_allocate_mmio() is allowed to
> allocate from the reserved framebuffer region when 'fb_overlap_ok' is
> set; however, if a request to allocate MMIO comes from some other device
> before the framebuffer region is claimed, it can happily hand the
> framebuffer region out for it.

Interesting. I had never looked at the details of vmbus_allocate_mmio().
The semantics one might assume of a parameter named "fb_overlap_ok"
aren't implemented because !fb_overlap_ok essentially has no effect.   The
existing semantics are really "prefer_fb_overlap".  This patch implements
the expected and needed semantics, which is to not allocate from the frame
buffer space when !fb_overlap_ok.

If that's an accurate high level summary, maybe this commit message
could describe it that way?  The other details you provide about what can
go wrong should still be included as well.

> Note, Gen2 VMs are usually unaffected by the issue because the
> framebuffer region is already taken by the EFI fb (in case the kernel
> supports it), but Gen1 VMs may have this region unclaimed by the time
> the Hyper-V PCI pass-through driver tries allocating MMIO space if the
> Hyper-V DRM/FB drivers load after it. Devices can be brought up in any
> sequence, so let's resolve the issue by always ignoring the 'fb_mmio'
> region for non-FB requests, even if the region is unclaimed.
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  drivers/hv/vmbus_drv.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 536f68e563c6..3c833ea60db6 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2331,7 +2331,7 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
>  			bool fb_overlap_ok)
>  {
>  	struct resource *iter, *shadow;
> -	resource_size_t range_min, range_max, start;
> +	resource_size_t range_min, range_max, start, end;
>  	const char *dev_n = dev_name(&device_obj->device);
>  	int retval;
> 
> @@ -2366,6 +2366,14 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
>  		range_max = iter->end;
>  		start = (range_min + align - 1) & ~(align - 1);
>  		for (; start + size - 1 <= range_max; start += align) {
> +			end = start + size - 1;
> +
> +			/* Skip the whole fb_mmio region if not fb_overlap_ok */
> +			if (!fb_overlap_ok && fb_mmio &&
> +			    (((start >= fb_mmio->start) && (start <= fb_mmio->end)) ||
> +			     ((end >= fb_mmio->start) && (end <= fb_mmio->end))))
> +				continue;
> +
>  			shadow = __request_region(iter, start, size, NULL,
>  						  IORESOURCE_BUSY);
>  			if (!shadow)
> --
> 2.37.1

Other than my musings on the commit message,

Reviewed-by: Michael Kelley