[Bug 106111] [GPU Passthrough]GPU (Polaris) not reinitialized with Linux VM (Reset bug)

2019-11-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106111

Martin Peres  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED

--- Comment #8 from Martin Peres  ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further through the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/346.

-- 
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 106111] [GPU Passthrough]GPU (Polaris) not reinitialized with Linux VM (Reset bug)

2018-09-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106111

--- Comment #7 from Andrew Sheldon  ---
Another workaround that has worked for me with a Vega 56 is to suspend-to-ram
the host system before trying to start the guest again.
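
For anyone who wants to script this workaround, here is a minimal Python sketch. It assumes a systemd-based host where `systemctl suspend` triggers suspend-to-RAM; the function name and the dry-run default are illustrative and not part of this report.

```python
import subprocess


def suspend_host(dry_run=True):
    """Suspend the host to RAM between two guest runs.

    Assumption: the host runs systemd, so `systemctl suspend` is available.
    With dry_run=True the command is only returned, never executed.
    """
    cmd = ["systemctl", "suspend"]
    if not dry_run:
        # May require root or a polkit authorization on the host.
        subprocess.run(cmd, check=True)
    return cmd
```

The intended order would be: shut down the guest, call `suspend_host(dry_run=False)`, wake the host, then start the guest again.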



[Bug 106111] [GPU Passthrough]GPU (Polaris) not reinitialized with Linux VM (Reset bug)

2018-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106111

--- Comment #6 from Radosław Szkodziński  ---
This is still happening. It seems that these GPUs need engine resets before the
bus reset, similar to what was done for Fury and Polaris, but more extensive.

A temporary workaround (yeah, sure) is to unload the driver before shutting the
guest down: rmmod in a Linux guest, or eject the device in Windows. This resets
the engines.

The Windows driver did these resets on shutdown until version 18.5.1, where the
shutdown sequence was broken again. Read the release notes for the Radeon Pro
Vega FE drivers, where they actually slightly care.
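
The rmmod step inside a Linux guest could be sketched as below. This is only an illustration: it assumes the guest drives the card with the `amdgpu` module, and the helper names are made up for this example. Note that rmmod refuses to unload a module that is still in use, so the display server would have to be stopped first.

```python
import subprocess


def is_module_loaded(proc_modules_text, name="amdgpu"):
    """Return True if `name` is listed in the contents of /proc/modules."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first field of each /proc/modules line is the module name.
        if fields and fields[0] == name:
            return True
    return False


def unload_gpu_driver(name="amdgpu", dry_run=True):
    """Build (and optionally run) the rmmod command for the GPU driver.

    Assumption: run as root inside the guest, right before shutdown.
    """
    cmd = ["rmmod", name]
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

In the guest this would be called with the text read from `/proc/modules`, and `unload_gpu_driver(dry_run=False)` only when the module is actually loaded.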



[Bug 106111] [GPU Passthrough]GPU (Polaris) not reinitialized with Linux VM (Reset bug)

2018-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106111

--- Comment #5 from Max  ---
(In reply to Alex Williamson from comment #4)
> There is a difference, now we have:
> 
> [   84.997634] vfio_ecap_init: :0a:00.0 hiding ecap 0x19@0x270
> [   84.997645] vfio_ecap_init: :0a:00.0 hiding ecap 0x1b@0x2d0
> [   84.997653] vfio_ecap_init: :0a:00.0 hiding ecap 0x1e@0x370
> [  145.518307] vfio_ecap_init: :0a:00.0 hiding ecap 0x19@0x270
> [  145.518313] vfio_ecap_init: :0a:00.0 hiding ecap 0x1b@0x2d0
> [  145.518318] vfio_ecap_init: :0a:00.0 hiding ecap 0x1e@0x370
> 
> So prior to time 145.5 the VM was shut down and started again, and we could
> still read the config space of the device.  Previously we were already
> getting IOMMU faults before the second startup.  But shortly after:
> 
> [  193.328586] AMD-Vi: Completion-Wait loop timed out
> [  193.488711] AMD-Vi: Completion-Wait loop timed out
> [  194.169913] iommu ivhd0: AMD-Vi: Event logged [
> [  194.169921] iommu ivhd0: IOTLB_INV_TIMEOUT device=0a:00.0 address=0x00043e8aaca0]
> [  194.169924] iommu ivhd0: AMD-Vi: Event logged [
> [  194.169928] iommu ivhd0: IOTLB_INV_TIMEOUT device=0a:00.0 address=0x00043e8aacc0]
> 
> And the device being stuck in the D3 state is evidence that it is no longer
> accessible on the bus.  So the new kernel only delayed the issue; some
> interaction between the IOMMU and GPU is still failing.

Thanks for the explanation, Alex.
Can something be done about this, either by AMD or by the VFIO maintainers?



[Bug 106111] [GPU Passthrough]GPU (Polaris) not reinitialized with Linux VM (Reset bug)

2018-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106111

--- Comment #4 from Alex Williamson  ---
There is a difference, now we have:

[   84.997634] vfio_ecap_init: :0a:00.0 hiding ecap 0x19@0x270
[   84.997645] vfio_ecap_init: :0a:00.0 hiding ecap 0x1b@0x2d0
[   84.997653] vfio_ecap_init: :0a:00.0 hiding ecap 0x1e@0x370
[  145.518307] vfio_ecap_init: :0a:00.0 hiding ecap 0x19@0x270
[  145.518313] vfio_ecap_init: :0a:00.0 hiding ecap 0x1b@0x2d0
[  145.518318] vfio_ecap_init: :0a:00.0 hiding ecap 0x1e@0x370

So prior to time 145.5 the VM was shut down and started again, and we could
still read the config space of the device.  Previously we were already getting
IOMMU faults before the second startup.  But shortly after:

[  193.328586] AMD-Vi: Completion-Wait loop timed out
[  193.488711] AMD-Vi: Completion-Wait loop timed out
[  194.169913] iommu ivhd0: AMD-Vi: Event logged [
[  194.169921] iommu ivhd0: IOTLB_INV_TIMEOUT device=0a:00.0 address=0x00043e8aaca0]
[  194.169924] iommu ivhd0: AMD-Vi: Event logged [
[  194.169928] iommu ivhd0: IOTLB_INV_TIMEOUT device=0a:00.0 address=0x00043e8aacc0]

And the device being stuck in the D3 state is evidence that it is no longer
accessible on the bus.  So the new kernel only delayed the issue; some
interaction between the IOMMU and GPU is still failing.
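
To compare runs like the ones quoted in this comment, the AMD-Vi errors can be pulled out of a saved dmesg with a small script. The following is a rough sketch, not a tool used in this report: the function name and the returned dictionary layout are made up, and the patterns only cover the message formats that appear in this thread.

```python
import re

# IOTLB_INV_TIMEOUT event; "\s+" before "address" also tolerates the case
# where the mail client wrapped "address=..." onto the next line.
_IOTLB_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s.*?IOTLB_INV_TIMEOUT device=([0-9a-f:.]+)"
    r"\s+address=(0x[0-9a-f]+)\]"
)
_WAIT_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s.*?Completion-Wait loop timed out")


def summarize_iommu_errors(dmesg_text):
    """Collect AMD-Vi timeout messages from dmesg text.

    Returns the timestamps of Completion-Wait timeouts and, for each
    IOTLB_INV_TIMEOUT event, a (timestamp, device, address) tuple.
    """
    waits = [float(ts) for ts in _WAIT_RE.findall(dmesg_text)]
    events = [
        (float(ts), dev, int(addr, 16))
        for ts, dev, addr in _IOTLB_RE.findall(dmesg_text)
    ]
    return {"completion_wait_timeouts": waits, "iotlb_inv_timeouts": events}
```

The timestamp of the first entry in either list gives a quick way to see whether a kernel change moved the point of failure, as it did here (about 160 s before, about 193 s with 4.17-rc1).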



[Bug 106111] [GPU Passthrough]GPU (Polaris) not reinitialized with Linux VM (Reset bug)

2018-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106111

--- Comment #3 from Max  ---
Created attachment 138893
  --> https://bugs.freedesktop.org/attachment.cgi?id=138893&action=edit
dmesg after second launch + 4.17-rc1

Same problem with kernel 4.17-rc1. To be sure: I only need to install this
kernel on the host, not on the Linux guest, right?

I built my own 4.17 kernel, so maybe some IOMMU/VFIO options are missing:

odelpasso@debian-desktop:~/Bureau$ cat /boot/config-4.17.0-rc1 | grep VFIO
CONFIG_VFIO_IOMMU_TYPE1=m
CONFIG_VFIO_VIRQFD=m
CONFIG_VFIO=m
# CONFIG_VFIO_NOIOMMU is not set
CONFIG_VFIO_PCI=m
CONFIG_VFIO_PCI_VGA=y
CONFIG_VFIO_PCI_MMAP=y
CONFIG_VFIO_PCI_INTX=y
CONFIG_VFIO_PCI_IGD=y
# CONFIG_VFIO_MDEV is not set
CONFIG_KVM_VFIO=y

odelpasso@debian-desktop:~/Bureau$ cat /boot/config-4.17.0-rc1 | grep IOMMU
# CONFIG_GART_IOMMU is not set
# CONFIG_CALGARY_IOMMU is not set
CONFIG_IOMMU_HELPER=y
CONFIG_VFIO_IOMMU_TYPE1=m
# CONFIG_VFIO_NOIOMMU is not set
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y
# Generic IOMMU Pagetable Support
CONFIG_IOMMU_IOVA=y
CONFIG_AMD_IOMMU=y
CONFIG_AMD_IOMMU_V2=y
# CONFIG_INTEL_IOMMU is not set
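
A check like the two greps above can also be automated. The sketch below is only an illustration: the list of required options is my assumption of a reasonable minimum for VFIO passthrough on an AMD host, not an authoritative list, and the function names are made up.

```python
import re

# Assumed minimum for VFIO PCI passthrough on an AMD host (not authoritative).
REQUIRED = (
    "CONFIG_VFIO",
    "CONFIG_VFIO_PCI",
    "CONFIG_VFIO_IOMMU_TYPE1",
    "CONFIG_AMD_IOMMU",
)


def parse_kconfig(text):
    """Map option names to 'y', 'm', another value, or 'n' for 'is not set'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def missing_options(text, required=REQUIRED):
    """Return the required options that are absent or not set."""
    options = parse_kconfig(text)
    return [name for name in required if options.get(name, "n") == "n"]
```

Fed with the contents of /boot/config-4.17.0-rc1, `missing_options` would list any of the assumed options that are disabled; all four appear as `y` or `m` in the grep output above.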



[Bug 106111] [GPU Passthrough]GPU (Polaris) not reinitialized with Linux VM (Reset bug)

2018-04-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106111

--- Comment #2 from Alex Williamson  ---
The IOMMU looks to be unhappy first:

[   40.201258] vfio_ecap_init: :0a:00.0 hiding ecap 0x19@0x270
[   40.201271] vfio_ecap_init: :0a:00.0 hiding ecap 0x1b@0x2d0
[   40.201279] vfio_ecap_init: :0a:00.0 hiding ecap 0x1e@0x370
[  159.958402] AMD-Vi: Completion-Wait loop timed out
[  160.118777] AMD-Vi: Completion-Wait loop timed out
[  160.799864] AMD-Vi: Event logged [
[  160.799868] IOTLB_INV_TIMEOUT device=0a:00.0 address=0x00043e8e8550]
[  160.799872] AMD-Vi: Event logged [
[  160.799874] IOTLB_INV_TIMEOUT device=0a:00.0 address=0x00043e8e8570]
[  160.799876] AMD-Vi: Event logged [
[  160.799878] IOTLB_INV_TIMEOUT device=0a:00.0 address=0x00043e8e8590]
[  161.801729] AMD-Vi: Event logged [
[  161.801732] IOTLB_INV_TIMEOUT device=0a:00.0 address=0x00043e8e85e0]
[  180.096365] AMD-Vi: Completion-Wait loop timed out
[  180.256758] AMD-Vi: Completion-Wait loop timed out
[  180.417182] AMD-Vi: Completion-Wait loop timed out
[  180.577636] AMD-Vi: Completion-Wait loop timed out

Can you try a v4.17-rc1 kernel?  Specifically, these two updates:

6bd06f5a486c vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
eb5ecd1a40e2 iommu/amd: Add support for fast IOTLB flushing

AMD GPUs seem to get unhappy if the IOMMU sends out too many invalidations,
and the above two patches can reduce the number of those invalidations by up
to a factor of 512.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6bd06f5a486c06023a618a86e8153b91d26f75f4
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eb5ecd1a40e2098f805fb63cb07817ac48826e40
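
To make the "factor of 512" concrete, here is a small arithmetic sketch. It assumes the reduction comes from deferring the IOTLB flush until up to 512 unmap operations have accumulated; that batch size is inferred from the figure quoted in this comment rather than read from the patches, and the function name is hypothetical.

```python
def iotlb_flushes(unmap_calls, batch_size=1):
    """Number of IOTLB flushes if one flush covers `batch_size` unmaps.

    batch_size=1 models a flush after every unmap; a larger value models
    deferred (batched) flushing. Uses integer ceiling division.
    """
    if unmap_calls < 0 or batch_size < 1:
        raise ValueError("unmap_calls must be >= 0 and batch_size >= 1")
    return (unmap_calls + batch_size - 1) // batch_size


# Example: tearing down 4 GiB of guest memory mapped as 4 KiB pages.
pages = (4 * 1024**3) // (4 * 1024)           # 1048576 unmap operations
per_unmap = iotlb_flushes(pages)               # one flush per unmap
batched = iotlb_flushes(pages, batch_size=512) # assumed batch size
```

Under these assumptions the teardown goes from 1048576 invalidations to 2048, which is the best case; partially filled batches make the actual reduction smaller, hence "up to".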



[Bug 106111] [GPU Passthrough]GPU (Polaris) not reinitialized with Linux VM (Reset bug)

2018-04-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106111

--- Comment #1 from Max  ---
Created attachment 13
  --> https://bugs.freedesktop.org/attachment.cgi?id=13=edit
dmesg output after launching the VM a second time



[Bug 106111] [GPU Passthrough]GPU (Polaris) not reinitialized with Linux VM (Reset bug)

2018-04-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106111

            Bug ID: 106111
           Summary: [GPU Passthrough]GPU (Polaris) not reinitialized with
                    Linux VM (Reset bug)
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: berilli...@gmail.com
                CC: alexdeuc...@gmail.com

Created attachment 138887
  --> https://bugs.freedesktop.org/attachment.cgi?id=138887&action=edit
xorg.conf

Hi,

My setup:
- AMD Ryzen 1600
- 16 GB RAM
- Host (Debian Stable, kernel 4.16.2): AMD RX 560 4 GB
- Guest (Windows 10 / Arch Linux, kernel 4.15.x-4.16.x): AMD RX 580 8 GB

Years ago there was an issue with Windows virtual machines using QEMU/VFIO and
an AMD GPU. It was impossible to reboot the guest or start it a second time,
because the GPU was not reinitialized when the VM was shut down. The only ways
to use the VM again were to reboot the host or to use an Nvidia GPU.

Currently, the issue is fixed for a Windows VM with a passed-through AMD GPU (I
don't know how): I can start my VM several times without rebooting the host.

But with my Linux VM and the RX 580, the issue still exists. The first launch
works, and I can use the RX 580 for gaming without problems. But if I shut down
or reboot the guest, the RX 580 is "blocked". I need to hard reboot because the
system hangs after ~2-3 minutes.

Thanks for your help,
Maxime 

(Sorry for my English, I'm French)
