I disconnected the Nvidia GPU and switched to a spare AMD GPU, and the
problem hasn't occurred since. You're probably right that it's a
hardware issue, but I don't know whether the fault lies with the GPU or
the motherboard. While this was happening, the Nvidia GPU was in the
first PCIe slot and the AMD GPU was in the second, but with no power
cables running to it from the PSU, so it was effectively off and not
being picked up by Ubuntu.

Could the motherboard be at fault in that scenario, where the second
slot has a device plugged in but not powered? At first glance the
Nvidia GPU looks like the culprit, but I haven't tested it in other
configurations, so I can't say definitively that the fault is with the
Nvidia GPU.
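
For anyone testing the card in other configurations, a minimal sketch
like the one below could help check whether the error comes back. It
scans kernel log lines for NVRM Xid messages of the form quoted in the
bug description. The regular expression is inferred only from that one
quoted line, and the names XidEvent, find_xid_events and xid_scan.py
are hypothetical, so treat it as an illustration rather than a
complete parser.

```python
import re
import sys
from typing import Iterable, Iterator, NamedTuple


class XidEvent(NamedTuple):
    """One NVRM Xid report pulled out of a kernel log line (hypothetical type)."""
    pci_address: str
    code: int
    detail: str


# Assumption: Xid lines look like the one in the bug description, e.g.
#   NVRM: Xid (PCI:0000:0b:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): (\d+),\s*(.*)")


def find_xid_events(lines: Iterable[str]) -> Iterator[XidEvent]:
    """Yield an XidEvent for every line that contains an NVRM Xid report."""
    for line in lines:
        match = XID_RE.search(line)
        if match:
            yield XidEvent(
                pci_address=match.group(1),
                code=int(match.group(2)),
                detail=match.group(3).strip(),
            )


if __name__ == "__main__":
    # Read kernel log text from standard input and print each Xid event found.
    for event in find_xid_events(sys.stdin):
        print(f"Xid {event.code} on PCI:{event.pci_address}: {event.detail}")
```

Saved as xid_scan.py, it could be fed the previous boot's kernel
messages with `journalctl -k -b -1 | python3 xid_scan.py`. No output
means no Xid lines matched, which by itself does not prove the
hardware is healthy.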

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2023585

Title:
  [nvidia] GPU has fallen off the bus

Status in linux package in Ubuntu:
  Incomplete
Status in nvidia-graphics-drivers-525 package in Ubuntu:
  New

Bug description:
  When playing Assassin's Creed Unity through Steam, the game runs fine
  for a short period and then, pretty quickly in my experience, the
  screen goes blank, the lights on the GPU turn off and the GPU fans
  spin at max RPM.

  I checked the dmesg logs from that session and saw at the bottom:

  ```
  Jun 12 19:25:09 pikachu kernel: NVRM: GPU at PCI:0000:0b:00: GPU-f888943b-327b-82af-03dd-7c4213dc4788
  Jun 12 19:25:09 pikachu kernel: NVRM: Xid (PCI:0000:0b:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
  Jun 12 19:25:09 pikachu kernel: NVRM: GPU 0000:0b:00.0: GPU has fallen off the bus.
  Jun 12 19:25:09 pikachu kernel: nvidia-gpu 0000:0b:00.3: Unable to change power state from D3hot to D0, device inaccessible
  Jun 12 19:25:09 pikachu kernel: xhci_hcd 0000:0b:00.2: Unable to change power state from D3hot to D0, device inaccessible
  Jun 12 19:25:09 pikachu kernel: xhci_hcd 0000:0b:00.2: Unable to change power state from D3cold to D0, device inaccessible
  Jun 12 19:25:09 pikachu kernel: xhci_hcd 0000:0b:00.2: Controller not ready at resume -19
  Jun 12 19:25:09 pikachu kernel: xhci_hcd 0000:0b:00.2: PCI post-resume error -19!
  Jun 12 19:25:09 pikachu kernel: xhci_hcd 0000:0b:00.2: HC died; cleaning up
  Jun 12 19:25:09 pikachu kernel: audit: type=1400 audit(1686594309.980:429): apparmor="DENIED" operation="open" class="file" profile="snap.keepassxc.keepassxc" name="/sys/devices/pci00>
  Jun 12 19:25:10 pikachu kernel: nvidia-gpu 0000:0b:00.3: i2c timeout error ffffffff
  Jun 12 19:25:10 pikachu kernel: ucsi_ccg 0-0008: i2c_transfer failed -110
  ```

  Further up in the logs I also see the following (in case it's
  related):

  ```
  [drm:nv_drm_master_set [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to grab modeset ownership
  ```

  I am using an RTX 2080Ti on driver version 525.105.17.

  I have attached the full dmesg log.

  ProblemType: Bug
  DistroRelease: Ubuntu 23.04
  Package: nvidia-driver-525 525.105.17-0ubuntu1
  ProcVersionSignature: Ubuntu 6.2.0-20.20-generic 6.2.6
  Uname: Linux 6.2.0-20-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair nvidia_modeset nvidia
  ApportVersion: 2.26.1-0ubuntu2
  Architecture: amd64
  CasperMD5CheckResult: pass
  CurrentDesktop: ubuntu:GNOME
  Date: Mon Jun 12 19:35:37 2023
  InstallationDate: Installed on 2022-12-06 (187 days ago)
  InstallationMedia: Ubuntu 22.10 "Kinetic Kudu" - Release amd64 (20221020)
  SourcePackage: nvidia-graphics-drivers-525
  UpgradeStatus: Upgraded to lunar on 2023-04-21 (51 days ago)
