[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64
https://bugs.freedesktop.org/show_bug.cgi?id=110637

GitLab Migration User changed:

           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED

--- Comment #12 from GitLab Migration User ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1402.

--
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #11 from Alex Deucher ---
(In reply to Alexander Mezin from comment #10)
> (In reply to Alex Deucher from comment #3)
> > More likely a bug in the mesa OpenCL code. If you want functional OpenCL,
> > you should use the ROCm OpenCL packages.
>
> Do you mean "Mesa OpenCL is not supported/unmaintained"?

Correct.
[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #10 from Alexander Mezin ---
(In reply to Alex Deucher from comment #3)
> More likely a bug in the mesa OpenCL code. If you want functional OpenCL,
> you should use the ROCm OpenCL packages.

Do you mean "Mesa OpenCL is not supported/unmaintained"?

I still can't get any OpenCL application to work (even "Hello World" examples).

Mesa 18.3.4, 19.0.x - GPU hangs then resets. But judging by power consumption (hwmon, 70W - higher than usual idle power consumption) GPU continues to do something even after reset.

Mesa 19.1.x, git master - GPU doesn't hang but applications themselves hang on the first clFinish. Power consumption stays higher than typical idle power again.

Building ROCm from source is a huge pain.
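For anyone who wants to reproduce the hwmon power readings mentioned in this comment, here is a minimal sketch. It assumes the amdgpu driver exposes a `power1_average` attribute (in microwatts, per the kernel hwmon sysfs convention) under `/sys/class/hwmon`; some kernel and hardware combinations may expose a different attribute, in which case this returns `None`. The function name `read_amdgpu_power_watts` is made up for illustration.

```python
from pathlib import Path
from typing import Optional


def read_amdgpu_power_watts(hwmon_root: str = "/sys/class/hwmon") -> Optional[float]:
    """Return the average power draw in watts of the first amdgpu hwmon device.

    Returns None when no amdgpu device with a readable power1_average is found.
    """
    root = Path(hwmon_root)
    if not root.is_dir():
        return None
    for dev in sorted(root.iterdir()):
        try:
            # Each hwmon device reports its driver name in the "name" file.
            if (dev / "name").read_text().strip() != "amdgpu":
                continue
            # Assumption: power1_average exists; hwmon reports power in microwatts.
            return int((dev / "power1_average").read_text().strip()) / 1_000_000
        except (OSError, ValueError):
            continue
    return None


if __name__ == "__main__":
    watts = read_amdgpu_power_watts()
    print("amdgpu power: unavailable" if watts is None else f"amdgpu power: {watts:.1f} W")
```

Polling this before, during, and after an OpenCL run would show whether the GPU returns to its idle draw once the application exits.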
[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #9 from Alexander Mezin ---
Tried Mesa 19.1.0-rc3

Geekbench hangs, but there are no immediate errors in dmesg. It looks like the GPU is doing something based on 'sensors' output (~130 W power consumption, at idle it is <20W). And power consumption doesn't go down even when I kill geekbench.

When I try to reboot, the system hangs.
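Checking dmesg for the error in this bug's title can be scripted. The sketch below matches only the `*ERROR* ring gfx timeout` text quoted in the subject line, generalised to any ring name; the rest of the kernel's message format is not assumed. The function name `find_ring_timeouts` is made up for illustration.

```python
import re
from typing import Iterable, List

# Matches the fragment quoted in the bug title, with the ring name captured.
RING_TIMEOUT = re.compile(r"\*ERROR\* ring (\S+) timeout")


def find_ring_timeouts(lines: Iterable[str]) -> List[str]:
    """Return the ring name from every log line that reports a ring timeout."""
    rings = []
    for line in lines:
        match = RING_TIMEOUT.search(line)
        if match:
            rings.append(match.group(1))
    return rings
```

It can be fed the lines of saved `dmesg` output, for example `find_ring_timeouts(open("dmesg.txt"))`; an empty list corresponds to the "no immediate errors" case described above.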
[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #8 from Alexander Mezin ---
(In reply to Jan Vesely from comment #6)
> if yes can you confirm if the games hang when running with
> OCL_ICD_VENDORS=/var/empty/ ?
> (alternatively, you can just move libMesaOpenCL.* out of library path)

No, setting OCL_ICD_VENDORS didn't change anything (though I'm not completely sure that Steam and then Proton don't discard environment variables somewhere).

However, upgrading Mesa to 19.0.4 fixed game hangs. OpenCL issues are still here.
[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #7 from Alexander Mezin ---
(In reply to Jan Vesely from comment #6)
> Can you post the output of 'clinfo'?

Sure

Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 19.0.4
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     Radeon RX Vega (VEGA10, DRM 3.30.0, 5.1.0-arch1-1-ARCH, LLVM 8.0.0)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 19.0.4
  Driver Version                                  19.0.4
  Device OpenCL C Version                         OpenCL C 1.1
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Max compute units                               64
  Max clock frequency                             1630MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              64
  Preferred / native vector sizes
    char                                          16 / 16
    short                                         8 / 8
    int                                           4 / 4
    long                                          2 / 2
    half                                          8 / 8 (cl_khr_fp16)
    float                                         4 / 4
    double                                        2 / 2 (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              8573157376 (7.984GiB)
  Error Correction support                        No
  Max memory allocation                           6858525900 (6.387GiB)
  Unified memory for Host and Device              No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       32768 bits (4096 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max number of constant args                     16
  Max constant buffer size                        2147483647 (2GiB)
  Max size of kernel argument                     1024
  Queue properties
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics
[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #6 from Jan Vesely ---
Can you post the output of 'clinfo'?

GPU hangs in clover are usually signs of old LLVM, or old mesa (that does not catch function calls).

Do you use ocl-icd? If yes, can you confirm if the games hang when running with OCL_ICD_VENDORS=/var/empty/ ?
(alternatively, you can just move libMesaOpenCL.* out of library path)

(In reply to Alex Deucher from comment #3)
> More likely a bug in the mesa OpenCL code. If you want functional OpenCL,
> you should use the ROCm OpenCL packages.

I doubt that. clover uses the same LLVM code generation paths.
Also note: "the same problem with multiple games", I doubt those use OpenCL. The above steps should confirm that.
My guess is that compute shaders are busted (irrespective of the API).

GPU reset has never worked correctly on any AMD GPU that I've ever used.
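The OCL_ICD_VENDORS test suggested above can be wrapped in a small launcher so the variable is set for exactly one child process. This is only a sketch: it assumes ocl-icd is the ICD loader in use and that `/var/empty/` contains no `.icd` files, as in the comment. The helper names `env_without_opencl_icds` and `run_without_opencl` are made up for illustration.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Optional


def env_without_opencl_icds(
    base: Optional[Mapping[str, str]] = None,
    empty_dir: str = "/var/empty/",
) -> Dict[str, str]:
    """Copy an environment and point OCL_ICD_VENDORS at a directory with no ICD files."""
    env = dict(os.environ if base is None else base)
    # With ocl-icd, an empty vendors directory means no OpenCL platform is found.
    env["OCL_ICD_VENDORS"] = empty_dir
    return env


def run_without_opencl(argv: List[str]) -> int:
    """Run a command with OpenCL ICDs hidden and return its exit code."""
    return subprocess.run(argv, env=env_without_opencl_icds()).returncode
```

For example, `run_without_opencl(["clinfo"])` should report zero platforms if the variable is honoured. As the reply in comment #8 points out, launchers such as Steam or Proton may not pass the environment through unchanged, so a negative result there is not conclusive.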
[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #5 from Alexander Mezin ---
And BTW with kernel 4.19.40 and latest git firmware (https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=2579167548be33afb1fe2a9a5c141561ee5a8bbe) monitors switch off on boot as soon as amdgpu driver loads and never turn on again.
[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #3 from Alex Deucher ---
More likely a bug in the mesa OpenCL code. If you want functional OpenCL, you should use the ROCm OpenCL packages.
[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64
https://bugs.freedesktop.org/show_bug.cgi?id=110637

Alexander Mezin changed:

           What    |Removed                     |Added
            Summary|Enabling OpenCL in          |Any OpenCL application
                   |Libreoffice kills Vega 64   |causes "*ERROR* ring gfx
                   |                            |timeout" on Vega 64