[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64

2019-09-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110637

GitLab Migration User  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED

--- Comment #12 from GitLab Migration User  ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further in the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1402.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64

2019-07-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #11 from Alex Deucher  ---
(In reply to Alexander Mezin from comment #10)
> (In reply to Alex Deucher from comment #3)
> > More likely a bug in the mesa OpenCL code.  If you want functional OpenCL,
> > you should use the ROCm OpenCL packages.
> 
> Do you mean "Mesa OpenCL is not supported/unmaintained"?

Correct.


[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64

2019-07-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #10 from Alexander Mezin  ---
(In reply to Alex Deucher from comment #3)
> More likely a bug in the mesa OpenCL code.  If you want functional OpenCL,
> you should use the ROCm OpenCL packages.

Do you mean "Mesa OpenCL is not supported/unmaintained"?

I still can't get any OpenCL application to work (even "Hello World" examples).

Mesa 18.3.4, 19.0.x: the GPU hangs, then resets. But judging by power
consumption (hwmon reports 70W, higher than the usual idle figure), the GPU
continues to do something even after the reset.

Mesa 19.1.x, git master: the GPU doesn't hang, but the applications themselves
hang on the first clFinish. Power consumption again stays higher than typical
idle.
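
The hwmon observation above can be repeated with a small script. This is a minimal sketch, assuming the amdgpu hwmon node exposes `power1_average` in microwatts under `/sys/class/drm/card*/device/hwmon/`. The helper names and the "twice idle" threshold are hypothetical, chosen only for illustration; the 20 W idle figure is the one reported in comment #9.

```python
from pathlib import Path


def read_power_watts(power_file):
    """Return the reading stored in an hwmon power file, converted to watts.

    Assumption: the file holds a single integer in microwatts.
    """
    microwatts = int(Path(power_file).read_text().strip())
    return microwatts / 1_000_000


def find_power_files(sysfs_root="/sys/class/drm"):
    """List candidate power1_average files for all DRM cards (layout assumed)."""
    pattern = "card*/device/hwmon/hwmon*/power1_average"
    return sorted(Path(sysfs_root).glob(pattern))


def looks_busy(watts, idle_watts=20.0):
    """Heuristic only: flag readings above twice the reported idle figure."""
    return watts > 2 * idle_watts
```

Calling `read_power_watts` on each path returned by `find_power_files()` before and after killing the OpenCL application would show whether the draw returns to the idle level.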

Building ROCm from source is a huge pain.


[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64

2019-05-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #9 from Alexander Mezin  ---
Tried Mesa 19.1.0-rc3.
Geekbench hangs, but there are no immediate errors in dmesg. Judging by
'sensors' output, the GPU is still doing something (~130 W power consumption;
at idle it is <20 W). Power consumption doesn't go down even when I kill
Geekbench. When I try to reboot, the system hangs.
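
Checking dmesg for the error named in the bug title can be automated. The sketch below is an illustration, not part of the original report: it searches log text for the `*ERROR* ring gfx timeout` string quoted in the title. Generalising the pattern to other ring names is an assumption, and `find_ring_timeouts` is a hypothetical helper name.

```python
import re

# The bug title quotes '*ERROR* ring gfx timeout'. Matching any ring name
# (not only 'gfx') is an assumption about how other rings are reported.
TIMEOUT_RE = re.compile(r"\*ERROR\* ring (\S+) timeout")


def find_ring_timeouts(log_text):
    """Return (line_number, ring_name) for each ring-timeout line in log_text."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = TIMEOUT_RE.search(line)
        if match:
            hits.append((number, match.group(1)))
    return hits
```

The function takes the log as a string, so it can be fed saved `dmesg` output or a journal export. An empty result matches the "no immediate errors" case described above.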


[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64

2019-05-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #8 from Alexander Mezin  ---
(In reply to Jan Vesely from comment #6)
> if yes can you confirm if the games hang when running with
> OCL_ICD_VENDORS=/var/empty/ ?
> (alternatively, you can just move libMesaOpenCL.* out of library path)

No, setting OCL_ICD_VENDORS didn't change anything (though I'm not completely
sure that Steam and then Proton don't discard environment variables somewhere).

However, upgrading Mesa to 19.0.4 fixed the game hangs. The OpenCL issues are
still present.


[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64

2019-05-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #7 from Alexander Mezin  ---
(In reply to Jan Vesely from comment #6)
> Can you post the output of 'clinfo'?

Sure

Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 19.0.4
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     Radeon RX Vega (VEGA10, DRM 3.30.0, 5.1.0-arch1-1-ARCH, LLVM 8.0.0)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 19.0.4
  Driver Version                                  19.0.4
  Device OpenCL C Version                         OpenCL C 1.1
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Max compute units                               64
  Max clock frequency                             1630MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              64
  Preferred / native vector sizes
    char                                          16 / 16
    short                                         8 / 8
    int                                           4 / 4
    long                                          2 / 2
    half                                          8 / 8 (cl_khr_fp16)
    float                                         4 / 4
    double                                        2 / 2 (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              8573157376 (7.984GiB)
  Error Correction support                        No
  Max memory allocation                           6858525900 (6.387GiB)
  Unified memory for Host and Device              No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       32768 bits (4096 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max number of constant args                     16
  Max constant buffer size                        2147483647 (2GiB)
  Max size of kernel argument                     1024
  Queue properties
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics

[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64

2019-05-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #6 from Jan Vesely  ---
Can you post the output of 'clinfo'?
GPU hangs in clover are usually signs of old LLVM, or old mesa (that does not
catch function calls).

Do you use ocl-icd?
if yes can you confirm if the games hang when running with
OCL_ICD_VENDORS=/var/empty/ ?
(alternatively, you can just move libMesaOpenCL.* out of library path)
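
The OCL_ICD_VENDORS test suggested above can be scripted so that the variable is set on the launched process itself. This is a minimal sketch, assuming the system uses the ocl-icd loader, which reads its vendor files from the directory named by `OCL_ICD_VENDORS`. The function names are hypothetical, and a fresh temporary directory stands in for `/var/empty/`.

```python
import os
import subprocess
import tempfile


def env_without_icds(empty_dir, base_env=None):
    """Copy the environment and point OCL_ICD_VENDORS at an empty directory."""
    env = dict(os.environ if base_env is None else base_env)
    env["OCL_ICD_VENDORS"] = empty_dir
    return env


def run_without_icds(command):
    """Run command (a list of arguments) with no ICD vendor files visible.

    Assumption: the OpenCL loader in use honours OCL_ICD_VENDORS.
    Returns the exit status of the command.
    """
    with tempfile.TemporaryDirectory() as empty_dir:
        env = env_without_icds(empty_dir)
        return subprocess.run(command, env=env).returncode
```

Running `clinfo` through `run_without_icds(["clinfo"])` should then report zero platforms if the variable takes effect. Launchers that rebuild the environment for their children would still need to be checked separately.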



(In reply to Alex Deucher from comment #3)
> More likely a bug in the mesa OpenCL code.  If you want functional OpenCL,
> you should use the ROCm OpenCL packages.

I doubt that. Clover uses the same LLVM code generation paths.
Also note "the same problem with multiple games"; I doubt those use OpenCL.
The above steps should confirm that.
My guess is that compute shaders are busted (irrespective of the API).
GPU reset has never worked correctly on any AMD GPU that I've ever used.


[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64

2019-05-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #5 from Alexander Mezin  ---
And BTW, with kernel 4.19.40 and the latest git firmware
(https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=2579167548be33afb1fe2a9a5c141561ee5a8bbe)
the monitors switch off on boot as soon as the amdgpu driver loads and never
turn on again.


[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64

2019-05-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110637

--- Comment #3 from Alex Deucher  ---
More likely a bug in the mesa OpenCL code.  If you want functional OpenCL, you
should use the ROCm OpenCL packages.


[Bug 110637] Any OpenCL application causes "*ERROR* ring gfx timeout" on Vega 64

2019-05-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110637

Alexander Mezin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Enabling OpenCL in          |Any OpenCL application
                   |Libreoffice kills Vega 64   |causes "*ERROR* ring gfx
                   |                            |timeout" on Vega 64
