Re: [Mesa-dev] [PATCH 1/5] clover/memory: Copy data when creating buffers with CL_MEM_USE_HOST_PTR

2017-08-04 Thread Grigori Goronzy
On 2017-08-03 22:26, Alex Deucher wrote: IIRC, user_ptrs require page alignment. Alex I didn't follow the whole discussion (sorry if I'm saying something redundant), but AMD's older OpenCL Optimization Guide [1] has some notes regarding the implementation of the USE_HOST_PTR flag. It

Re: [Mesa-dev] [PATCH 1/2] glx: add support for GLX_ARB_create_context_no_error

2017-08-03 Thread Grigori Goronzy
Hi, a patch is also needed to make this work for Xorg; it is on the xorg-devel list, and a preliminary piglit test to verify the functionality is on the piglit list. Grigori On 2017-08-03 20:07, Grigori Goronzy wrote: --- src/glx/dri2_glx.c | 12 src/glx/dri3_glx.c

[Mesa-dev] [PATCH 2/2] st/glx: add support for GLX_ARB_create_context_no_error

2017-08-03 Thread Grigori Goronzy
--- src/gallium/state_trackers/glx/xlib/glx_api.c | 55 --- src/gallium/state_trackers/glx/xlib/xm_api.c | 6 ++- src/gallium/state_trackers/glx/xlib/xm_api.h | 4 +- 3 files changed, 57 insertions(+), 8 deletions(-) diff --git

[Mesa-dev] [PATCH 1/2] glx: add support for GLX_ARB_create_context_no_error

2017-08-03 Thread Grigori Goronzy
--- src/glx/dri2_glx.c | 12 src/glx/dri3_glx.c | 8 src/glx/dri_common.c| 52 - src/glx/dri_common.h| 5 + src/glx/drisw_glx.c | 3 +++ src/glx/glxclient.h | 6 ++ src/glx/glxextensions.c

Re: [Mesa-dev] [PATCH] egl: fix check for KHR_no_error vs debug/robustness

2017-07-26 Thread Grigori Goronzy
On 2017-07-19 23:51, Grigori Goronzy wrote: The check is too aggressive and might also fail if context flags appear after the no-error attribute in the context attribute list. Delay the check to after attribute parsing to fix this. --- This was found by the piglit test I just sent to the piglit

Re: [Mesa-dev] [PATCH] dri: Make classic drivers allow __DRI_CTX_FLAG_NO_ERROR.

2017-07-20 Thread Grigori Goronzy
On 2017-07-18 20:25, Ian Romanick wrote: On 07/14/2017 04:10 PM, Kenneth Graunke wrote: Grigori recently added EGL_KHR_create_context_no_error support, which causes EGL to pass a new __DRI_CTX_FLAG_NO_ERROR flag to drivers when requesting an appropriate context mode. driContextSetFlags() will

[Mesa-dev] [PATCH] egl: fix check for KHR_no_error vs debug/robustness

2017-07-19 Thread Grigori Goronzy
The check is too aggressive and might also fail if context flags appear after the no-error attribute in the context attribute list. Delay the check to after attribute parsing to fix this. --- This was found by the piglit test I just sent to the piglit ML. I promise, next time I'll write tests
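The fix described above (validate only after the whole attribute list has been parsed) can be sketched roughly as follows. This is a minimal illustration, not the actual Mesa code: the attribute keys, flag bits and the function name are hypothetical stand-ins for the real EGL tokens.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute keys and flag bits, not the real EGL values. */
enum { ATTR_NONE = 0, ATTR_FLAGS = 1, ATTR_NO_ERROR = 2 };
enum { FLAG_DEBUG = 0x1, FLAG_ROBUST = 0x2 };

/* Parse a key/value list terminated by ATTR_NONE and return true if the
 * combination is acceptable. The no-error vs. debug/robustness conflict
 * is checked after the loop, so the order of attributes does not matter. */
bool validate_context_attribs(const int *attribs)
{
    int flags = 0;
    bool no_error = false;

    for (size_t i = 0; attribs != NULL && attribs[i] != ATTR_NONE; i += 2) {
        switch (attribs[i]) {
        case ATTR_FLAGS:
            flags = attribs[i + 1];
            break;
        case ATTR_NO_ERROR:
            no_error = attribs[i + 1] != 0;
            break;
        default:
            return false; /* unknown attribute */
        }
    }

    /* Deferred check: all attributes have been seen at this point. */
    if (no_error && (flags & (FLAG_DEBUG | FLAG_ROBUST)))
        return false;

    return true;
}
```

With the check placed inside the loop instead, a list that names the no-error attribute first and the flags second would be validated against flags that have not been read yet.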

Re: [Mesa-dev] [PATCH] dri: Make classic drivers allow __DRI_CTX_FLAG_NO_ERROR.

2017-07-18 Thread Grigori Goronzy
On 2017-07-18 20:25, Ian Romanick wrote: On 07/14/2017 04:10 PM, Kenneth Graunke wrote: Grigori recently added EGL_KHR_create_context_no_error support, which causes EGL to pass a new __DRI_CTX_FLAG_NO_ERROR flag to drivers when requesting an appropriate context mode. driContextSetFlags() will

Re: [Mesa-dev] [PATCH 4/4] dri: Add KHR_no_error toggle to driconf

2017-07-18 Thread Grigori Goronzy
On 2017-07-17 19:21, Emil Velikov wrote: On 13 July 2017 at 12:09, Grigori Goronzy <g...@chown.ath.cx> wrote: On 2017-07-12 15:15, Emil Velikov wrote: As mentioned in earlier commit no_error should be device agnostic. Hence removing the st/dri bits and adding a DRI_CONF_MESA_NO_ERROR(

Re: [Mesa-dev] [PATCH] dri: Make classic drivers allow __DRI_CTX_FLAG_NO_ERROR.

2017-07-14 Thread Grigori Goronzy
, but the classic drivers all have code to explicitly balk at unknown flags. We need to let it through or they'll fail to create a no_error context. I can't test it, but LGTM, so: Reviewed-by: Grigori Goronzy <g...@chown.ath.cx> --- src/mesa/drivers/dri/i915/intel_screen.c | 2 +- src/mesa/d

Re: [Mesa-dev] [PATCH] egl: Fix predecence problem when setting __DRI_CTX_FLAG_NO_ERROR

2017-07-14 Thread Grigori Goronzy
On 2017-07-14 23:30, Kenneth Graunke wrote: This accidentally set __DRI_CTX_FLAG_NO_ERROR whenever any flags were present. Just needs extra parenthesis. Fixes: 4909519a6655 (egl: Add EGL_KHR_create_context_no_error support) Reviewed-by: Grigori Goronzy <g...@chown.ath.cx> Sorry for br
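The kind of precedence problem described in this thread can be reproduced in a few lines of C. The sketch below is an assumption about the general shape of the bug, not a copy of the Mesa code; the flag value and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the real flag value. */
#define EXAMPLE_FLAG_NO_ERROR 0x8u

/* Buggy form: `|` binds tighter than `?:`, so this parses as
 *   (flags | no_error) ? EXAMPLE_FLAG_NO_ERROR : 0
 * and yields only the no-error bit whenever any flag is present. */
uint32_t combine_flags_buggy(uint32_t flags, bool no_error)
{
    return flags | no_error ? EXAMPLE_FLAG_NO_ERROR : 0;
}

/* Fixed form: the parentheses restrict the conditional to the extra bit. */
uint32_t combine_flags_fixed(uint32_t flags, bool no_error)
{
    return flags | (no_error ? EXAMPLE_FLAG_NO_ERROR : 0);
}
```

Compilers such as GCC and Clang can usually flag the buggy form when parentheses warnings are enabled.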

[Mesa-dev] [PATCH] mesa/marshal: fix Windows build

2017-07-14 Thread Grigori Goronzy
This was broken by commit 1ad24faa. --- src/mesa/main/marshal.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/mesa/main/marshal.h b/src/mesa/main/marshal.h index f2dc842..63e0295 100644 --- a/src/mesa/main/marshal.h +++ b/src/mesa/main/marshal.h @@ -257,7 +257,7 @@

[Mesa-dev] [PATCH v2 1/4] dri: Add KHR_no_error DRI extension

2017-07-13 Thread Grigori Goronzy
This basic extension allows usage of the __DRI_CTX_FLAG_NO_ERROR flag. This includes support code for classic Mesa drivers to switch on the no-error mode if the flag is set. v2: Move to common DRI code. --- include/GL/internal/dri_interface.h | 19 +++

[Mesa-dev] [PATCH v2 3/4] egl: Add EGL_KHR_create_context_no_error support

2017-07-13 Thread Grigori Goronzy
This only adds the EGL side, needs to be plumbed into Mesa frontend. v2: Add check for extension availability. --- src/egl/drivers/dri2/egl_dri2.c | 20 ++-- src/egl/drivers/dri2/egl_dri2.h | 1 + src/egl/main/eglapi.c | 1 + src/egl/main/eglcontext.c | 31

[Mesa-dev] [PATCH v2 2/4] st/mesa: add support for KHR_no_error flag

2017-07-13 Thread Grigori Goronzy
Add a new context flag and plumb it through the various layers of the context creation code to set up dispatch tables for the no-error mode. --- src/gallium/include/state_tracker/st_api.h | 1 + src/gallium/state_trackers/dri/dri_context.c | 3 +++ src/mesa/state_tracker/st_context.c

[Mesa-dev] [PATCH v2 4/4] st/mesa: Add KHR_no_error toggle to driconf

2017-07-13 Thread Grigori Goronzy
Allows applications to be whitelisted. v2: Remove misguided DRI common part. --- src/gallium/state_trackers/dri/dri_context.c| 3 +++ src/gallium/state_trackers/dri/dri_screen.c | 1 + src/mesa/drivers/dri/common/xmlpool/t_options.h | 5 + 3 files changed, 9 insertions(+) diff --git

Re: [Mesa-dev] [PATCH 4/4] dri: Add KHR_no_error toggle to driconf

2017-07-13 Thread Grigori Goronzy
On 2017-07-12 15:15, Emil Velikov wrote: As mentioned in earlier commit no_error should be device agnostic. Hence removing the st/dri bits and adding a DRI_CONF_MESA_NO_ERROR() line next to DRI_CONF_VBLANK_MODE seems like the better solution. Hm, driconf overrides are typically set per screen

Re: [Mesa-dev] [PATCH 3/4] st/mesa: add support for KHR_no_error flag

2017-07-12 Thread Grigori Goronzy
On 2017-07-12 15:08, Emil Velikov wrote: On 11 July 2017 at 23:26, Grigori Goronzy <g...@chown.ath.cx> wrote: Add a new context flag and plumb it through the various layers of the context creation code to set up dispatch tables for the no-error mode. --- src/gallium/include/state_t

Re: [Mesa-dev] KHR_no_error improvements

2017-07-12 Thread Grigori Goronzy
On 2017-07-12 15:16, Emil Velikov wrote: On 11 July 2017 at 23:26, Grigori Goronzy <g...@chown.ath.cx> wrote: Hi, this series implements support for the EGL_KHR_create_context_no_error extension and the associated plumbing through the different layers of Mesa - EGL, DRI, Gallium state t

Re: [Mesa-dev] [PATCH 1/4] egl: Add EGL_KHR_create_context_no_error support

2017-07-12 Thread Grigori Goronzy
On 2017-07-12 12:33, Eric Engestrom wrote: + case EGL_CONTEXT_OPENGL_NO_ERROR_KHR: + if (dpy->Version < 14) { +err = EGL_BAD_ATTRIBUTE; +break; + } + + /* The KHR_no_error spec only applies against OpenGL 2.0+ and + * OpenGL ES 2.0+

[Mesa-dev] [PATCH 2/4] dri: Add KHR_no_error DRI extension

2017-07-11 Thread Grigori Goronzy
This basic extension allows usage of the __DRI_CTX_FLAG_NO_ERROR flag. This includes support code for classic Mesa drivers to switch on the no-error mode if the flag is set. --- include/GL/internal/dri_interface.h | 19 +++ src/gallium/state_trackers/dri/dri2.c|

[Mesa-dev] [PATCH 1/4] egl: Add EGL_KHR_create_context_no_error support

2017-07-11 Thread Grigori Goronzy
This only adds the EGL side, needs to be plumbed into Mesa frontend. --- src/egl/drivers/dri2/egl_dri2.c | 20 ++-- src/egl/drivers/dri2/egl_dri2.h | 1 + src/egl/main/eglapi.c | 1 + src/egl/main/eglcontext.c | 30 ++

[Mesa-dev] KHR_no_error improvements

2017-07-11 Thread Grigori Goronzy
Hi, this series implements support for the EGL_KHR_create_context_no_error extension and the associated plumbing through the different layers of Mesa - EGL, DRI, Gallium state tracker, Mesa frontend. It took me a while to figure out how everything is connected together and still it's somewhat

[Mesa-dev] [PATCH 3/4] st/mesa: add support for KHR_no_error flag

2017-07-11 Thread Grigori Goronzy
Add a new context flag and plumb it through the various layers of the context creation code to set up dispatch tables for the no-error mode. --- src/gallium/include/state_tracker/st_api.h | 1 + src/gallium/state_trackers/dri/dri_context.c | 3 +++ src/mesa/state_tracker/st_context.c

[Mesa-dev] [PATCH 4/4] dri: Add KHR_no_error toggle to driconf

2017-07-11 Thread Grigori Goronzy
Allows applications to be whitelisted. --- src/gallium/state_trackers/dri/dri_context.c| 3 +++ src/gallium/state_trackers/dri/dri_screen.c | 1 + src/mesa/drivers/dri/common/dri_util.c | 3 +++ src/mesa/drivers/dri/common/xmlpool/t_options.h | 5 + 4 files changed, 12

[Mesa-dev] [PATCH] mesa/marshal: fix glNamedBufferData with NULL data

2017-07-10 Thread Grigori Goronzy
The semantics are similar to glBufferData. Fixes a crash with VMWare Player. Signed-off-by: Grigori Goronzy <g...@chown.ath.cx> --- src/mesa/main/marshal.c | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/src/mesa/main/marshal.c b/src/mesa/main/mar

Re: [Mesa-dev] [PATCH] mesa/marshal: add custom marshalling for glNamedBuffer(Sub)Data

2017-07-09 Thread Grigori Goronzy
On 2017-06-26 15:51, Marc Dietrich wrote: Am Montag, 26. Juni 2017, 15:35:15 CEST schrieb Grigori Goronzy: On 2017-06-26 15:11, Marc Dietrich wrote: > unfortunately, this change broke vmware/vmplayer here (bisected). > Windows > guest on linux host. Sig 11 in SVGA driver.

Re: [Mesa-dev] [PATCH 1/2] mesa/marshal: extract ClearBuffer helpers

2017-07-09 Thread Grigori Goronzy
On 2017-07-09 18:52, Matt Turner wrote: +static inline size_t buffer_to_size(GLenum buffer) +{ + switch (buffer) { + case GL_COLOR: + return 4; + case GL_DEPTH_STENCIL: + return 2; + case GL_STENCIL: + case GL_DEPTH: + return 1; + default: + return 0; + } +} +

[Mesa-dev] [PATCH 2/2] mesa/marshal: add marshalling for glClearBuffer*

2017-07-09 Thread Grigori Goronzy
Add async marshalling/unmarshalling for all glClearBuffer variants. These entry points are commonly used in general and Alien Isolation specifically uses glClearBufferiv. Slightly reduces the number of thread synchronizations with glthread in that game. --- src/mapi/glapi/gen/GL3x.xml | 6 +-

[Mesa-dev] [PATCH 1/2] mesa/marshal: extract ClearBuffer helpers

2017-07-09 Thread Grigori Goronzy
Extract clear buffer helper functions in preparation for adding marshal/unmarshal functions for the various glClearBuffer variants. --- src/mesa/main/marshal.c | 74 +++-- src/mesa/main/marshal.h | 5 ++-- 2 files changed, 50 insertions(+), 29

Re: [Mesa-dev] [PATCH] glthread: get rid of unmarshal dispatch enum/table

2017-07-07 Thread Grigori Goronzy
the switch/case block into an efficient jump table with the ID method, so an array for function lookup instead of that doesn't improve anything. I didn't see any measurable benefit of the function pointer method either. Best regards Grigori On Fri, Jun 30, 2017 at 7:14 PM, Grigori Goronzy

Re: [Mesa-dev] [PATCH] glthread: get rid of unmarshal dispatch enum/table

2017-06-30 Thread Grigori Goronzy
On 2017-06-30 15:27, Nicolai Hähnle wrote: On 30.06.2017 02:29, Grigori Goronzy wrote: Use function pointers to identify the unmarshalling function, which is simpler and gets rid of a lot of generated code. This removes an indirection and possibly results in a slight speedup as well. The fact

[Mesa-dev] [PATCH] glthread: get rid of unmarshal dispatch enum/table

2017-06-29 Thread Grigori Goronzy
Use function pointers to identify the unmarshalling function, which is simpler and gets rid of a lot of generated code. This removes an indirection and possibly results in a slight speedup as well. --- src/mapi/glapi/gen/Makefile.am | 4 -- src/mapi/glapi/gen/gl_marshal.py | 36
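As a rough illustration of the idea in this patch, each queued command can carry a pointer to its own unmarshalling function, so the consumer loop needs neither an ID enum nor a lookup table. The sketch below is heavily simplified and entirely hypothetical: it uses fixed-size commands and an `int` as the "context", whereas glthread batches hold variable-size commands.

```c
#include <stddef.h>

struct cmd_base;

/* Every command records the function that knows how to execute it. */
typedef void (*unmarshal_func)(const struct cmd_base *cmd, int *state);

struct cmd_base {
    unmarshal_func func; /* replaces a dispatch ID plus table lookup */
    int arg;             /* hypothetical payload */
};

void unmarshal_add(const struct cmd_base *cmd, int *state)
{
    *state += cmd->arg;
}

void unmarshal_mul(const struct cmd_base *cmd, int *state)
{
    *state *= cmd->arg;
}

/* Consumer side: call each command's function pointer in order. */
int execute_batch(const struct cmd_base *cmds, size_t count, int state)
{
    for (size_t i = 0; i < count; i++)
        cmds[i].func(&cmds[i], &state);
    return state;
}
```

The trade-off is a larger per-command header (a pointer instead of a small ID) in exchange for dropping the generated enum and table.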

Re: [Mesa-dev] [PATCH] mesa/marshal: add custom marshalling for glNamedBuffer(Sub)Data

2017-06-26 Thread Grigori Goronzy
don't really get it, by the way. Isn't the SVGA driver for Linux guests? Best regards Grigori > Best regards > Grigori > >> [1] >> https://lists.freedesktop.org/archives/mesa-dev/2017-June/160329.html >> >> On 25/06/17 02:59, Grigori Goronzy wrote: >>

Re: [Mesa-dev] [PATCH] radeonsi: enable LLVM sisched for Unigine Superposition

2017-06-25 Thread Grigori Goronzy
On 2017-06-22 17:10, Marek Olšák wrote: From: Marek Olšák +2.3% better score on Fiji. It might be better without HBM. Is this really useful? Superposition is a benchmark. It would make more sense if this also targeted some actual games. Optimizations specific to only

Re: [Mesa-dev] [PATCH] mesa/marshal: add custom marshalling for glNamedBuffer(Sub)Data

2017-06-25 Thread Grigori Goronzy
surprise me if it is in the 40-50% region with both, though. Best regards Grigori [1] https://lists.freedesktop.org/archives/mesa-dev/2017-June/160329.html On 25/06/17 02:59, Grigori Goronzy wrote: These entry points are used by Alien Isolation and caused synchronization with glthread

[Mesa-dev] [PATCH] mesa/marshal: add custom marshalling for glNamedBuffer(Sub)Data

2017-06-24 Thread Grigori Goronzy
These entry points are used by Alien Isolation and caused synchronization with glthread. The async marshalling implementation is similar to glBuffer(Sub)Data. Results in an approximately 6x drop in glthread synchronizations and a ~30% FPS jump in Alien Isolation (Medium preset, Athlon 860K, RX

Re: [Mesa-dev] [PATCH] radeonsi: don't emit partial flushes at the end of IBs (v2)

2017-06-23 Thread Grigori Goronzy
On 2017-06-23 13:48, Andy Furniss wrote: Marek Olšák wrote: From: Marek Olšák The kernel sort of does the same thing with fences. v2: do emit partial flushes on SI Bugzilla seems to be down currently so replying here. On R9 285 with current agd5f 4.13-wip kernel I get

Re: [Mesa-dev] [PATCH 2/4] util/disk_cache: compress individual cache entries

2017-03-02 Thread Grigori Goronzy
e a better compromise, particularly for systems with a slow CPU. Apart from that, consider the series Reviewed-by: Grigori Goronzy <g...@chown.ath.cx> Best regards Grigori Am Donnerstag, 2. März 2017, 03:20:05 CET schrieb Matt Turner: On Wed, Mar 1, 2017 at 2:19 PM, Timothy Arceri <ta

Re: [Mesa-dev] Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?)

2016-10-19 Thread Grigori Goronzy
On 2016-10-04 12:32, Emil Velikov wrote: On 2 October 2016 at 14:17, Axel Davy wrote: I'd prefer myself Oct 14, because we have a lot of patches for nine, and they deserve more cleaning and testing, but if it's Oct 7, we'll try be on time. 14th it is. As mentioned before:

[Mesa-dev] [PATCH 1/3] radv: add missing unreachable

2016-10-11 Thread Grigori Goronzy
--- src/amd/vulkan/radv_descriptor_set.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/amd/vulkan/radv_descriptor_set.c b/src/amd/vulkan/radv_descriptor_set.c index d1d2b1f..ba8a002 100644 --- a/src/amd/vulkan/radv_descriptor_set.c +++ b/src/amd/vulkan/radv_descriptor_set.c @@ -113,6

[Mesa-dev] [PATCH 3/3] radv: fix strict aliasing violation

2016-10-11 Thread Grigori Goronzy
--- src/amd/vulkan/radv_pipeline_cache.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/amd/vulkan/radv_pipeline_cache.c b/src/amd/vulkan/radv_pipeline_cache.c index 032a7e4..85a2b6d 100644 --- a/src/amd/vulkan/radv_pipeline_cache.c +++

[Mesa-dev] [PATCH 2/3] radv: fix uninitialized variables

2016-10-11 Thread Grigori Goronzy
This gets rid of "may be used uninitialized" compiler warnings. --- src/amd/vulkan/radv_formats.c | 2 +- src/amd/vulkan/radv_pipeline.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c index 90c140c..76d5fa1

Re: [Mesa-dev] [PATCH 1/2] vl: add a bicubic interpolation filter(v4)

2016-06-28 Thread Grigori Goronzy
On 2016-06-28 11:25, Nayan Deshmukh wrote: This is a shader based bicubic interpolator which uses cubic Hermite spline algorithm. v2: set dst_area and dst_clip during scaling (Christian) v3: clear the render target before rendering v4: initialize offsets while initializing shaders use a

Re: [Mesa-dev] [PATCH] radeon/uvd: fix the H264 level for Tonga

2016-05-30 Thread Grigori Goronzy
On 2016-05-27 15:16, Emil Velikov wrote: The odd thing is that VLC uses/used to? check that information before feeding the video to the decoder, while other implementations (like the original one in mplayer done by the Nvidia devs) do/did? not bother. Many files either have an incorrect

Re: [Mesa-dev] [PATCH 1/2] winsys/amdgpu: adjust IB size based on buffer wait time

2016-04-20 Thread Grigori Goronzy
any calls into the kernel, right? The winsys code makes that conditional and calls into the kernel when no fence pointer is available. Grigori On 19.04.2016 18:13, Grigori Goronzy wrote: Small IBs help to reduce stalls for workloads that require a lot of synchronization. On the other hand

[Mesa-dev] [PATCH 2/2] winsys/amdgpu: clean up and fix switch statement

2016-04-19 Thread Grigori Goronzy
Add missing break, add default case. Additionally initialize variables to avoid compiler warnings. --- src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c

[Mesa-dev] [PATCH 1/2] winsys/amdgpu: adjust IB size based on buffer wait time

2016-04-19 Thread Grigori Goronzy
Small IBs help to reduce stalls for workloads that require a lot of synchronization. On the other hand, if there is no notable synchronization, we can use a large IB size to slightly improve performance in some cases. This introduces tuning of the IB size based on feedback on the average buffer
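A feedback rule of the kind described here might look like the sketch below. It is only an assumption about the general approach: the size limits, the wait-time threshold and the halve/double policy are invented for illustration and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical tuning constants, chosen only for this example. */
#define IB_SIZE_MIN_DW    (4u * 1024u)
#define IB_SIZE_MAX_DW    (32u * 1024u)
#define WAIT_THRESHOLD_NS 200000u

/* Pick the next IB size from the current one and the average time spent
 * waiting on buffers: long waits suggest heavy synchronization, so shrink
 * the IB; short waits allow growing it again. */
uint32_t tune_ib_size(uint32_t cur_size_dw, uint64_t avg_wait_ns)
{
    uint32_t next;

    if (avg_wait_ns > WAIT_THRESHOLD_NS)
        next = cur_size_dw / 2;
    else
        next = cur_size_dw * 2;

    if (next < IB_SIZE_MIN_DW)
        next = IB_SIZE_MIN_DW;
    if (next > IB_SIZE_MAX_DW)
        next = IB_SIZE_MAX_DW;

    return next;
}
```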

Re: [Mesa-dev] [RFC] dynamic IB size tuning for radeonsi

2016-04-17 Thread Grigori Goronzy
Interesting, and thanks for poking at this issue. I've been thinking about tuning IB sizes as well. I'd like for us to get this right, so I wonder: What's your theory for _why_ your change helps? See below. I think you discovered it yourself. I'll be honest with you: Right now, I think your

Re: [Mesa-dev] [PATCH 1/4] gallium/radeon: add clear_texture function

2016-04-16 Thread Grigori Goronzy
On 2016-04-15 20:30, Jakob Sinclair wrote: In other places in radeonsi that require reinterpretation (e.g. si_blit.c), the surface template is modified instead of changing the surface after creation. I'm not sure if r600/radeonsi like it if the format is changed late like here. Seems to be

[Mesa-dev] [RFC] dynamic IB size tuning for radeonsi

2016-04-15 Thread Grigori Goronzy
Hi, apps that cause a lot of synchronization benefit from small IB sizes. The current IB size is a bit on the large side for this class of apps. On the other hand, if there isn't much synchronization going on, increasing the IB size can slightly improve performance, too. Here's a quick hack that

[Mesa-dev] [PATCH] amdgpu/winsys: adjust IB size based on buffer wait time

2016-04-15 Thread Grigori Goronzy
Small IBs help to reduce stalls for workloads that require a lot of synchronization. On the other hand, if there is no notable synchronization, we can use a large IB size to slightly improve performance in some cases. This introduces tuning of the IB size based on feedback on the average buffer

Re: [Mesa-dev] [PATCH 1/4] gallium/radeon: add clear_texture function

2016-04-15 Thread Grigori Goronzy
On 2016-04-15 18:38, Ilia Mirkin wrote: + } else { + union pipe_color_union color; + switch (util_format_get_blocksizebits(res->format)) { + case 128: + sf->format = PIPE_FORMAT_R32G32B32A32_UINT; Just as an FYI... this is

Re: [Mesa-dev] [PATCH] radeonsi: fix mask checking when emitting scissors and viewports

2016-04-11 Thread Grigori Goronzy
issor_enable; /* The simple case: Only 1 viewport is active. */ - if (mask & 1 && - !si_get_vs_info(sctx)->writes_viewport_index) { + if (!si_get_vs_info(sctx)->writes_viewport_index) { + if (!(mask & 1)) + return; +

Re: [Mesa-dev] [PATCH 0/5] R600, GCN: Guard Band support

2016-04-11 Thread Grigori Goronzy
ssor & viewport code is deleted. Thanks for implementing this properly. Reviewed-by: Grigori Goronzy <g...@chown.ath.cx> Grigori

[Mesa-dev] [PATCH v2] radeonsi: use guard band clipping

2016-04-06 Thread Grigori Goronzy
With the previous changes to handling of viewport clipping, it is almost trivial to add proper support for guard band clipping. Select a suitable integer clipping value to keep inside the rasterizer's guard band range of [-32768, 32767] and program the hardware to use guard band clipping. Guard

[Mesa-dev] [PATCH 2/2] radeonsi: use guard band clipping

2016-04-06 Thread Grigori Goronzy
With the previous changes to handling of viewport clipping, it is almost trivial to add proper support for guard band clipping. Select a suitable integer clipping value to keep inside the rasterizer's guard band range of [-32768, 32767] and program the hardware to use guard band clipping. Guard

[Mesa-dev] [PATCH 1/2] radeonsi: do per-pixel clipping based on viewport states

2016-04-06 Thread Grigori Goronzy
From: Marek Olšák In other words, vport scissors are derived from viewport states. If the scissor test is enabled, the intersection of both is used. The guard band will disable clipping, so we have to clip per-pixel. v2: fix check for r600_draw_rectangle and other overflow

Re: [Mesa-dev] [PATCH 2/2] radeonsi: use re-Z

2016-02-24 Thread Grigori Goronzy
On 2016-02-23 17:45, Marek Olšák wrote: From: Marek Olšák This can increase perf for shaders that kill pixels (kill, alpha-test, alpha-to-coverage). --- src/gallium/drivers/radeonsi/si_shader.h| 1 + src/gallium/drivers/radeonsi/si_state.c | 6 +++---

Re: [Mesa-dev] [PATCH 2/2] radeonsi: use re-Z

2016-02-24 Thread Grigori Goronzy
On 2016-02-24 12:47, Marek Olšák wrote: On Wed, Feb 24, 2016 at 12:22 PM, Grigori Goronzy <g...@chown.ath.cx> wrote: S_00B32C_SCRATCH_EN(shader->config.scratch_bytes_per_wave > 0)); + + /* Prefer RE_Z if the shader is complex enough. */ + if (info->num_memory_in

Re: [Mesa-dev] [PATCH 2/2] radeon/uvd: fix VC-1 simple/main profile decode

2015-09-23 Thread Grigori Goronzy
Hi, On 23.09.2015 10:11, Christian König wrote: > From: Boyuan Zhang > > Signed-off-by: Boyuan Zhang > Reviewed-by: Christian König > --- Thanks, nice to see this finally getting fixed, and it was a pretty simple thing

Re: [Mesa-dev] [PATCH 1/2] clover: fix event handling of buffer operations

2015-06-25 Thread Grigori Goronzy
On 2015-06-09 22:52, Francisco Jerez wrote: + + if (blocking) + hev().wait(); + hard_event::wait() may fail, so this should probably be done before the ret_object() call to avoid leaks. Alright... C++ exceptions are a minefield. :) Is there any reason you didn't make the same change

Re: [Mesa-dev] [PATCH 2/2] clover: implement CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE

2015-06-25 Thread Grigori Goronzy
On 2015-05-28 13:04, Grigori Goronzy wrote: Work-group size should always be aligned to subgroup size; this is a basic requirement, otherwise some work-items will be no-operation. It might make sense to refine the value according to a kernel's resource usage, but that's a possible optimization

Re: [Mesa-dev] [PATCH 1/2] gallium: add PIPE_COMPUTE_CAP_SUBGROUP_SIZE

2015-06-04 Thread Grigori Goronzy
On 28.05.2015 13:04, Grigori Goronzy wrote: We need this to implement OpenCL's CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE. --- Ping? src/gallium/docs/source/screen.rst | 2 ++ src/gallium/drivers/ilo/ilo_screen.c | 8 src/gallium/drivers/nouveau/nvc0

Re: [Mesa-dev] [PATCH 1/2] clover: fix event handling of buffer operations

2015-06-04 Thread Grigori Goronzy
On 28.05.2015 10:10, Grigori Goronzy wrote: Wrap MapBuffer and MapImage as hard_event actions, like other operations. This enables correct profiling. Also make sure to wait for events to finish when blocking is requested by the caller. --- Ping? src/gallium/state_trackers/clover/api

[Mesa-dev] [PATCH 1/2] clover: fix event handling of buffer operations

2015-05-28 Thread Grigori Goronzy
Wrap MapBuffer and MapImage as hard_event actions, like other operations. This enables correct profiling. Also make sure to wait for events to finish when blocking is requested by the caller. --- src/gallium/state_trackers/clover/api/transfer.cpp | 50 -- 1 file changed, 46

[Mesa-dev] [PATCH 2/2] clover: check clEnqueueMap* for map errors

2015-05-28 Thread Grigori Goronzy
Mapping can fail, and this should be handled. Return the proper error code and abort the associated event in this case. --- src/gallium/state_trackers/clover/api/transfer.cpp | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git

[Mesa-dev] [PATCH 1/2] gallium: add PIPE_COMPUTE_CAP_SUBGROUP_SIZE

2015-05-28 Thread Grigori Goronzy
We need this to implement OpenCL's CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE. --- src/gallium/docs/source/screen.rst | 2 ++ src/gallium/drivers/ilo/ilo_screen.c | 8 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 4

[Mesa-dev] [PATCH 2/2] clover: implement CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE

2015-05-28 Thread Grigori Goronzy
Work-group size should always be aligned to subgroup size; this is a basic requirement, otherwise some work-items will be no-operation. It might make sense to refine the value according to a kernel's resource usage, but that's a possible optimization for the future. ---
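The alignment requirement mentioned above amounts to rounding a work-group size up to the next multiple of the subgroup size. A minimal sketch, with a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Round work_group_size up to the next multiple of subgroup_size.
 * With a size that is not a multiple, the last subgroup is only
 * partially filled and its remaining lanes do no useful work. */
size_t align_to_subgroup(size_t work_group_size, size_t subgroup_size)
{
    assert(subgroup_size > 0);
    return (work_group_size + subgroup_size - 1) / subgroup_size * subgroup_size;
}
```

An application would typically query CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE through clGetKernelWorkGroupInfo and use the returned value as the multiple.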

Re: [Mesa-dev] [PATCH 2/2] radeonsi: Add CIK SDMA support

2015-05-26 Thread Grigori Goronzy
the same issues as SI? We should really try to figure out what's wrong with tiled DMA copies. Anyway, Reviewed-by: Grigori Goronzy g...@chown.ath.cx Signed-off-by: Michel Dänzer michel.daen...@amd.com --- src/gallium/drivers/radeonsi/Makefile.sources | 1 + src/gallium/drivers/radeonsi

Re: [Mesa-dev] [PATCH 2/2] clover: try userptr for CL_MEM_USE_HOST_PTR

2015-05-23 Thread Grigori Goronzy
On 23.05.2015 15:53, Francisco Jerez wrote: diff --git a/src/gallium/state_trackers/clover/core/resource.cpp b/src/gallium/state_trackers/clover/core/resource.cpp index 8ed4c42..8e51b3c 100644 --- a/src/gallium/state_trackers/clover/core/resource.cpp +++

[Mesa-dev] [PATCH 1/2] clover: implement CL_MEM_ALLOC_HOST_PTR

2015-05-19 Thread Grigori Goronzy
This flag is typically used to request pinned host memory, to avoid any copies between GPU and CPU. This improves throughput with an older OpenCL app which I unfortunately can't publish due to its licensing. --- src/gallium/state_trackers/clover/core/resource.cpp | 4 1 file changed, 4

[Mesa-dev] [PATCH 2/2] clover: try userptr for CL_MEM_USE_HOST_PTR

2015-05-19 Thread Grigori Goronzy
According to spec, CL_MEM_USE_HOST_PTR should directly use host memory, if possible. This is just what userptr is for, so use it. In case the memory cannot be mapped, a fallback similar to CL_MEM_COPY_HOST_PTR is used. --- src/gallium/state_trackers/clover/core/memory.cpp | 2 +-

Re: [Mesa-dev] [PATCH] Revert radeon/llvm: enable unsafe math for graphics shaders

2015-02-18 Thread Grigori Goronzy
Hi, AFAIR not enabling this makes LLVM generate really slow code in some common cases. Maybe this is just a bug in LLVM/R600 triggered by unsafe FP math optimization or some optimization is too eager. Other drivers do fine with these types of optimization. What's the impact on performance with

Re: [Mesa-dev] [PATCH] Revert radeon/llvm: enable unsafe math for graphics shaders

2015-02-18 Thread Grigori Goronzy
Am 2015-02-18 09:13, schrieb Michel Dänzer: On 18.02.2015 16:52, Grigori Goronzy wrote: Hi, AFAIR not enabling this makes LLVM generate really slow code in some common cases. Maybe this is just a bug in LLVM/R600 triggered by unsafe FP math optimization or some optimization is too eager

Re: [Mesa-dev] [PATCH] radeonsi: Disable asynchronous DMA except for PIPE_BUFFER

2014-11-14 Thread Grigori Goronzy
Reviewed-by: Grigori Goronzy g...@chown.ath.cx I've been using a similar patch to fix stability issues on my machine for quite a while. Still, it's a pity we have to go that far to get everything stable again. On 13.11.2014 07:52, Michel Dänzer wrote: From: Michel Dänzer michel.daen...@amd.com

Re: [Mesa-dev] [PATCH 3/4] radeonsi: Catch more cases that can't be handled by si_dma_copy_buffer/tile

2014-10-01 Thread Grigori Goronzy
On 30.09.2014 05:58, Michel Dänzer wrote: diff --git a/src/gallium/drivers/radeonsi/si_dma.c b/src/gallium/drivers/radeonsi/si_dma.c index ff64722..643ce3f 100644 --- a/src/gallium/drivers/radeonsi/si_dma.c +++ b/src/gallium/drivers/radeonsi/si_dma.c @@ -251,7 +251,9 @@ void

Re: [Mesa-dev] [PATCH] radeonsi: Simplify si_dma_copy_tile function

2014-09-10 Thread Grigori Goronzy
LGTM, but I have some comments below. Grigori On 10.09.2014 10:54, Michel Dänzer wrote: From: Michel Dänzer michel.daen...@amd.com Signed-off-by: Michel Dänzer michel.daen...@amd.com --- This might help for investigating DMA related bugs. src/gallium/drivers/radeonsi/si_dma.c | 103

Re: [Mesa-dev] [PATCH] r600g, radeonsi: add debug option which forces DMA for copy_region and blit

2014-09-08 Thread Grigori Goronzy
On 08.09.2014 14:50, Axel Davy wrote: Hi, When reading si_dma.c code, it looks like the requested width of the copy is ignored except for PIPE_BUFFER. Perhaps that explains the bugs observed ? It isn't ignored. Partial DMA copies (i.e. operations that do not copy whole lines) are simply

Re: [Mesa-dev] [PATCH] r600g, radeonsi: add debug option which forces DMA for copy_region and blit

2014-09-08 Thread Grigori Goronzy
On 08.09.2014 21:07, Axel Davy wrote: On 08/09/2014 20:21, Grigori Goronzy wrote : On 08.09.2014 14:50, Axel Davy wrote: Hi, When reading si_dma.c code, it looks like the requested width of the copy is ignored except for PIPE_BUFFER. Perhaps that explains the bugs observed ? It isn't

Re: [Mesa-dev] [PATCH 2/3] radeonsi: add sampling of 4:2:2 subsampled textures

2014-08-29 Thread Grigori Goronzy
On 29.08.2014 10:19, Christian König wrote: That sounds like something doesn't work correctly. The resources are created with the subsampled formats R8G8_R8B8 or G8R8_B8R8, but since this can't be accessed by the CB we need to use R8G8B8A8 as surface format for writing to them. If that

Re: [Mesa-dev] [PATCH 2/3] radeonsi: add sampling of 4:2:2 subsampled textures

2014-08-29 Thread Grigori Goronzy
On 29.08.2014 12:31, Andy Furniss wrote: As for that 4:2:2 doesn't work, AFAICT it absolutely does, but there is no linear interpolation for chroma, so quality isn't ideal. This seems to be a hardware restriction, unfortunately. Hmm, we may have to disagree on the definition of working here

Re: [Mesa-dev] [PATCH 2/3] radeonsi: add sampling of 4:2:2 subsampled textures

2014-08-28 Thread Grigori Goronzy
On 04.07.2014 01:24, Andy Furniss wrote: Maybe not 1/frame but anyway the first couple of a run have numbers rather than s [27977.386795] radeon :01:00.0: GPU fault detected: 146 0x0c035014 [27977.386800] radeon :01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x15E0

[Mesa-dev] [PATCH] radeonsi: implement BPTC texture support

2014-08-12 Thread Grigori Goronzy
Passes all piglit tests. v2: rebased --- src/gallium/drivers/radeonsi/si_state.c | 20 1 file changed, 20 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index 6e9a60a..4f7adea 100644 ---

[Mesa-dev] [PATCH] radeonsi: implement BPTC texture support

2014-07-23 Thread Grigori Goronzy
Passes corrected piglit test and should also handle signed vs unsigned float correctly. --- src/gallium/drivers/radeonsi/si_state.c | 20 1 file changed, 20 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index

Re: [Mesa-dev] [PATCH 1/2] radeon/llvm: enable unsafe math for graphics shaders

2014-07-21 Thread Grigori Goronzy
On 17.07.2014 21:24, Tom Stellard wrote: On Thu, Jul 17, 2014 at 06:44:25PM +0200, Grigori Goronzy wrote: Accuracy of some operations was recently improved in the R600 backend, at the cost of slower code. This is required for compute shaders, but not for graphics shaders. Add unsafe-fp-math

Re: [Mesa-dev] [PATCH 4/5] r600g, radeonsi: Use write-combined persistent GTT mappings

2014-07-18 Thread Grigori Goronzy
On 18.07.2014 13:45, Marek Olšák wrote: If the requirements of GL_MAP_COHERENT_BIT are satisfied, then the patch is okay. Apart from correctness, I still wonder how this will affect performance, most notably CPU reads. This change unconditionally uses write-combined, uncached memory for

Re: [Mesa-dev] [PATCH 4/5] r600g, radeonsi: Use write-combined persistent GTT mappings

2014-07-17 Thread Grigori Goronzy
On 17.07.2014 12:01, Michel Dänzer wrote: From: Michel Dänzer michel.daen...@amd.com This is hopefully safe: The kernel makes sure writes to these mappings finish before the GPU might start reading from them, and the GPU caches are invalidated at the start of a command stream. Aren't CPU

[Mesa-dev] [PATCH 1/2] radeon/llvm: enable unsafe math for graphics shaders

2014-07-17 Thread Grigori Goronzy
Accuracy of some operations was recently improved in the R600 backend, at the cost of slower code. This is required for compute shaders, but not for graphics shaders. Add unsafe-fp-math hint to make LLVM generate faster but possibly less accurate code. Piglit didn't indicate any regressions. ---

[Mesa-dev] [PATCH 2/2] radeon/llvm: fix formatting

2014-07-17 Thread Grigori Goronzy
Use K&R and the same indent as most other code. No functional change intended. --- src/gallium/drivers/radeon/radeon_llvm_emit.c | 24 ++-- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c

Re: [Mesa-dev] [PATCH 2/3] radeonsi: add sampling of 4:2:2 subsampled textures

2014-07-02 Thread Grigori Goronzy
On 02.07.2014 22:18, Andy Furniss wrote: Before I knew how to get field sync to use my TV's deinterlacer I had to modify Mesa so that I could use the VDPAU de-interlacer(s), when I did this I noticed that 422 didn't work and looked the same as it does now this has gone in with my si. Are

Re: [Mesa-dev] [PATCH 2/3] radeonsi: add sampling of 4:2:2 subsampled textures

2014-06-18 Thread Grigori Goronzy
Olšák marek.ol...@amd.com Marek On Wed, Jun 4, 2014 at 6:54 PM, Grigori Goronzy g...@chown.ath.cx wrote: This makes 4:2:2 video surfaces work in VDPAU. --- src/gallium/drivers/radeon/r600_texture.c | 5 +- src/gallium/drivers/radeonsi/si_blit.c| 91

Re: [Mesa-dev] [PATCH 2/3] radeonsi: add sampling of 4:2:2 subsampled textures

2014-06-17 Thread Grigori Goronzy
Ping? I'm not sure if this is completely correct, but this code path is only exercised by VDPAU and it seems to work fine on SI. Grigori On 04.06.2014 18:54, Grigori Goronzy wrote: This makes 4:2:2 video surfaces work in VDPAU. --- src/gallium/drivers/radeon/r600_texture.c | 5 +- src

[Mesa-dev] [PATCH 1/3] util/u_format: move utility function from r600g

2014-06-04 Thread Grigori Goronzy
We need this for radeonsi, and it might be useful for other drivers, too. --- src/gallium/auxiliary/util/u_format.c | 11 +++ src/gallium/auxiliary/util/u_format.h | 3 +++ src/gallium/drivers/r600/r600_blit.c | 12 +--- 3 files changed, 15 insertions(+), 11 deletions(-) diff

[Mesa-dev] [PATCH 2/3] radeonsi: add sampling of 4:2:2 subsampled textures

2014-06-04 Thread Grigori Goronzy
This makes 4:2:2 video surfaces work in VDPAU. --- src/gallium/drivers/radeon/r600_texture.c | 5 +- src/gallium/drivers/radeonsi/si_blit.c| 91 ++- src/gallium/drivers/radeonsi/si_state.c | 15 + 3 files changed, 71 insertions(+), 40 deletions(-) diff

[Mesa-dev] [PATCH 3/3] radeon/uvd: disable VC-1 simple/main on UVD 2.x

2014-06-04 Thread Grigori Goronzy
It's about as broken as on later UVD revisions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66452 Cc: 10.1 10.2 mesa-sta...@lists.freedesktop.org --- src/gallium/drivers/radeon/radeon_video.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git

Re: [Mesa-dev] The way r600g handles shaders that use more than available GPRs

2014-04-20 Thread Grigori Goronzy
On 20.04.2014 03:02, Marek Olšák wrote: It looks like the check is not needed with SB, because SB performs register allocation. What happens if you comment out the conditional which fails? SB takes the machine code generated by the classic compiler as input, so the check is still needed. The

Re: [Mesa-dev] [RFC] r600g/radeonsi: Use caching buffer manager for textures as well

2014-04-10 Thread Grigori Goronzy
On 10.04.2014 11:23, Michel Dänzer wrote: From: Michel Dänzer michel.daen...@amd.com --- This is just an RFC; if other developers approve of this approach, I can make a more extensive patch removing the use_reusable_pool parameters. The x11perf numbers below compare ShmGet/PutImage before and

[Mesa-dev] [PATCH 1/2] st/vdpau: fix possible NULL dereference

2014-03-02 Thread Grigori Goronzy
--- src/gallium/state_trackers/vdpau/mixer.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/gallium/state_trackers/vdpau/mixer.c b/src/gallium/state_trackers/vdpau/mixer.c index 996fd8e..e6bfb8c 100644 --- a/src/gallium/state_trackers/vdpau/mixer.c +++

[Mesa-dev] [PATCH 2/2] NV_vdpau_interop: fix IsSurfaceNV return type

2014-03-02 Thread Grigori Goronzy
The spec incorrectly used void as return type, when it should have been GLboolean. This has now been fixed. According to Nvidia, their implementation always used GLboolean. --- include/GL/glext.h | 2 +- src/mapi/glapi/gen/NV_vdpau_interop.xml | 1 + src/mesa/main/vdpau.c
