On 2017-08-03 22:26, Alex Deucher wrote:
IIRC, user_ptrs require page alignment.
Alex
I didn't follow the whole discussion (sorry if I'm saying something
redundant), but AMD's older OpenCL Optimization Guide [1] has some notes
regarding the implementation of the USE_HOST_PTR flag.
It
Hi,
there also is a patch needed to make this work for Xorg on the
xorg-devel list as well as preliminary piglit test to verify the
functionality on the piglit list.
Grigori
On 2017-08-03 20:07, Grigori Goronzy wrote:
---
src/glx/dri2_glx.c | 12
src/glx/dri3_glx.c
---
src/gallium/state_trackers/glx/xlib/glx_api.c | 55 ---
src/gallium/state_trackers/glx/xlib/xm_api.c | 6 ++-
src/gallium/state_trackers/glx/xlib/xm_api.h | 4 +-
3 files changed, 57 insertions(+), 8 deletions(-)
diff --git
---
src/glx/dri2_glx.c | 12
src/glx/dri3_glx.c | 8
src/glx/dri_common.c| 52 -
src/glx/dri_common.h| 5 +
src/glx/drisw_glx.c | 3 +++
src/glx/glxclient.h | 6 ++
src/glx/glxextensions.c
On 2017-07-19 23:51, Grigori Goronzy wrote:
The check is too aggressive and might also fail if context flags
appear after the no-error attribute in the context attribute list.
Delay the check to after attribute parsing to fix this.
---
This was found by the piglit test I just sent to the piglit
On 2017-07-18 20:25, Ian Romanick wrote:
On 07/14/2017 04:10 PM, Kenneth Graunke wrote:
Grigori recently added EGL_KHR_create_context_no_error support,
which causes EGL to pass a new __DRI_CTX_FLAG_NO_ERROR flag to
drivers when requesting an appropriate context mode.
driContextSetFlags() will
The check is too aggressive and might also fail if context flags
appear after the no-error attribute in the context attribute list.
Delay the check to after attribute parsing to fix this.
---
This was found by the piglit test I just sent to the piglit ML. I promise,
next time I'll write tests
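A minimal sketch of the "parse first, validate afterwards" idea described above. The attribute names, flag bits and the specific conflict (no-error combined with debug or robustness flags) are assumptions for illustration and are not taken from the patch itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute names and values, for illustration only. */
enum {
   ATTR_NONE = 0,
   ATTR_CONTEXT_FLAGS = 1,
   ATTR_NO_ERROR = 2,
};

#define FLAG_DEBUG  0x1u
#define FLAG_ROBUST 0x2u

struct ctx_request {
   unsigned flags;
   bool no_error;
};

/* Returns true if the attribute list is acceptable. */
static bool
parse_context_attribs(const int *attribs, struct ctx_request *req)
{
   req->flags = 0;
   req->no_error = false;

   /* Step 1: only record what was requested, no cross-attribute checks. */
   for (size_t i = 0; attribs && attribs[i] != ATTR_NONE; i += 2) {
      switch (attribs[i]) {
      case ATTR_CONTEXT_FLAGS:
         req->flags = (unsigned)attribs[i + 1];
         break;
      case ATTR_NO_ERROR:
         req->no_error = attribs[i + 1] != 0;
         break;
      default:
         return false; /* unknown attribute */
      }
   }

   /* Step 2: validate combinations once every attribute has been seen,
    * so the result does not depend on attribute order. */
   if (req->no_error && (req->flags & (FLAG_DEBUG | FLAG_ROBUST)))
      return false;

   return true;
}
```

With the check placed after the loop, a list that names the flags after the no-error attribute is rejected in the same way as one that names them before it.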
On 2017-07-18 20:25, Ian Romanick wrote:
On 07/14/2017 04:10 PM, Kenneth Graunke wrote:
Grigori recently added EGL_KHR_create_context_no_error support,
which causes EGL to pass a new __DRI_CTX_FLAG_NO_ERROR flag to
drivers when requesting an appropriate context mode.
driContextSetFlags() will
On 2017-07-17 19:21, Emil Velikov wrote:
On 13 July 2017 at 12:09, Grigori Goronzy <g...@chown.ath.cx> wrote:
On 2017-07-12 15:15, Emil Velikov wrote:
As mentioned in earlier commit no_error should be device agnostic.
Hence removing the st/dri bits and adding a DRI_CONF_MESA_NO_ERROR(
, but the
classic drivers all have code to explicitly balk at unknown flags. We
need to let it through or they'll fail to create a no_error context.
I can't test it, but LGTM, so:
Reviewed-by: Grigori Goronzy <g...@chown.ath.cx>
---
src/mesa/drivers/dri/i915/intel_screen.c | 2 +-
src/mesa/d
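The "balk at unknown flags" behaviour mentioned above can be pictured as an allow-list mask test. This is only a sketch: the flag values and the function name are hypothetical, and the real definitions live in dri_interface.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define CTX_FLAG_DEBUG     (1u << 0)
#define CTX_FLAG_NO_ERROR  (1u << 1)

/* Returns true if every requested flag is one the driver knows about. */
static bool
context_flags_supported(uint32_t flags, bool allow_no_error)
{
   uint32_t allowed = CTX_FLAG_DEBUG;

   /* Letting the no-error bit through is the change under discussion. */
   if (allow_no_error)
      allowed |= CTX_FLAG_NO_ERROR;

   return (flags & ~allowed) == 0;
}
```

A driver that does not add the new bit to its allowed mask would refuse context creation as soon as the flag is passed down.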
On 2017-07-14 23:30, Kenneth Graunke wrote:
This accidentally set __DRI_CTX_FLAG_NO_ERROR whenever any flags were
present. It just needs extra parentheses.
Fixes: 4909519a6655 (egl: Add EGL_KHR_create_context_no_error support)
Reviewed-by: Grigori Goronzy <g...@chown.ath.cx>
Sorry for br
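One common way this kind of bug arises in C is operator precedence: `|` binds tighter than `?:`. The sketch below assumes that is what happened here. The function names and flag values are hypothetical, and the buggy variant is written with explicit parentheses to show how the unparenthesized expression is parsed.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLAG_DEBUG    0x1u
#define FLAG_NO_ERROR 0x8u  /* hypothetical value */

/* How "flags | no_error ? FLAG_NO_ERROR : 0" is actually parsed:
 * the OR happens first, then the result is used as the condition. */
static uint32_t
combine_flags_buggy(uint32_t flags, bool no_error)
{
   return (flags | (uint32_t)no_error) ? FLAG_NO_ERROR : 0;
}

/* Intended meaning: keep the existing flags and OR in the extra bit. */
static uint32_t
combine_flags_fixed(uint32_t flags, bool no_error)
{
   return flags | (no_error ? FLAG_NO_ERROR : 0);
}
```

With any non-zero `flags`, the buggy form yields only `FLAG_NO_ERROR` even when `no_error` is false, which matches the symptom described.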
This was broken by commit 1ad24faa.
---
src/mesa/main/marshal.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/mesa/main/marshal.h b/src/mesa/main/marshal.h
index f2dc842..63e0295 100644
--- a/src/mesa/main/marshal.h
+++ b/src/mesa/main/marshal.h
@@ -257,7 +257,7 @@
This basic extension allows usage of the __DRI_CTX_FLAG_NO_ERROR flag.
This includes support code for classic Mesa drivers to switch on the
no-error mode if the flag is set.
v2: Move to common DRI code.
---
include/GL/internal/dri_interface.h | 19 +++
This only adds the EGL side, needs to be plumbed into Mesa frontend.
v2: Add check for extension availability.
---
src/egl/drivers/dri2/egl_dri2.c | 20 ++--
src/egl/drivers/dri2/egl_dri2.h | 1 +
src/egl/main/eglapi.c | 1 +
src/egl/main/eglcontext.c | 31
Add a new context flag and plumb it through the various layers of the
context creation code to set up dispatch tables for the no-error mode.
---
src/gallium/include/state_tracker/st_api.h | 1 +
src/gallium/state_trackers/dri/dri_context.c | 3 +++
src/mesa/state_tracker/st_context.c
Allows applications to be whitelisted.
v2: Remove misguided DRI common part.
---
src/gallium/state_trackers/dri/dri_context.c| 3 +++
src/gallium/state_trackers/dri/dri_screen.c | 1 +
src/mesa/drivers/dri/common/xmlpool/t_options.h | 5 +
3 files changed, 9 insertions(+)
diff --git
On 2017-07-12 15:15, Emil Velikov wrote:
As mentioned in earlier commit no_error should be device agnostic.
Hence removing the st/dri bits and adding a DRI_CONF_MESA_NO_ERROR()
line next to DRI_CONF_VBLANK_MODE seems like the better solution.
Hm, driconf overrides are typically set per screen
On 2017-07-12 15:08, Emil Velikov wrote:
On 11 July 2017 at 23:26, Grigori Goronzy <g...@chown.ath.cx> wrote:
Add a new context flag and plumb it through the various layers of the
context creation code to set up dispatch tables for the no-error mode.
---
src/gallium/include/state_t
On 2017-07-12 15:16, Emil Velikov wrote:
On 11 July 2017 at 23:26, Grigori Goronzy <g...@chown.ath.cx> wrote:
Hi,
this series implements support for the EGL_KHR_create_context_no_error
extension and the associated plumbing through the different
layers of Mesa - EGL, DRI, Gallium state t
On 2017-07-12 12:33, Eric Engestrom wrote:
+ case EGL_CONTEXT_OPENGL_NO_ERROR_KHR:
+ if (dpy->Version < 14) {
+err = EGL_BAD_ATTRIBUTE;
+break;
+ }
+
+ /* The KHR_no_error spec only applies against OpenGL 2.0+
and
+ * OpenGL ES 2.0+
This basic extension allows usage of the __DRI_CTX_FLAG_NO_ERROR flag.
This includes support code for classic Mesa drivers to switch on the
no-error mode if the flag is set.
---
include/GL/internal/dri_interface.h | 19 +++
src/gallium/state_trackers/dri/dri2.c|
This only adds the EGL side, needs to be plumbed into Mesa frontend.
---
src/egl/drivers/dri2/egl_dri2.c | 20 ++--
src/egl/drivers/dri2/egl_dri2.h | 1 +
src/egl/main/eglapi.c | 1 +
src/egl/main/eglcontext.c | 30 ++
Hi,
this series implements support for the EGL_KHR_create_context_no_error
extension and the associated plumbing through the different
layers of Mesa - EGL, DRI, Gallium state tracker, Mesa frontend. It
took me a while to figure out how everything is connected together
and still it's somewhat
Add a new context flag and plumb it through the various layers of the
context creation code to set up dispatch tables for the no-error mode.
---
src/gallium/include/state_tracker/st_api.h | 1 +
src/gallium/state_trackers/dri/dri_context.c | 3 +++
src/mesa/state_tracker/st_context.c
Allows applications to be whitelisted.
---
src/gallium/state_trackers/dri/dri_context.c| 3 +++
src/gallium/state_trackers/dri/dri_screen.c | 1 +
src/mesa/drivers/dri/common/dri_util.c | 3 +++
src/mesa/drivers/dri/common/xmlpool/t_options.h | 5 +
4 files changed, 12
The semantics are similar to glBufferData. Fixes a crash with VMware
Player.
Signed-off-by: Grigori Goronzy <g...@chown.ath.cx>
---
src/mesa/main/marshal.c | 17 +
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/src/mesa/main/marshal.c b/src/mesa/main/mar
On 2017-06-26 15:51, Marc Dietrich wrote:
On Monday, 26 June 2017, 15:35:15 CEST, Grigori Goronzy wrote:
On 2017-06-26 15:11, Marc Dietrich wrote:
> unfortunately, this change broke vmware/vmplayer here (bisected).
> Windows
> guest on linux host. Sig 11 in SVGA driver.
On 2017-07-09 18:52, Matt Turner wrote:
+static inline size_t buffer_to_size(GLenum buffer)
+{
+ switch (buffer) {
+ case GL_COLOR:
+ return 4;
+ case GL_DEPTH_STENCIL:
+ return 2;
+ case GL_STENCIL:
+ case GL_DEPTH:
+ return 1;
+ default:
+ return 0;
+ }
+}
+
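A sketch of how a helper like buffer_to_size() could be used to size the value array copied into an asynchronous command. The payload helper is hypothetical and assumes 4-byte elements (GLint, GLuint and GLfloat are all 32 bits wide). The enum values are the ones from the OpenGL headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Enum values as defined in the OpenGL headers. */
#define GL_COLOR          0x1800
#define GL_DEPTH          0x1801
#define GL_STENCIL        0x1802
#define GL_DEPTH_STENCIL  0x84F9

/* Number of elements in the clear value for each buffer type. */
static inline size_t buffer_to_size(unsigned buffer)
{
   switch (buffer) {
   case GL_COLOR:
      return 4;
   case GL_DEPTH_STENCIL:
      return 2;
   case GL_STENCIL:
   case GL_DEPTH:
      return 1;
   default:
      return 0;
   }
}

/* Hypothetical helper: bytes of clear-value data to copy into a command.
 * Returns 0 for an unknown buffer, which a caller could treat as
 * "fall back to the synchronous path". */
static size_t
clear_buffer_payload_bytes(unsigned buffer)
{
   return buffer_to_size(buffer) * sizeof(int32_t);
}
```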
Add async marshalling/unmarshalling for all glClearBuffer variants.
These entry points are commonly used in general and Alien Isolation
specifically uses glClearBufferiv. Slightly reduces the number of
thread synchronizations with glthread in that game.
---
src/mapi/glapi/gen/GL3x.xml | 6 +-
Extract clear buffer helper functions in preparation for adding
marshal/unmarshal functions for the various glClearBuffer variants.
---
src/mesa/main/marshal.c | 74 +++--
src/mesa/main/marshal.h | 5 ++--
2 files changed, 50 insertions(+), 29
the switch/case block into an
efficient jump table with the ID method, so an array for function lookup
instead of that doesn't improve anything.
I didn't see any measurable benefit of the function pointer method
either.
Best regards
Grigori
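For readers unfamiliar with the two dispatch styles being compared, here is a small sketch: a switch over a command ID (which compilers typically lower to a jump table) next to an array of function pointers indexed by the same ID. The command record and operations are hypothetical stand-ins, not glthread code.

```c
#include <stddef.h>

/* Hypothetical command record and operations, for illustration only. */
struct cmd { int id; int arg; };

enum { CMD_ADD, CMD_MUL, CMD_COUNT };

static int op_add(int acc, int arg) { return acc + arg; }
static int op_mul(int acc, int arg) { return acc * arg; }

/* ID method: a dense switch, usually compiled to a jump table. */
static int
dispatch_switch(int acc, const struct cmd *c)
{
   switch (c->id) {
   case CMD_ADD: return op_add(acc, c->arg);
   case CMD_MUL: return op_mul(acc, c->arg);
   default:      return acc;
   }
}

/* Lookup-array method: index a table of function pointers by ID. */
typedef int (*op_fn)(int, int);

static const op_fn op_table[CMD_COUNT] = {
   [CMD_ADD] = op_add,
   [CMD_MUL] = op_mul,
};

static int
dispatch_table(int acc, const struct cmd *c)
{
   if (c->id < 0 || c->id >= CMD_COUNT)
      return acc;
   return op_table[c->id](acc, c->arg);
}
```

Both variants perform one indexed indirect jump per command, which is consistent with the observation that neither shows a measurable advantage.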
On Fri, Jun 30, 2017 at 7:14 PM, Grigori Goronzy
On 2017-06-30 15:27, Nicolai Hähnle wrote:
On 30.06.2017 02:29, Grigori Goronzy wrote:
Use function pointers to identify the unmarshalling function, which
is simpler and gets rid of a lot of generated code.
This removes an indirection and possibly results in a slight speedup
as well.
The fact
Use function pointers to identify the unmarshalling function, which
is simpler and gets rid of a lot of generated code.
This removes an indirection and possibly results in a slight speedup
as well.
---
src/mapi/glapi/gen/Makefile.am | 4 --
src/mapi/glapi/gen/gl_marshal.py | 36
don't really get it, by the way. Isn't the SVGA driver for Linux
guests?
Best regards
Grigori
> Best regards
> Grigori
>
>> [1]
>> https://lists.freedesktop.org/archives/mesa-dev/2017-June/160329.html
>>
>> On 25/06/17 02:59, Grigori Goronzy wrote:
>>
On 2017-06-22 17:10, Marek Olšák wrote:
From: Marek Olšák
+2.3% better score on Fiji. It might be better without HBM.
Is this really useful? Superposition is a benchmark. It would make more
sense if this also targeted some actual games.
Optimizations specific to only
surprise me if it is in the
40-50% region with both, though.
Best regards
Grigori
[1]
https://lists.freedesktop.org/archives/mesa-dev/2017-June/160329.html
On 25/06/17 02:59, Grigori Goronzy wrote:
These entry points are used by Alien Isolation and caused
synchronization with glthread
These entry points are used by Alien Isolation and caused
synchronization with glthread. The async marshalling implementation
is similar to glBuffer(Sub)Data.
Results in an approximately 6x drop in glthread synchronizations and a
~30% FPS jump in Alien Isolation (Medium preset, Athlon 860K, RX
On 2017-06-23 13:48, Andy Furniss wrote:
Marek Olšák wrote:
From: Marek Olšák
The kernel sort of does the same thing with fences.
v2: do emit partial flushes on SI
Bugzilla seems to be down currently so replying here.
On R9 285 with current agd5f 4.13-wip kernel I get
e a better compromise, particularly for systems with a
slow CPU.
Apart from that, consider the series
Reviewed-by: Grigori Goronzy <g...@chown.ath.cx>
Best regards
Grigori
On Thursday, 2 March 2017, 03:20:05 CET, Matt Turner wrote:
On Wed, Mar 1, 2017 at 2:19 PM, Timothy Arceri
<ta
On 2016-10-04 12:32, Emil Velikov wrote:
On 2 October 2016 at 14:17, Axel Davy wrote:
I'd prefer myself Oct 14, because we have a lot of patches for nine,
and
they deserve more cleaning and testing, but if it's Oct 7, we'll try
be on
time.
14th it is. As mentioned before:
---
src/amd/vulkan/radv_descriptor_set.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/amd/vulkan/radv_descriptor_set.c
b/src/amd/vulkan/radv_descriptor_set.c
index d1d2b1f..ba8a002 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -113,6
---
src/amd/vulkan/radv_pipeline_cache.c | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/src/amd/vulkan/radv_pipeline_cache.c
b/src/amd/vulkan/radv_pipeline_cache.c
index 032a7e4..85a2b6d 100644
--- a/src/amd/vulkan/radv_pipeline_cache.c
+++
This gets rid of "may be used uninitialized" compiler warnings.
---
src/amd/vulkan/radv_formats.c | 2 +-
src/amd/vulkan/radv_pipeline.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c
index 90c140c..76d5fa1
On 2016-06-28 11:25, Nayan Deshmukh wrote:
This is a shader-based bicubic interpolator which uses the cubic
Hermite spline algorithm.
v2: set dst_area and dst_clip during scaling (Christian)
v3: clear the render target before rendering
v4: initialize offsets while initializing shaders
use a
On 2016-05-27 15:16, Emil Velikov wrote:
The odd thing is that VLC uses/used to? check that information before
feeding the video to the decoder, while other implementations (like
the original one in mplayer done by the Nvidia devs) do/did? not
bother.
Many files either have an incorrect
any calls into the kernel, right? The
winsys code makes that conditional and calls into the kernel when no
fence pointer is available.
Grigori
On 19.04.2016 18:13, Grigori Goronzy wrote:
Small IBs help to reduce stalls for workloads that require a lot of
synchronization. On the other hand
Add missing break, add default case. Additionally initialize variables
to avoid compiler warnings.
---
src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
Small IBs help to reduce stalls for workloads that require a lot of
synchronization. On the other hand, if there is no notable
synchronization, we can use a large IB size to slightly improve
performance in some cases.
This introduces tuning of the IB size based on feedback on the average
buffer
Interesting, and thanks for poking at this issue. I've been thinking
about tuning IB sizes as well. I'd like for us to get this right, so I
wonder: What's your theory for _why_ your change helps?
See below. I think you discovered it yourself.
I'll be honest with you: Right now, I think your
On 2016-04-15 20:30, Jakob Sinclair wrote:
In other places in radeonsi that require reinterpretation (e.g.
si_blit.c), the surface template is modified instead of changing the
surface after creation. I'm not sure if r600/radeonsi like it if the
format is changed late like here. Seems to be
Hi,
apps that cause a lot of synchronization benefit from small IB
sizes. The current IB size is a bit on the large side for this class
of apps. On the other hand, if there isn't much synchronization going
on, increasing the IB size can slightly improve performance, too.
Here's a quick hack that
Small IBs help to reduce stalls for workloads that require a lot of
synchronization. On the other hand, if there is no notable
synchronization, we can use a large IB size to slightly improve
performance in some cases.
This introduces tuning of the IB size based on feedback on the average
buffer
On 2016-04-15 18:38, Ilia Mirkin wrote:
+ } else {
+ union pipe_color_union color;
+ switch (util_format_get_blocksizebits(res->format)) {
+ case 128:
+ sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
Just as an FYI... this is
issor_enable;
/* The simple case: Only 1 viewport is active. */
- if (mask & 1 &&
- !si_get_vs_info(sctx)->writes_viewport_index) {
+ if (!si_get_vs_info(sctx)->writes_viewport_index) {
+ if (!(mask & 1))
+ return;
+
ssor & viewport code is deleted.
Thanks for implementing this properly.
Reviewed-by: Grigori Goronzy <g...@chown.ath.cx>
Grigori
With the previous changes to handling of viewport clipping, it is
almost trivial to add proper support for guard band clipping. Select a
suitable integer clipping value to keep inside the rasterizer's guard
band range of [-32768, 32767] and program the hardware to use guard
band clipping.
Guard
With the previous changes to handling of viewport clipping, it is
almost trivial to add proper support for guard band clipping. Select a
suitable integer clipping value to keep inside the rasterizer's guard
band range of [-32768, 32767] and program the hardware to use guard
band clipping.
Guard
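A rough sketch of how an integer guard band factor might be chosen for one axis so that the enlarged viewport stays inside [-32768, 32767]. The formula and names are assumptions for illustration, since the patch body is not shown here. `scale` and `translate` are the usual viewport transform terms (half-extent and centre in pixels).

```c
#define GUARD_BAND_MAX 32767.0f

/* Returns how many times the viewport half-extent fits between the
 * viewport centre and the nearer guard band edge, rounded down to an
 * integer and never below 1 (1 means "no extra guard band"). */
static float
guardband_factor(float scale, float translate)
{
   float extent = scale < 0.0f ? -scale : scale;
   if (extent == 0.0f)
      return 1.0f;

   /* Distance to each edge, measured in units of the half-extent. */
   float neg_low = (GUARD_BAND_MAX + translate) / extent;
   float high    = (GUARD_BAND_MAX - translate) / extent;

   float factor = neg_low < high ? neg_low : high;
   if (factor < 1.0f)
      return 1.0f;

   return (float)(int)factor; /* truncate to an integer value */
}
```

For a 1920-pixel-wide viewport (scale 960, translate 960), the nearer edge is (32767 - 960) / 960, or about 33.1 half-extents away, so the sketch picks 33.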
From: Marek Olšák
In other words, vport scissors are derived from viewport states.
If the scissor test is enabled, the intersection of both is used.
The guard band will disable clipping, so we have to clip per-pixel.
v2: fix check for r600_draw_rectangle and other overflow
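The idea of deriving a scissor from the viewport and intersecting it with the user scissor can be sketched as follows. The struct and function names are hypothetical, and rounding is simplified to truncation, where a real driver would need to round outward.

```c
struct rect { int minx, miny, maxx, maxy; };

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

/* Viewport transform: centre = translate, half-extent = |scale|. */
static struct rect
viewport_to_scissor(float scale_x, float scale_y,
                    float trans_x, float trans_y)
{
   float hx = scale_x < 0.0f ? -scale_x : scale_x;
   float hy = scale_y < 0.0f ? -scale_y : scale_y;
   struct rect r = {
      (int)(trans_x - hx), (int)(trans_y - hy),
      (int)(trans_x + hx), (int)(trans_y + hy),
   };
   return r;
}

/* Intersection of two rectangles; collapses to zero size if disjoint. */
static struct rect
rect_intersect(struct rect a, struct rect b)
{
   struct rect r = {
      imax(a.minx, b.minx), imax(a.miny, b.miny),
      imin(a.maxx, b.maxx), imin(a.maxy, b.maxy),
   };
   if (r.maxx < r.minx)
      r.maxx = r.minx;
   if (r.maxy < r.miny)
      r.maxy = r.miny;
   return r;
}
```

When the scissor test is disabled, only the viewport-derived rectangle would be used. When it is enabled, the result of `rect_intersect` takes its place.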
On 2016-02-23 17:45, Marek Olšák wrote:
From: Marek Olšák
This can increase perf for shaders that kill pixels (kill, alpha-test,
alpha-to-coverage).
---
src/gallium/drivers/radeonsi/si_shader.h| 1 +
src/gallium/drivers/radeonsi/si_state.c | 6 +++---
On 2016-02-24 12:47, Marek Olšák wrote:
On Wed, Feb 24, 2016 at 12:22 PM, Grigori Goronzy <g...@chown.ath.cx>
wrote:
S_00B32C_SCRATCH_EN(shader->config.scratch_bytes_per_wave > 0));
+
+ /* Prefer RE_Z if the shader is complex enough. */
+ if (info->num_memory_in
Hi,
On 23.09.2015 10:11, Christian König wrote:
> From: Boyuan Zhang
>
> Signed-off-by: Boyuan Zhang
> Reviewed-by: Christian König
> ---
Thanks, nice to see this finally getting fixed, and it was a pretty
simple thing
On 2015-06-09 22:52, Francisco Jerez wrote:
+
+ if (blocking)
+ hev().wait();
+
hard_event::wait() may fail, so this should probably be done before the
ret_object() call to avoid leaks.
Alright... C++ exceptions are a minefield. :)
Is there any reason you didn't make
the same change
On 2015-05-28 13:04, Grigori Goronzy wrote:
Work-group size should always be aligned to subgroup size; this is a
basic requirement, otherwise some work-items will be no-ops.
It might make sense to refine the value according to a kernel's
resource usage, but that's a possible optimization
On 28.05.2015 13:04, Grigori Goronzy wrote:
We need this to implement OpenCL's
CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE.
---
Ping?
src/gallium/docs/source/screen.rst | 2 ++
src/gallium/drivers/ilo/ilo_screen.c | 8
src/gallium/drivers/nouveau/nvc0
On 28.05.2015 10:10, Grigori Goronzy wrote:
Wrap MapBuffer and MapImage as hard_event actions, like other
operations. This enables correct profiling. Also make sure to wait
for events to finish when blocking is requested by the caller.
---
Ping?
src/gallium/state_trackers/clover/api
Wrap MapBuffer and MapImage as hard_event actions, like other
operations. This enables correct profiling. Also make sure to wait
for events to finish when blocking is requested by the caller.
---
src/gallium/state_trackers/clover/api/transfer.cpp | 50 --
1 file changed, 46
Mapping can fail, and this should be handled. Return the proper error
code and abort the associated event in this case.
---
src/gallium/state_trackers/clover/api/transfer.cpp | 16 ++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git
We need this to implement OpenCL's
CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE.
---
src/gallium/docs/source/screen.rst | 2 ++
src/gallium/drivers/ilo/ilo_screen.c | 8
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 4
Work-group size should always be aligned to subgroup size; this is a
basic requirement, otherwise some work-items will be no-ops.
It might make sense to refine the value according to a kernel's
resource usage, but that's a possible optimization for the future.
---
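To illustrate why alignment to the subgroup size matters, the sketch below rounds a work-group size up to a whole number of subgroups and counts the lanes that would sit idle otherwise. The function names are hypothetical, and 64 is used in the example only as a typical subgroup width.

```c
/* Round a work-group size up to the next multiple of the subgroup size. */
static unsigned
round_up_to_subgroup(unsigned work_group_size, unsigned subgroup_size)
{
   if (subgroup_size == 0)
      return work_group_size;
   return (work_group_size + subgroup_size - 1) / subgroup_size
          * subgroup_size;
}

/* Lanes that are scheduled but do no work for an unaligned size. */
static unsigned
idle_lanes(unsigned work_group_size, unsigned subgroup_size)
{
   return round_up_to_subgroup(work_group_size, subgroup_size)
          - work_group_size;
}
```

A work-group of 100 items on 64-wide subgroups occupies two subgroups (128 lanes), leaving 28 lanes idle. A size of 128 leaves none.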
the same issues as SI? We should really
try to figure out what's wrong with tiled DMA copies.
Anyway,
Reviewed-by: Grigori Goronzy g...@chown.ath.cx
Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
src/gallium/drivers/radeonsi/Makefile.sources | 1 +
src/gallium/drivers/radeonsi
On 23.05.2015 15:53, Francisco Jerez wrote:
diff --git a/src/gallium/state_trackers/clover/core/resource.cpp
b/src/gallium/state_trackers/clover/core/resource.cpp
index 8ed4c42..8e51b3c 100644
--- a/src/gallium/state_trackers/clover/core/resource.cpp
+++
This flag is typically used to request pinned host memory, to avoid
any copies between GPU and CPU.
This improves throughput with an older OpenCL app which I unfortunately
can't publish due to its licensing.
---
src/gallium/state_trackers/clover/core/resource.cpp | 4
1 file changed, 4
According to spec, CL_MEM_USE_HOST_PTR should directly use host memory,
if possible. This is just what userptr is for, so use it.
In case the memory cannot be mapped, a fallback similar to
CL_MEM_COPY_HOST_PTR is used.
---
src/gallium/state_trackers/clover/core/memory.cpp | 2 +-
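A sketch of the decision described above: use the host memory directly when possible, otherwise fall back to a copy. It assumes the only precondition is page alignment of the host pointer. The enum and function are hypothetical, and a real implementation may have further constraints (for example on the size) and must also handle a failed mapping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum host_ptr_path {
   HOST_PTR_DIRECT, /* wrap the application's memory (userptr) */
   HOST_PTR_COPY,   /* allocate a buffer and copy, like COPY_HOST_PTR */
};

/* page_size would normally come from the OS, e.g. sysconf(_SC_PAGESIZE). */
static enum host_ptr_path
choose_host_ptr_path(const void *host_ptr, size_t page_size)
{
   bool aligned = host_ptr != NULL && page_size != 0 &&
                  ((uintptr_t)host_ptr % page_size) == 0;

   return aligned ? HOST_PTR_DIRECT : HOST_PTR_COPY;
}
```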
Hi,
AFAIR not enabling this makes LLVM generate really slow code in some
common cases. Maybe this is just a bug in LLVM/R600 triggered by unsafe
FP math optimization or some optimization is too eager. Other drivers do
fine with these types of optimization.
What's the impact on performance with
On 2015-02-18 09:13, Michel Dänzer wrote:
On 18.02.2015 16:52, Grigori Goronzy wrote:
Hi,
AFAIR not enabling this makes LLVM generate really slow code in some
common cases. Maybe this is just a bug in LLVM/R600 triggered by
unsafe
FP math optimization or some optimization is too eager
Reviewed-by: Grigori Goronzy g...@chown.ath.cx
I've been using a similar patch to fix stability issues on my machine
for quite a while. Still, it's a pity we have to go that far to get
everything stable again.
On 13.11.2014 07:52, Michel Dänzer wrote:
From: Michel Dänzer michel.daen...@amd.com
On 30.09.2014 05:58, Michel Dänzer wrote:
diff --git a/src/gallium/drivers/radeonsi/si_dma.c
b/src/gallium/drivers/radeonsi/si_dma.c
index ff64722..643ce3f 100644
--- a/src/gallium/drivers/radeonsi/si_dma.c
+++ b/src/gallium/drivers/radeonsi/si_dma.c
@@ -251,7 +251,9 @@ void
LGTM, but I have some comments below.
Grigori
On 10.09.2014 10:54, Michel Dänzer wrote:
From: Michel Dänzer michel.daen...@amd.com
Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
This might help for investigating DMA related bugs.
src/gallium/drivers/radeonsi/si_dma.c | 103
On 08.09.2014 14:50, Axel Davy wrote:
Hi,
When reading si_dma.c code, it looks like the requested width of the
copy is ignored except for PIPE_BUFFER.
Perhaps that explains the bugs observed?
It isn't ignored. Partial DMA copies (i.e. operations that do not copy
whole lines) are simply
On 08.09.2014 21:07, Axel Davy wrote:
On 08/09/2014 20:21, Grigori Goronzy wrote:
On 08.09.2014 14:50, Axel Davy wrote:
Hi,
When reading si_dma.c code, it looks like the requested width of the
copy is ignored except for PIPE_BUFFER.
Perhaps that explains the bugs observed?
It isn't
On 29.08.2014 10:19, Christian König wrote:
That sounds like something doesn't work correctly.
The resources are created with the subsampled formats R8G8_R8B8 or
G8R8_B8R8, but since this can't be accessed by the CB we need to use
R8G8B8A8 as surface format for writing to them.
If that
On 29.08.2014 12:31, Andy Furniss wrote:
As for that 4:2:2 doesn't work, AFAICT it absolutely does, but
there is no linear interpolation for chroma, so quality isn't ideal.
This seems to be a hardware restriction, unfortunately.
Hmm, we may have to disagree on the definition of working here
On 04.07.2014 01:24, Andy Furniss wrote:
Maybe not 1/frame but anyway the first couple of a run have numbers
rather than s
[27977.386795] radeon :01:00.0: GPU fault detected: 146 0x0c035014
[27977.386800] radeon :01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR
0x15E0
Passes all piglit tests.
v2: rebased
---
src/gallium/drivers/radeonsi/si_state.c | 20
1 file changed, 20 insertions(+)
diff --git a/src/gallium/drivers/radeonsi/si_state.c
b/src/gallium/drivers/radeonsi/si_state.c
index 6e9a60a..4f7adea 100644
---
Passes corrected piglit test and should also handle signed vs unsigned
float correctly.
---
src/gallium/drivers/radeonsi/si_state.c | 20
1 file changed, 20 insertions(+)
diff --git a/src/gallium/drivers/radeonsi/si_state.c
b/src/gallium/drivers/radeonsi/si_state.c
index
On 17.07.2014 21:24, Tom Stellard wrote:
On Thu, Jul 17, 2014 at 06:44:25PM +0200, Grigori Goronzy wrote:
Accuracy of some operations was recently improved in the R600 backend,
at the cost of slower code. This is required for compute shaders,
but not for graphics shaders. Add unsafe-fp-math
On 18.07.2014 13:45, Marek Olšák wrote:
If the requirements of GL_MAP_COHERENT_BIT are satisfied, then the
patch is okay.
Apart from correctness, I still wonder how this will affect performance,
most notably CPU reads. This change unconditionally uses write-combined,
uncached memory for
On 17.07.2014 12:01, Michel Dänzer wrote:
From: Michel Dänzer michel.daen...@amd.com
This is hopefully safe: The kernel makes sure writes to these mappings
finish before the GPU might start reading from them, and the GPU caches
are invalidated at the start of a command stream.
Aren't CPU
Accuracy of some operations was recently improved in the R600 backend,
at the cost of slower code. This is required for compute shaders,
but not for graphics shaders. Add unsafe-fp-math hint to make LLVM
generate faster but possibly less accurate code.
Piglit didn't indicate any regressions.
---
Use K&R and same indent as most other code. No functional change
intended.
---
src/gallium/drivers/radeon/radeon_llvm_emit.c | 24 ++--
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c
On 02.07.2014 22:18, Andy Furniss wrote:
Before I knew how to get field sync to use my TVs deinterlacer I had to
modify mesa so that I could use the vdpau de-interlacer(s), when I did
this I noticed that 422 didn't work and looked the same as it does now
this has gone in with my si.
Are
Olšák marek.ol...@amd.com
Marek
On Wed, Jun 4, 2014 at 6:54 PM, Grigori Goronzy g...@chown.ath.cx
wrote:
This makes 4:2:2 video surfaces work in VDPAU.
---
src/gallium/drivers/radeon/r600_texture.c | 5 +-
src/gallium/drivers/radeonsi/si_blit.c| 91
Ping? I'm not sure if this is completely correct, but this code path is
only exercised by VDPAU and it seems to work fine on SI.
Grigori
On 04.06.2014 18:54, Grigori Goronzy wrote:
This makes 4:2:2 video surfaces work in VDPAU.
---
src/gallium/drivers/radeon/r600_texture.c | 5 +-
src
We need this for radeonsi, and it might be useful for other drivers,
too.
---
src/gallium/auxiliary/util/u_format.c | 11 +++
src/gallium/auxiliary/util/u_format.h | 3 +++
src/gallium/drivers/r600/r600_blit.c | 12 +---
3 files changed, 15 insertions(+), 11 deletions(-)
diff
This makes 4:2:2 video surfaces work in VDPAU.
---
src/gallium/drivers/radeon/r600_texture.c | 5 +-
src/gallium/drivers/radeonsi/si_blit.c| 91 ++-
src/gallium/drivers/radeonsi/si_state.c | 15 +
3 files changed, 71 insertions(+), 40 deletions(-)
diff
It's about as broken as on later UVD revisions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66452
Cc: 10.1 10.2 mesa-sta...@lists.freedesktop.org
---
src/gallium/drivers/radeon/radeon_video.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git
On 20.04.2014 03:02, Marek Olšák wrote:
It looks like the check is not needed with SB, because SB performs
register allocation. What happens if you comment out the conditional
which fails?
SB takes the machine code generated by the classic compiler as input,
so the check is still needed. The
On 10.04.2014 11:23, Michel Dänzer wrote:
From: Michel Dänzer michel.daen...@amd.com
---
This is just an RFC; if other developers approve of this approach, I can
make a more extensive patch removing the use_reusable_pool parameters.
The x11perf numbers below compare ShmGet/PutImage before and
---
src/gallium/state_trackers/vdpau/mixer.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/gallium/state_trackers/vdpau/mixer.c
b/src/gallium/state_trackers/vdpau/mixer.c
index 996fd8e..e6bfb8c 100644
--- a/src/gallium/state_trackers/vdpau/mixer.c
+++
The spec incorrectly used void as return type, when it should have
been GLboolean. This has now been fixed. According to Nvidia, their
implementation always used GLboolean.
---
include/GL/glext.h | 2 +-
src/mapi/glapi/gen/NV_vdpau_interop.xml | 1 +
src/mesa/main/vdpau.c