[Mesa-dev] [Bug 96953] dri2_wl_swrast crashes on 64 bit, but not on 32 bit

2016-09-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96953 --- Comment #3 from n3rdopolis --- I tried a recent recompile; this still appears to be happening, on 64 bit only. 32 bit is fine. -- You are receiving this mail because: You are the QA Contact for the bug.

[Mesa-dev] [PATCH] glsl: remove remaining tabs in glsl_parser_extras.h

2016-09-26 Thread Timothy Arceri
--- src/compiler/glsl/glsl_parser_extras.h | 60 +- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/src/compiler/glsl/glsl_parser_extras.h b/src/compiler/glsl/glsl_parser_extras.h index f4050e3..b9c9a1a 100644 --- a/src/compiler/glsl/glsl_parser_ext

Re: [Mesa-dev] [llvm] r282237 - [InstCombine] Fix for PR29124: reduce insertelements to shufflevector

2016-09-26 Thread Michel Dänzer
On 26/09/16 10:28 PM, Alexey Bataev wrote: > Michael, fixed this bug in r282401 I can confirm it's fixed, thanks! -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-26 Thread Mark Thompson
On 27/09/16 00:49, Andy Furniss wrote: > Mark Thompson wrote: >> --- >> A simple fix to the problem described here: >> . >> >> With this applied, the driver no longer hangs/crashes when vaSyncSurface() >> is called in pla

[Mesa-dev] [PATCH V2 10/11] genX/cmd_buffer: Enable fast depth clears

2016-09-26 Thread Nanley Chery
From: Nanley Chery Provides an FPS increase of ~30% on the Sascha triangle and multisampling demos. Clears that happen within a render pass via vkCmdClearAttachments are safe even if the clear color changes. This is because the meta implementation does not use LOAD_OP_CLEAR which avoids any conf

[Mesa-dev] [PATCH V2 04/11] anv: Add func anv_image_has_hiz()

2016-09-26 Thread Nanley Chery
From: Chad Versace Signed-off-by: Nanley Chery Reviewed-by: Jason Ekstrand --- v2. Check aspect instead of usage (Chad, Jason) src/intel/vulkan/anv_private.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index

[Mesa-dev] [PATCH V2 07/11] anv/image: Memset hiz surfaces to 0 when binding memory

2016-09-26 Thread Nanley Chery
From: Jason Ekstrand Nanley Chery (amend): - Change memset value from 0xff to 0 (a defined value for HiZ). Signed-off-by: Nanley Chery --- v2. Add asserts (Jason) Handle NULL return value of the mmap src/intel/vulkan/anv_image.c | 31 ++- 1 file changed, 30

[Mesa-dev] [PATCH V2 06/11] anv: Move BindImageMemory to anv_image.c

2016-09-26 Thread Nanley Chery
From: Jason Ekstrand Signed-off-by: Nanley Chery Reviewed-by: Chad Versace Reviewed-by: Jason Ekstrand --- src/intel/vulkan/anv_device.c | 20 src/intel/vulkan/anv_image.c | 20 2 files changed, 20 insertions(+), 20 deletions(-) diff --git a/src/int

[Mesa-dev] [PATCH V2 11/11] anv/TODO: Update the HiZ task

2016-09-26 Thread Nanley Chery
From: Nanley Chery Signed-off-by: Nanley Chery --- v2. Add untested HiZ cases src/intel/vulkan/TODO | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/intel/vulkan/TODO b/src/intel/vulkan/TODO index 8fac370..9ac63eb 100644 --- a/src/intel/vulkan/TODO +++ b/src/intel/vulk

[Mesa-dev] [PATCH V2 02/11] isl: Update isl_surf_get_hiz_surf()

2016-09-26 Thread Nanley Chery
From: Nanley Chery Modify extents and dimensions to match the PRMs more closely. Along with being able to create the correct 3D surface this enables us to avoid working with multisampled compressed textures. Signed-off-by: Nanley Chery Reviewed-by: Chad Versace --- Note: This patch will have

[Mesa-dev] [PATCH V2 09/11] genX/cmd_buffer: Enable rendering to HiZ

2016-09-26 Thread Nanley Chery
From: Chad Versace Nanley Chery: (rebase) - Resolve conflicts with new anv_batch_emit macro (amend) - Handle a QPitch TODO - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems - Only use HiZ for single-subpass renderpasses - Emit the HiZ instruction before the stencil instruction to follow th

[Mesa-dev] [PATCH V2 08/11] anv/cmd_buffer: Add code for performing HZ operations

2016-09-26 Thread Nanley Chery
Create a function that performs one of three HiZ operations - depth/stencil clears, HiZ resolve, and depth resolves. Signed-off-by: Nanley Chery --- v2. Add documentation Fix the alignment check Don't minify clear rectangle (Jason) Use blorp enums (Jason) Enable depth stalls and

[Mesa-dev] [PATCH V2 05/11] anv: Allocate hiz surface

2016-09-26 Thread Nanley Chery
From: Chad Versace Nanley Chery: (rebase) - Use isl_surf_get_hiz_surf() (amend) - Only add a HiZ surface onto a depth/stencil attachment - Add comment above HiZ surface addition - Hide HiZ behind INTEL_VK_HIZ prior to BDW - Disable HiZ for untested cases - Remove DISABLE_AUX_BIT instead of

[Mesa-dev] [PATCH V2 03/11] anv: Add anv_image::hiz_surface

2016-09-26 Thread Nanley Chery
From: Chad Versace Unused. Signed-off-by: Nanley Chery Reviewed-by: Jason Ekstrand --- src/intel/vulkan/anv_private.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index 443c31f..7e08786 100644 --- a/src/intel/vulkan/anv_

[Mesa-dev] [PATCH V2 00/11] anv: Implement HiZ for basic cases

2016-09-26 Thread Nanley Chery
This series is the second revision of the series found here: https://lists.freedesktop.org/archives/mesa-dev/2016-September/127687.html Comments from the first were addressed and the code was rebased onto the upstream master. Cc: Chad Versace Cc: Jason Ekstrand Chad Versace (4): anv: Add anv

[Mesa-dev] [PATCH V2 01/11] isl: Correct a comment in the isl_format enum

2016-09-26 Thread Nanley Chery
From: Nanley Chery HiZ is not a color surface, but an auxiliary depth surface. Signed-off-by: Nanley Chery Reviewed-by: Chad Versace Reviewed-by: Jason Ekstrand --- src/intel/isl/isl.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl

Re: [Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-26 Thread Andy Furniss
Mark Thompson wrote: --- A simple fix to the problem described here: . With this applied, the driver no longer hangs/crashes when vaSyncSurface() is called in places other than for the first time after an encode operat

Re: [Mesa-dev] [PATCH 03/88] glsl: Add initial functions to implement an on-disk cache

2016-09-26 Thread Timothy Arceri
On Mon, 2016-09-26 at 08:29 -0600, Brian Paul wrote: > On 09/23/2016 11:24 PM, Timothy Arceri wrote: > > > > From: Carl Worth > > > > This code provides for an on-disk cache of objects. Objects are > > stored > > and retrieved via names that are arbitrary 20-byte sequences, > > (intended to be S

Re: [Mesa-dev] [PATCH 10/12] genX/cmd_buffer: Enable rendering to HiZ

2016-09-26 Thread Chad Versace
On Mon 26 Sep 2016, Nanley Chery wrote: > On Mon, Sep 19, 2016 at 01:49:09PM -0700, Nanley Chery wrote: > > On Fri, Sep 02, 2016 at 03:16:21PM -0700, Chad Versace wrote: > > > On Wed 31 Aug 2016, Nanley Chery wrote: > > > > From: Chad Versace > > > > > > > > Nanley Chery: > > > > (rebase) > > > >

Re: [Mesa-dev] [PATCH v2 7/7] intel/isl: Add a detailed comment about multisampling with HiZ

2016-09-26 Thread Nanley Chery
On Mon, Sep 12, 2016 at 05:58:24PM -0700, Jason Ekstrand wrote: > Signed-off-by: Jason Ekstrand > Reviewed-by: Chad Versace > --- > src/intel/isl/isl.c | 60 > +++-- > 1 file changed, 58 insertions(+), 2 deletions(-) This patch is Reviewed-by: Na

Re: [Mesa-dev] [PATCH 03/88] glsl: Add initial functions to

2016-09-26 Thread Timothy Arceri
On Mon, 2016-09-26 at 08:42 -0700, Eric Anholt wrote: > Timothy Arceri writes: > > > > > On Sun, 2016-09-25 at 13:26 -0700, Eric Anholt wrote: > > > > > > Timothy Arceri writes: > > > > > > > > +static void > > > > +test_put_key_and_get_key(void) > > > > +{ > > > > +   struct program_cache *c

Re: [Mesa-dev] [PATCH 10/12] genX/cmd_buffer: Enable rendering to HiZ

2016-09-26 Thread Nanley Chery
On Mon, Sep 19, 2016 at 01:49:09PM -0700, Nanley Chery wrote: > On Fri, Sep 02, 2016 at 03:16:21PM -0700, Chad Versace wrote: > > On Wed 31 Aug 2016, Nanley Chery wrote: > > > From: Chad Versace > > > > > > Nanley Chery: > > > (rebase) > > > - Resolve conflicts with new anv_batch_emit macro > >

Re: [Mesa-dev] [PATCH] st/va: enable vbr rate control for vaapi encode

2016-09-26 Thread Andy Furniss
Andy Furniss wrote: Zhang, Boyuan wrote: For the overflow concern, unsigned int can handle about 4294Mbit/s, which we thought is big enough for real life cases, right? Yea, but it gets x 100 and my vce can do at least 2160p so for baseline higher than 42.94 mbit is not that extreme. OK so
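
The overflow discussed in this thread can be sketched in a few lines of C. This is only an illustration, assuming the target bitrate is derived as bits-per-second times a percentage divided by 100; the function names below are hypothetical and are not taken from the patch. With a 32-bit intermediate, the product wraps as soon as the bitrate times the percentage exceeds about 4294 million, which at 100 percent is roughly 42.94 Mbit/s.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not the st/va code itself. */

/* 32-bit intermediate: bps * percent wraps once it exceeds UINT32_MAX. */
static uint32_t target_bitrate_u32(uint32_t bps, uint32_t percent)
{
   assert(percent <= 100);
   return bps * percent / 100;
}

/* 64-bit intermediate: the product cannot wrap for these input ranges. */
static uint32_t target_bitrate_u64(uint32_t bps, uint32_t percent)
{
   assert(percent <= 100);
   return (uint32_t)((uint64_t)bps * percent / 100);
}
```

At 40 Mbit/s and 100 percent both variants agree; at 50 Mbit/s the 32-bit variant no longer returns 50000000, while the 64-bit variant does. Computing in float, as the v2 patch mentions, is another way to avoid the wrap.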

[Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-26 Thread Mark Thompson
--- A simple fix to the problem described here: . With this applied, the driver no longer hangs/crashes when vaSyncSurface() is called in places other than for the first time after an encode operation (including a secon

Re: [Mesa-dev] [PATCH] st/va: enable vbr rate control for vaapi encode

2016-09-26 Thread Andy Furniss
Zhang, Boyuan wrote: Hi Andy, For the VBR target/max bit-rate, yes, this is gstreamer-vaapi's current design. User typed bit-rate is actually the max bit-rate not the actual bit-rate, which is a bit confused. Fair enough on the bitrate, though I am still a bit confused on the VBR being constra

[Mesa-dev] [PATCH] intel/blorp_blit: Simplify uncompressed level0 extent assignment

2016-09-26 Thread Nanley Chery
From: Nanley Chery These values are the same. Avoid the extra computation. Signed-off-by: Nanley Chery --- v2: Add a sample count assertion (Jason) src/intel/blorp/blorp_blit.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/src/intel/blorp/blorp_blit.c b/src/int

Re: [Mesa-dev] [PATCH] st/va: enable vbr rate control for vaapi encode

2016-09-26 Thread Zhang, Boyuan
Hi Andy, For the VBR target/max bit-rate, yes, this is gstreamer-vaapi's current design. The bit-rate the user types is actually the max bit-rate, not the actual bit-rate, which is a bit confusing. For the overflow concern, unsigned int can handle about 4294 Mbit/s, which we thought is big enough for rea

Re: [Mesa-dev] [PATCH] intel/blorp_blit: Simplify uncompressed level0 extent assignment

2016-09-26 Thread Jason Ekstrand
I think this is correct given that this function is never called on a multisampled image. We should add an assert(samples == 1) somewhere just to be clear. On Sep 26, 2016 11:53 AM, "Nanley Chery" wrote: > These values are the same. Avoid the extra computation. > > Signed-off-by: Nanley Chery

Re: [Mesa-dev] [PATCH v2 6/6] nv50/ir: teach insnCanLoad() about SHLADD

2016-09-26 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin On Mon, Sep 26, 2016 at 5:02 PM, Samuel Pitoiset wrote: > Commutativity is not allowed with SHLADD, but src2 can accept > loads. To allow the load propagation pass to do its job, add a > special case like for SUCLAMP because src1 is always an immediate. > > This IMAD to

Re: [Mesa-dev] [PATCH v2 5/6] nv50/ir: optimize SHLADD(a, b, c) to MOV((a << b) + c)

2016-09-26 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin On Mon, Sep 26, 2016 at 5:02 PM, Samuel Pitoiset wrote: > Signed-off-by: Samuel Pitoiset > --- > src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cp

Re: [Mesa-dev] [PATCH v2 4/6] nv50/ir: optimize SHLADD(a, b, 0x0) to SHL(a, b)

2016-09-26 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin On Mon, Sep 26, 2016 at 5:02 PM, Samuel Pitoiset wrote: > Signed-off-by: Samuel Pitoiset > --- > src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 8 > 1 file changed, 8 insertions(+) > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peepho

Re: [Mesa-dev] [PATCH v2 3/6] nv50/ir: optimize IMAD to SHLADD in presence of power of 2

2016-09-26 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin On Mon, Sep 26, 2016 at 5:02 PM, Samuel Pitoiset wrote: > Only and only if src1 is a power of 2 we can replace IMAD by SHLADD. > > v2: - use non-negative values and use applyLog2() > > Signed-off-by: Samuel Pitoiset > --- > src/gallium/drivers/nouveau/codegen/nv50_ir_p

Re: [Mesa-dev] [PATCH v2 1/6] nv50/ir: add preliminary support for SHLADD

2016-09-26 Thread Ilia Mirkin
On Mon, Sep 26, 2016 at 5:09 PM, Ilia Mirkin wrote: > IMHO I'd drop the isFloatType() bs in isOpSupported() - that can never > be true, if it is, you're using the instruction very wrong. Otherwise > this is > > Reviewed-by: Ilia Mirkin > > On Mon, Sep 26, 2016 at 5:02 PM, Samuel Pitoiset > wrote

Re: [Mesa-dev] [PATCH v2 2/6] nvc0/ir: add emission for SHLADD

2016-09-26 Thread Ilia Mirkin
On Mon, Sep 26, 2016 at 5:02 PM, Samuel Pitoiset wrote: > Unfortunately, we can't use the emit helpers for GF100/GK110 > because src1 and src2 are swapped. > > v2: - s/emitSHLADD/emitISCADD for GM107 emitter > > Signed-off-by: Samuel Pitoiset > --- > .../drivers/nouveau/codegen/nv50_ir_emit_gk11

Re: [Mesa-dev] [PATCH 01/13] anv: Use blorp for VkCmdFillBuffer

2016-09-26 Thread Jason Ekstrand
On Sep 26, 2016 12:26 PM, "Nanley Chery" wrote: > > On Mon, Sep 26, 2016 at 12:12:32PM -0700, Jason Ekstrand wrote: > > On Sep 26, 2016 11:16 AM, "Nanley Chery" wrote: > > > > > > On Sun, Sep 25, 2016 at 09:59:00AM -0700, Jason Ekstrand wrote: > > > > Signed-off-by: Jason Ekstrand > > > > --- >

Re: [Mesa-dev] [PATCH v2 1/6] nv50/ir: add preliminary support for SHLADD

2016-09-26 Thread Ilia Mirkin
IMHO I'd drop the isFloatType() bs in isOpSupported() - that can never be true, if it is, you're using the instruction very wrong. Otherwise this is Reviewed-by: Ilia Mirkin On Mon, Sep 26, 2016 at 5:02 PM, Samuel Pitoiset wrote: > This instruction is available since SM20 (Fermi) and allow to d

[Mesa-dev] [PATCH v2 2/6] nvc0/ir: add emission for SHLADD

2016-09-26 Thread Samuel Pitoiset
Unfortunately, we can't use the emit helpers for GF100/GK110 because src1 and src2 are swapped. v2: - s/emitSHLADD/emitISCADD for GM107 emitter Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 53 ++ .../drivers/nouveau/codegen/nv50_ir_

[Mesa-dev] [PATCH v2 5/6] nv50/ir: optimize SHLADD(a, b, c) to MOV((a << b) + c)

2016-09-26 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index cbbe34d..9875738 100644 ---

[Mesa-dev] [PATCH v2 1/6] nv50/ir: add preliminary support for SHLADD

2016-09-26 Thread Samuel Pitoiset
This instruction has been available since SM20 (Fermi) and allows doing (a << b) + c in one shot. In some situations, IMAD should be replaced by SHLADD when b is a power of 2, and ADD+SHL should be replaced by SHLADD as well. v2: - fix up the commutative table on nv50/ir Signed-off-by: Samuel Pitoiset
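
As a rough illustration of the arithmetic behind this series (not the nv50/ir code itself), the sketch below models SHLADD as (a << b) + c and rewrites a multiply-add whose multiplier is a power of 2 into that form. All function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of SHLADD: (a << shift) + c, with 32-bit wraparound. */
static uint32_t shladd(uint32_t a, uint32_t shift, uint32_t c)
{
   assert(shift < 32);
   return (a << shift) + c;
}

/* True if v has exactly one bit set. */
static bool is_pow2(uint32_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* log2 of a power of 2, i.e. the index of its single set bit. */
static uint32_t log2_pow2(uint32_t v)
{
   uint32_t n = 0;
   while (v >>= 1)
      n++;
   return n;
}

/* a * b + c, using the shift form when b is a power of 2. */
static uint32_t imad_via_shladd(uint32_t a, uint32_t b, uint32_t c)
{
   if (is_pow2(b))
      return shladd(a, log2_pow2(b), c);
   return a * b + c;
}
```

For example, 7 * 8 + 3 takes the shift path as (7 << 3) + 3, while 7 * 6 + 3 falls back to the plain multiply-add.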

[Mesa-dev] [PATCH v2 3/6] nv50/ir: optimize IMAD to SHLADD in presence of power of 2

2016-09-26 Thread Samuel Pitoiset
We can replace IMAD by SHLADD if and only if src1 is a power of 2. v2: - use non-negative values and use applyLog2() Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/gallium/drivers/nouve

[Mesa-dev] [PATCH v2 6/6] nv50/ir: teach insnCanLoad() about SHLADD

2016-09-26 Thread Samuel Pitoiset
Commutativity is not allowed with SHLADD, but src2 can accept loads. To allow the load propagation pass to do its job, add a special case like for SUCLAMP because src1 is always an immediate. This IMAD to SHLADD optimization helps a bunch of shaders from Tomb Raider, Victor Vran, UE4 demos (+15% p

[Mesa-dev] [PATCH v2 4/6] nv50/ir: optimize SHLADD(a, b, 0x0) to SHL(a, b)

2016-09-26 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index c9d5b5f..cbbe34d 10064

Re: [Mesa-dev] [PATCH] i965: Only emit 1 viewport when possible.

2016-09-26 Thread Anuj Phogat
On Mon, Sep 26, 2016 at 11:23 AM, Kenneth Graunke wrote: > In core profile, we support up to 16 viewports. However, in the > majority of cases, only 1 of them is actually used - we only need > the others if the last shader stage prior to the rasterizer writes > gl_ViewportIndex. > > Processing al

Re: [Mesa-dev] [PATCH] i965: Only emit 1 viewport when possible.

2016-09-26 Thread Eric Anholt
Kenneth Graunke writes: > In core profile, we support up to 16 viewports. However, in the > majority of cases, only 1 of them is actually used - we only need > the others if the last shader stage prior to the rasterizer writes > gl_ViewportIndex. > > Processing all 16 viewports adds additional C

Re: [Mesa-dev] [PATCH 01/13] anv: Use blorp for VkCmdFillBuffer

2016-09-26 Thread Nanley Chery
On Mon, Sep 26, 2016 at 12:12:32PM -0700, Jason Ekstrand wrote: > On Sep 26, 2016 11:16 AM, "Nanley Chery" wrote: > > > > On Sun, Sep 25, 2016 at 09:59:00AM -0700, Jason Ekstrand wrote: > > > Signed-off-by: Jason Ekstrand > > > --- > > > src/intel/vulkan/anv_blorp.c | 106 >

Re: [Mesa-dev] [PATCH 01/13] anv: Use blorp for VkCmdFillBuffer

2016-09-26 Thread Jason Ekstrand
On Sep 26, 2016 11:16 AM, "Nanley Chery" wrote: > > On Sun, Sep 25, 2016 at 09:59:00AM -0700, Jason Ekstrand wrote: > > Signed-off-by: Jason Ekstrand > > --- > > src/intel/vulkan/anv_blorp.c | 106 + > > src/intel/vulkan/anv_meta_clear.c | 120 ---

Re: [Mesa-dev] [PATCH 2/2] i965: use L3 data cache for SSBOs

2016-09-26 Thread Jason Ekstrand
Looks good to me. Curro, do you see anything wrong with this? --Jason On Sep 26, 2016 7:31 AM, "Lionel Landwerlin" wrote: > Anv programs the hardware to use L3 data cache if we use either SSBOs or > images in the shaders, we can program i965 the same way. > > gl_shader_program has a bit of a co

Re: [Mesa-dev] [PATCH 1/2] i965: drop copy of NumImages

2016-09-26 Thread Jason Ekstrand
Not a big fan. This makes the prog_data structures less self-contained. You shouldn't have to look up an almost unrelated structure in order to figure out how big this one is. Also, I've been trying to move us in the direction of *more* stuff in prog_data, not less, so that we aren't looking up th

Re: [Mesa-dev] [PATCH 1/5] glsl: move some uniform linking code to new link_setup_uniform_remap_tables()

2016-09-26 Thread Kenneth Graunke
On Sunday, September 25, 2016 10:50:24 PM PDT Timothy Arceri wrote: > This makes link_assign_uniform_locations() easier to follow. > --- > src/compiler/glsl/link_uniforms.cpp | 330 > +++- > src/compiler/glsl/linker.cpp| 4 +- > src/compiler/glsl/linker.h

[Mesa-dev] [PATCH] intel/blorp_blit: Simplify uncompressed level0 extent assignment

2016-09-26 Thread Nanley Chery
These values are the same. Avoid the extra computation. Signed-off-by: Nanley Chery --- src/intel/blorp/blorp_blit.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c index af46389..1c878e8 100644 --- a/src/inte

Re: [Mesa-dev] [PATCH 02/13] anv/meta: Roll clear_image into CmdClearDepthStencilImage

2016-09-26 Thread Nanley Chery
On Sun, Sep 25, 2016 at 09:59:01AM -0700, Jason Ekstrand wrote: > It is now the only caller so there's no sense in keeping things split out. > > Signed-off-by: Jason Ekstrand > --- > src/intel/vulkan/anv_meta_clear.c | 84 > +-- > 1 file changed, 28 insertion

[Mesa-dev] [PATCH] i965: Only emit 1 viewport when possible.

2016-09-26 Thread Kenneth Graunke
In core profile, we support up to 16 viewports. However, in the majority of cases, only 1 of them is actually used - we only need the others if the last shader stage prior to the rasterizer writes gl_ViewportIndex. Processing all 16 viewports adds additional CPU overhead, which hurts CPU-intensiv
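
The decision described in this patch can be sketched as a single predicate. This is a minimal sketch under stated assumptions: the function name and its parameters are hypothetical, and the real driver derives the "writes gl_ViewportIndex" information from its own shader state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: how many viewports need to be processed.
 * Only a last pre-rasterizer stage that writes gl_ViewportIndex can
 * select a viewport other than the first one. */
static unsigned viewports_to_emit(bool last_stage_writes_viewport_index,
                                  unsigned max_viewports)
{
   assert(max_viewports >= 1);
   return last_stage_writes_viewport_index ? max_viewports : 1u;
}
```

With a limit of 16, this yields 1 in the common case and 16 only when the index is actually written.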

Re: [Mesa-dev] [PATCH v3 13/14] nvc0: expose ARB_compute_variable_group_size

2016-09-26 Thread Samuel Pitoiset
On 09/26/2016 07:27 PM, Ilia Mirkin wrote: FWIW this limits it to 32 regs on Fermi. IMO that's pretty limiting, esp given how shitty our RA is. I think we should do 512 for Fermi and 1024 for Kepler+. [A matching adjustment will be needed in codegen.] Yep, I will improve it, but this can be d

Re: [Mesa-dev] [PATCH 01/13] anv: Use blorp for VkCmdFillBuffer

2016-09-26 Thread Nanley Chery
On Sun, Sep 25, 2016 at 09:59:00AM -0700, Jason Ekstrand wrote: > Signed-off-by: Jason Ekstrand > --- > src/intel/vulkan/anv_blorp.c | 106 + > src/intel/vulkan/anv_meta_clear.c | 120 > -- > 2 files changed, 96 insertions(

[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2016-09-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97879 --- Comment #22 from Silvan Jegen --- Created attachment 126796 --> https://bugs.freedesktop.org/attachment.cgi?id=126796&action=edit perf report of RocketLeague stalling/freezing -- You are receiving this mail because: You are the assignee f

[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2016-09-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97879 --- Comment #21 from Silvan Jegen --- (In reply to Eero Tamminen from comment #20) > Best would be to do (e.g. from SSH console): > # perf record -a > ^C > > During the game freeze. I have a dual screen setup and ran 'perf record -a' on a t

Re: [Mesa-dev] [PATCH V2] anv/blorp: Handle zero width/height blits in blorp_copy()

2016-09-26 Thread Nanley Chery
On Mon, Sep 26, 2016 at 10:22:43AM -0700, Anuj Phogat wrote: > V2: Move the check from copy_buffer_to_image() to blorp_copy(). (Nanley) > > Signed-off-by: Anuj Phogat > Cc: Nanley Chery > --- This patch is Reviewed-by: Nanley Chery > src/intel/blorp/blorp_blit.c | 5 - > 1 file changed,

[Mesa-dev] [PATCH V2] anv/blorp: Handle zero width/height blits in blorp_copy()

2016-09-26 Thread Anuj Phogat
V2: Move the check from copy_buffer_to_image() to blorp_copy(). (Nanley) Signed-off-by: Anuj Phogat Cc: Nanley Chery --- src/intel/blorp/blorp_blit.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c index af46389

Re: [Mesa-dev] [PATCH v3 13/14] nvc0: expose ARB_compute_variable_group_size

2016-09-26 Thread Ilia Mirkin
FWIW this limits it to 32 regs on Fermi. IMO that's pretty limiting, esp given how shitty our RA is. I think we should do 512 for Fermi and 1024 for Kepler+. [A matching adjustment will be needed in codegen.] On Mon, Sep 26, 2016 at 1:23 PM, Samuel Pitoiset wrote: > Let's return the same number o

[Mesa-dev] [PATCH v3 12/14] nv50/ir: use 1024 threads/block for variable local size

2016-09-26 Thread Samuel Pitoiset
When a variable local size is defined as specified by ARB_compute_variable_group_size, the fixed local size is set to 0 and a SIGFPE occurs when we compute the maximum number of regs. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_target.h | 3 ++- 1 file changed,
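
The SIGFPE mentioned here is an integer division by zero. A minimal sketch of the guard, assuming the register budget is computed as register-file size divided by threads per block; the helper name and the register-file figure used in the example are assumptions, not values taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Thread count to assume when the local size is variable
 * (fixed local size reported as 0), per the patch subject. */
#define VARIABLE_SIZE_THREADS 1024u

/* Hypothetical helper: registers available per thread. */
static uint32_t max_regs_per_thread(uint32_t regfile_size,
                                    uint32_t fixed_threads)
{
   uint32_t threads = fixed_threads ? fixed_threads : VARIABLE_SIZE_THREADS;
   assert(threads != 0); /* the fallback guarantees a non-zero divisor */
   return regfile_size / threads;
}
```

Assuming a 32768-entry register file, a variable local size gives 32768 / 1024 = 32 registers per thread, while a fixed size of 256 threads gives 128.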

[Mesa-dev] [PATCH v3 14/14] docs: mark ARB_compute_variable_group_size as done for nvc0

2016-09-26 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- docs/features.txt | 2 +- docs/relnotes/12.1.0.html | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/features.txt b/docs/features.txt index fbb3952..6cc429a 100644 --- a/docs/features.txt +++ b/docs/features.txt @@ -279,7 +279,7

[Mesa-dev] [PATCH v3 08/14] gallium: add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK

2016-09-26 Thread Samuel Pitoiset
v3: - use a new case statement in r600_pipe_common.c - fix compilation of softpipe... Signed-off-by: Samuel Pitoiset --- src/gallium/docs/source/screen.rst | 4 src/gallium/drivers/ilo/ilo_screen.c | 2 ++ src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 ++ src/

[Mesa-dev] [PATCH v3 11/14] st/mesa: expose ARB_compute_variable_group_size

2016-09-26 Thread Samuel Pitoiset
This extension is only exposed if the underlying driver supports ARB_compute_shader and if PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK is set. v3: - initialize max_variable_threads_per_block to 0 v2: - expose the ext based on that new cap Signed-off-by: Samuel Pitoiset --- src/mesa/state_tracke

[Mesa-dev] [PATCH v3 09/14] st/mesa: add mapping for SYSTEM_VALUE_LOCAL_GROUP_SIZE

2016-09-26 Thread Samuel Pitoiset
gl_LocalGroupSizeARB can be translated into TGSI_SEMANTIC_BLOCK_SIZE which represents the block size in threads. Signed-off-by: Samuel Pitoiset Reviewed-by: Marek Olšák --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/state_tracker/st

[Mesa-dev] [PATCH v3 03/14] glsl: add enable flags for ARB_compute_variable_group_size

2016-09-26 Thread Samuel Pitoiset
This also initializes the default values for the standalone compiler. Signed-off-by: Samuel Pitoiset Reviewed-by: Ian Romanick --- src/compiler/glsl/glsl_parser_extras.cpp | 1 + src/compiler/glsl/glsl_parser_extras.h | 2 ++ src/compiler/glsl/standalone.cpp | 4 src/

[Mesa-dev] [PATCH v3 13/14] nvc0: expose ARB_compute_variable_group_size

2016-09-26 Thread Samuel Pitoiset
Let's return the same number of threads per block for both fixed and variable sizes. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/galli

[Mesa-dev] [PATCH v3 10/14] st/mesa: add support for dispatching a variable local size

2016-09-26 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset Reviewed-by: Marek Olšák --- src/mesa/state_tracker/st_cb_compute.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_compute.c b/src/mesa/state_tracker/st_cb_compute.c index 88c1ee2..ccc5dc2 100644 -

[Mesa-dev] [PATCH v3 06/14] glsl/linker: handle errors when a variable local size is used

2016-09-26 Thread Samuel Pitoiset
Compute shaders can now include a fixed local size as defined by ARB_compute_shader or a variable size as defined by ARB_compute_variable_group_size. v2: - update formatting spec quotations (Ian) - various cosmetic changes (Ian) Signed-off-by: Samuel Pitoiset Reviewed-by: Ian Romanick ---

[Mesa-dev] [PATCH v3 05/14] glsl: reject compute shaders with fixed and variable local size

2016-09-26 Thread Samuel Pitoiset
The ARB_compute_variable_group_size specification explains that when a compute shader includes both a fixed and a variable local size, a compile-time error occurs. v2: - update formatting spec quotations (Ian) Signed-off-by: Samuel Pitoiset --- src/compiler/glsl/ast_to_hir.cpp | 14

[Mesa-dev] [PATCH v3 07/14] glsl: add gl_LocalGroupSizeARB as a system value

2016-09-26 Thread Samuel Pitoiset
v2: - only add it if the ext is enabled (Ilia) Signed-off-by: Samuel Pitoiset Reviewed-by: Ian Romanick --- src/compiler/glsl/builtin_variables.cpp | 6 ++ src/compiler/shader_enums.h | 1 + 2 files changed, 7 insertions(+) diff --git a/src/compiler/glsl/builtin_variables.cpp

[Mesa-dev] [PATCH v3 00/14] add support for ARB_compute_variable_group_size

2016-09-26 Thread Samuel Pitoiset
v3: - use a new case statement in r600_pipe_common.c - fix compilation with softpipe - initialize max_variable_threads_per_block to 0 v2: - update formatting spec quotations - add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK - expose the ext based on that new cap - add missi

[Mesa-dev] [PATCH v3 01/14] glapi: add entry points for GL_ARB_compute_variable_group_size

2016-09-26 Thread Samuel Pitoiset
v2: - correctly sort that new extension (Ian) - fix up the comment (Ian) Signed-off-by: Samuel Pitoiset Reviewed-by: Ian Romanick --- .../glapi/gen/ARB_compute_variable_group_size.xml | 25 ++ src/mapi/glapi/gen/Makefile.am | 1 + src/mapi/glapi/gen

[Mesa-dev] [PATCH v3 04/14] glsl: process local_size_variable input qualifier

2016-09-26 Thread Samuel Pitoiset
This is the new layout qualifier introduced by ARB_compute_variable_group_size, which allows the use of a variable work group size. Signed-off-by: Samuel Pitoiset Reviewed-by: Ian Romanick --- src/compiler/glsl/ast.h | 5 + src/compiler/glsl/ast_type.cpp | 6 ++

[Mesa-dev] [PATCH v3 02/14] mesa/main: add support for ARB_compute_variable_group_size

2016-09-26 Thread Samuel Pitoiset
v2: - update formatting spec quotations (Ian) - move the total_invocations check outside of the loop (Ian) Signed-off-by: Samuel Pitoiset --- src/mesa/main/api_validate.c | 96 src/mesa/main/api_validate.h | 4 ++ src/mesa/main/compute.c

Re: [Mesa-dev] [PATCH 03/88] glsl: Add initial functions to

2016-09-26 Thread Eric Anholt
Timothy Arceri writes: > On Sun, 2016-09-25 at 13:26 -0700, Eric Anholt wrote: >> Timothy Arceri writes: >> > +static void >> > +test_put_key_and_get_key(void) >> > +{ >> > +   struct program_cache *cache; >> > +   bool result; >> > + >> > +   uint8_t key_a[20] = {  0,  1,  2,  3,  4,  5,  6,  7

Re: [Mesa-dev] [PATCH 2/2] st/mesa: enable ARB_ES3_2_compatibility when enough available

2016-09-26 Thread Marek Olšák
For the series: Acked-by: Marek Olšák Marek On Fri, Sep 23, 2016 at 2:52 AM, Ilia Mirkin wrote: > ping > > On Tue, Sep 13, 2016 at 8:54 PM, Ilia Mirkin wrote: >> Signed-off-by: Ilia Mirkin >> --- >> src/mesa/state_tracker/st_extensions.c | 20 >> 1 file changed, 20 inse

Re: [Mesa-dev] [RFC] egl: stop claiming support for pbuffer + msaa (RFC)

2016-09-26 Thread Marek Olšák
Sounds good to me. I think only legacy applications would use pbuffers. There is no reason to use pbuffers on anything that has GL_ARB_framebuffer_object (pbuffers were used to do render-to-texture when FBOs didn't exist). Reviewed-by: Marek Olšák Marek On Mon, Sep 26, 2016 at 9:41 AM, Tapani Pä

Re: [Mesa-dev] Was: Re: [PATCH] r600g: Add support for PK2H/UP2H

2016-09-26 Thread Marek Olšák
Pushed. Thanks for the reminder. Marek On Wed, Sep 21, 2016 at 11:20 PM, Dieter Nützel wrote: > Ping. - Again. > > Ilia and Marek voted for it. > > Any progress? > Anyone, Marek, Nicolai? > Should I rebase? > > Dieter > >> [Mesa-dev] [PATCH] r600g: Add support for PK2H/UP2H >> >> Glenn Kennard g

[Mesa-dev] [PATCH 2/2] i965: use L3 data cache for SSBOs

2016-09-26 Thread Lionel Landwerlin
Anv programs the hardware to use the L3 data cache if we use either SSBOs or images in the shaders; we can program i965 the same way. gl_shader_program has a somewhat confusingly named field, 'NumAtomicBuffers'. It doesn't tell how many buffers are accessed by the shader in an atomic way but instead

[Mesa-dev] [PATCH 1/2] i965: drop copy of NumImages

2016-09-26 Thread Lionel Landwerlin
We can access this value through gl_shader_program. Signed-off-by: Lionel Landwerlin Cc: Jason Ekstrand --- src/mesa/drivers/dri/i965/brw_compiler.h | 1 - src/mesa/drivers/dri/i965/brw_cs.c| 1 - src/mesa/drivers/dri/i965/brw_gs.c| 1 - src/mesa/drivers/dri/i965/brw_tcs.c

Re: [Mesa-dev] [PATCH 03/88] glsl: Add initial functions to implement an on-disk cache

2016-09-26 Thread Brian Paul
On 09/23/2016 11:24 PM, Timothy Arceri wrote: From: Carl Worth This code provides for an on-disk cache of objects. Objects are stored and retrieved via names that are arbitrary 20-byte sequences, (intended to be SHA-1 hashes of something identifying for the content). The directory used for the
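
One common way to turn a 20-byte key like the one described here into something usable as a file name is to hex-encode it. The sketch below shows only that encoding step; it is an assumption about how such a cache might name its entries, not the layout used by the patch, and key_to_hex is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define KEY_SIZE 20 /* size of a SHA-1 digest in bytes */

/* Hypothetical helper: write the key as 40 lowercase hex digits
 * followed by a terminating NUL. */
static void key_to_hex(const uint8_t key[KEY_SIZE],
                       char out[2 * KEY_SIZE + 1])
{
   assert(key != NULL && out != NULL);
   for (size_t i = 0; i < KEY_SIZE; i++)
      snprintf(out + 2 * i, 3, "%02x", key[i]);
}
```

Each byte becomes two characters, so the resulting string is always 40 characters long regardless of the key's content.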

Re: [Mesa-dev] [PATCH] v2 st/va Avoid VBR bitrate calculation overflow

2016-09-26 Thread Christian König
Am 26.09.2016 um 11:44 schrieb Andy Furniss: VBR bitrate calc needs 64 bits at high rates. v2 use float. Signed-off-by: Andy Furniss Reviewed-by: Christian König . Since Leo is on vacation I will probably collect all remaining mesa patches and commit them later today. Christian. --- s

[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2016-09-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97879 --- Comment #20 from Eero Tamminen --- Apitrace's own CPU overhead is so high that it's not very good for identifying CPU bottlenecks. Best would be to do (e.g. from SSH console): # perf record -a ^C During the game freeze. And provide pro

Re: [Mesa-dev] [PATCH 10/15] glsl/standalone: Optimize dead variable declarations

2016-09-26 Thread Tapani Pälli
On 09/16/2016 01:12 AM, Ian Romanick wrote: From: Ian Romanick We didn't bother with this in the regular compiler because it doesn't change the generated code. In the stand-alone compiler, this can clutter the output with useless variables. It's especially bad after functions are inlined bu

Re: [Mesa-dev] [PATCH mesa 4/4] nir/spirv: add spirv2nir binary to .gitignore

2016-09-26 Thread Eric Engestrom
On Sun, Sep 25, 2016 at 10:49:29AM -0700, Jason Ekstrand wrote: > I hope you realize that this is the only truly useful change in the series. > :-). Still, no reason why our silly little helpers shouldn't be correct. Yeah, I know :P I got the Coverity report like everyone and thought we might as w

Re: [Mesa-dev] [PATCH] st/va: enable vbr rate control for vaapi encode

2016-09-26 Thread Andy Furniss
Andy Furniss wrote: Andy Furniss wrote: Andy Furniss wrote: https://patchwork.freedesktop.org/patch/112040/ Hmm that got mungled I'll try again later going to be AFK for a while. This one worked. https://patchwork.freedesktop.org/patch/112069/ Or maybe a version that uses float - I don'

[Mesa-dev] [PATCH] v2 st/va Avoid VBR bitrate calculation overflow

2016-09-26 Thread Andy Furniss
VBR bitrate calc needs 64 bits at high rates. v2 use float. Signed-off-by: Andy Furniss --- src/gallium/state_trackers/va/picture.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/va/picture.c b/src/gallium/state_trackers/va/picture.c index 7f3d96d

[Mesa-dev] [RFC] egl: stop claiming support for pbuffer + msaa (RFC)

2016-09-26 Thread Tapani Pälli
This fixes a crash in the egl-create-msaa-pbuffer-surface Piglit test and the same crash in many dEQP EGL tests. I also found that a Qt example worked around this crash: https://bugreports.qt.io/browse/QTBUG-47509 Signed-off-by: Tapani Pälli --- This is RFC as I'm not sure if we are su

Re: [Mesa-dev] [llvm] r282237 - [InstCombine] Fix for PR29124: reduce insertelements to shufflevector

2016-09-26 Thread Michel Dänzer
Hi Alexey, On 23/09/16 06:14 PM, Alexey Bataev via llvm-commits wrote: > Author: abataev > Date: Fri Sep 23 04:14:08 2016 > New Revision: 282237 > > URL: http://llvm.org/viewvc/llvm-project?rev=282237&view=rev > Log: > [InstCombine] Fix for PR29124: reduce insertelements to shufflevector This