On Tue, Mar 29, 2016 at 8:49 PM, Matt Turner wrote:
> On Fri, Mar 25, 2016 at 4:12 PM, Jason Ekstrand
> wrote:
> > ---
> > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 32
> ++
> > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
On Saturday, April 2, 2016 3:03:57 PM PDT Timothy Arceri wrote:
> Previously we stored the buffer block index, i.e. the index of a combined
> ubo/ssbo list.
>
> Fixes several dEQP-GLES31.functional tests:
> - program_interface_query.uniform.block_index.block_array
> -
On Saturday, April 2, 2016 3:03:56 PM PDT Timothy Arceri wrote:
> This allows us to simplify the code and drop InterfaceBlockStageIndex
> which is a per stage array of integers the size of all blocks in the
> program combined including duplicates across stages. Adding a stage
> ref per block will
On Saturday, April 2, 2016 3:03:55 PM PDT Timothy Arceri wrote:
> This changes the code to use the buffer counts stored for each stage
> rather than counting from scratch. It also moves the checks outside
> of the for loop which means we now just get a single link error
> message if we go over the
On Saturday, April 2, 2016 3:03:54 PM PDT Timothy Arceri wrote:
> We already have a count of active SSBOs per stage so use it.
> ---
> src/compiler/glsl/linker.cpp | 8 +---
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/src/compiler/glsl/linker.cpp
On Saturday, April 2, 2016 3:03:52 PM PDT Timothy Arceri wrote:
> Since 8683d54d2be825 there is now a single instance of the buffer
> block information that needs to be updated rather than one instance
> for each stage.
> ---
> src/compiler/glsl/link_uniform_initializers.cpp | 32
Series is
Reviewed-by: Ilia Mirkin
On Sat, Apr 2, 2016 at 12:17 AM, Kenneth Graunke wrote:
> If the GL_ARB_shader_draw_parameters extension is enabled, we'll already
> have a gl_BaseVertex variable. It will have var->how_declared set to
>
On Thursday, March 31, 2016 3:00:37 PM PDT Ilia Mirkin wrote:
> On Thu, Mar 31, 2016 at 2:53 PM, Kenneth Graunke
wrote:
> > System values are just built-in input variables that we've opted to
> > special-case out of convenience. We need to consider all inputs,
> >
We occasionally generate variables internally that we want to exclude
from the program resource list, as applications won't be expecting them
to be present.
The next patch will make use of this.
Signed-off-by: Kenneth Graunke
---
src/compiler/glsl/linker.cpp | 2 +-
1
If the GL_ARB_shader_draw_parameters extension is enabled, we'll already
have a gl_BaseVertex variable. It will have var->how_declared set to
ir_var_declared_implicitly, and will appear in the program resource
list.
If not, we make one for internal use. We don't want it to be listed
in the
This allows us to simplify the code and drop InterfaceBlockStageIndex
which is a per stage array of integers the size of all blocks in the
program combined including duplicates across stages. Adding a stage
ref per block will use less memory.
---
src/compiler/glsl/linker.cpp | 20
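A minimal sketch of the "stage ref per block" idea described above, assuming the reference is kept as a bitmask with one bit per shader stage. The stage list, the Block class and the block name are hypothetical and are not taken from the patch; the point is only that one small integer per block can replace a per-stage array sized for all blocks.

```python
# Hypothetical stage list and names, for illustration only.
STAGES = ("vertex", "fragment", "compute")


class Block:
    def __init__(self, name):
        self.name = name
        self.stageref = 0  # assumed layout: one bit per stage using the block

    def mark_used(self, stage):
        # Set the bit that corresponds to this stage.
        self.stageref |= 1 << STAGES.index(stage)

    def used_in(self, stage):
        # Test the bit that corresponds to this stage.
        return bool(self.stageref & (1 << STAGES.index(stage)))


ubo = Block("Transforms")
ubo.mark_used("vertex")
ubo.mark_used("fragment")
```

With this layout, asking whether a stage uses a block is a single bit test instead of a lookup in a separate per-stage index array.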
Since 8683d54d2be825 there is now a single instance of the buffer
block information that needs to be updated rather than one instance
for each stage.
---
src/compiler/glsl/link_uniform_initializers.cpp | 32 +
1 file changed, 6 insertions(+), 26 deletions(-)
diff --git
This will allow us to use them when checking resources in a
following patch and clean up a bunch of code.
---
src/compiler/glsl/linker.cpp | 54 ++--
1 file changed, 27 insertions(+), 27 deletions(-)
diff --git a/src/compiler/glsl/linker.cpp
is_ubo_var is true for both UBOs and SSBOs
---
src/compiler/glsl/link_uniforms.cpp | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/compiler/glsl/link_uniforms.cpp
b/src/compiler/glsl/link_uniforms.cpp
index c84bb2f..3728d7d 100644
---
Previously we stored the buffer block index, i.e. the index of a combined
ubo/ssbo list.
Fixes several dEQP-GLES31.functional tests:
- program_interface_query.uniform.block_index.block_array
- program_interface_query.uniform.block_index.named_block
-
We already have a count of active SSBOs per stage so use it.
---
src/compiler/glsl/linker.cpp | 8 +---
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 649fb7c..c9eaa6b 100644
--- a/src/compiler/glsl/linker.cpp
+++
This changes the code to use the buffer counts stored for each stage
rather than counting from scratch. It also moves the checks outside
of the for loop which means we now just get a single link error
message if we go over the max rather than X error messages where X
is the number we have exceeded
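To illustrate why moving the check out of the loop changes the number of messages, here is a small sketch. It is not the linker code: the function names, the message text and the limit of 12 are made up for the example.

```python
def errors_checked_inside_loop(num_blocks, limit):
    # Old shape: count from scratch and check on every iteration, so each
    # block past the limit produces its own message.
    errors = []
    count = 0
    for _ in range(num_blocks):
        count += 1
        if count > limit:
            errors.append("too many blocks (%d/%d)" % (count, limit))
    return errors


def errors_checked_after_loop(num_blocks, limit):
    # New shape: use the count that is already stored and check it once.
    if num_blocks > limit:
        return ["too many blocks (%d/%d)" % (num_blocks, limit)]
    return []


old_messages = errors_checked_inside_loop(15, 12)
new_messages = errors_checked_after_loop(15, 12)
```

Going three blocks over the limit yields three messages in the first form and a single message in the second.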
I did toy with the idea of adding a
DONT_REALLY_FLUSH_JUST_CREATE_A_FENCE flag to existing pctx->flush()..
I'm not really sure I could call that better. And either way, we want
the ctx ptr in fence_finish(), otherwise we are hiding the ctx ptr
internally in the driver's pipe_fence struct so that
Build mesa 777 completed
Commit ef1b397b07 by Jordan Justen on 3/12/2016 12:44 AM:
glsl: Don't require matching centroid qualifiers

Note: This patch appears to violate older OpenGL and OpenGLES specs.

The OpenGLES GLSL 3.1 and OpenGL GLSL 4.3
Build mesa 776 failed
Commit cc1320220f by Jason Ekstrand on 4/1/2016 10:44 PM:
nir/gather_info: Add an assert for supported stages
---
src/gallium/drivers/swr/swr_shader.cpp | 98 +++---
src/gallium/drivers/swr/swr_shader.h | 41 +---
src/gallium/drivers/swr/swr_state.cpp | 153 +++--
src/gallium/drivers/swr/swr_state.h| 6 +-
---
src/gallium/drivers/swr/swr_screen.cpp | 4
1 file changed, 4 insertions(+)
diff --git a/src/gallium/drivers/swr/swr_screen.cpp
b/src/gallium/drivers/swr/swr_screen.cpp
index f9e52be..4a4992f 100644
--- a/src/gallium/drivers/swr/swr_screen.cpp
+++
---
src/gallium/drivers/swr/rasterizer/common/os.h | 6 --
1 file changed, 6 deletions(-)
diff --git a/src/gallium/drivers/swr/rasterizer/common/os.h
b/src/gallium/drivers/swr/rasterizer/common/os.h
index 5794f3f..427ebc1 100644
--- a/src/gallium/drivers/swr/rasterizer/common/os.h
+++
---
src/gallium/drivers/swr/rasterizer/core/api.cpp| 8 +-
src/gallium/drivers/swr/rasterizer/core/context.h | 38 +++
.../drivers/swr/rasterizer/core/threads.cpp| 126 +++--
src/gallium/drivers/swr/rasterizer/core/threads.h | 4 +-
No perf loss detected
---
src/gallium/drivers/swr/rasterizer/core/frontend.cpp | 10 --
src/gallium/drivers/swr/rasterizer/core/pa.h | 10 --
src/gallium/drivers/swr/rasterizer/core/threads.cpp | 5 -
src/gallium/drivers/swr/rasterizer/core/tilemgr.h| 2 +-
4
---
src/gallium/drivers/swr/rasterizer/core/api.cpp | 16 ++--
src/gallium/drivers/swr/rasterizer/core/arena.h | 2 +-
src/gallium/drivers/swr/rasterizer/core/backend.cpp | 2 +-
3 files changed, 16 insertions(+), 4 deletions(-)
diff --git
- Check for unused blocks every few frames or every 64K draws
- Delete data unused since the last check if total unused data is > 20MB
Doesn't seem to cause a perf degradation
---
src/gallium/drivers/swr/rasterizer/core/api.cpp | 17 +-
src/gallium/drivers/swr/rasterizer/core/arena.h | 230
---
src/gallium/drivers/swr/rasterizer/core/api.cpp| 20 +++
.../drivers/swr/rasterizer/core/backend.cpp| 31 ++---
src/gallium/drivers/swr/rasterizer/core/context.h | 2 ++
.../drivers/swr/rasterizer/core/depthstencil.h | 40 +-
---
src/gallium/drivers/swr/rasterizer/core/threads.cpp | 4
1 file changed, 4 insertions(+)
diff --git a/src/gallium/drivers/swr/rasterizer/core/threads.cpp
b/src/gallium/drivers/swr/rasterizer/core/threads.cpp
index 1a11175..056003e 100644
---
More development of the swr rasterizer.
Tim Rowley (11):
swr: [rasterizer] Misc fixes identified by static code analysis
swr: [rasterizer core] Affinitize thread scratch space to numa node of
worker
swr: [rasterizer common] win32 build fixups
swr: [rasterizer core] Quantize depth to
No need for 256 pointers per DC.
---
src/gallium/drivers/swr/rasterizer/core/api.cpp| 31 +++---
.../drivers/swr/rasterizer/core/backend.cpp| 8 +++---
src/gallium/drivers/swr/rasterizer/core/backend.h | 2 +-
src/gallium/drivers/swr/rasterizer/core/context.h | 16
---
src/gallium/drivers/swr/rasterizer/common/os.h | 4 +---
src/gallium/drivers/swr/rasterizer/core/knobs.h | 3 +++
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/src/gallium/drivers/swr/rasterizer/common/os.h
b/src/gallium/drivers/swr/rasterizer/common/os.h
index
Future proofing
---
src/gallium/drivers/swr/rasterizer/core/clip.cpp | 4 ++--
src/gallium/drivers/swr/rasterizer/core/clip.h | 4 ++--
src/gallium/drivers/swr/rasterizer/core/pa.h | 4 ++--
src/gallium/drivers/swr/rasterizer/core/rasterizer.cpp | 16
4
---
src/gallium/drivers/swr/rasterizer/core/api.cpp | 8
src/gallium/drivers/swr/rasterizer/core/tilemgr.cpp | 1 -
2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/src/gallium/drivers/swr/rasterizer/core/api.cpp
b/src/gallium/drivers/swr/rasterizer/core/api.cpp
index
I'll admit I'm not an expert on this, but I have a bad feeling about it.
Do you really need another per-context fence_finish function?
This looks to me like rather than improving the existing api, it throws
another one at the same problem, which is to be sort of used in parallel
making things confusing
Found with grep and inspection. Test compiled on RPi hw.
Assists any future effort to remove TGSI as an intermediate stage.
Signed-off-by: Rhys Kidd
---
src/gallium/drivers/vc4/vc4_program.c | 1 -
1 file changed, 1 deletion(-)
diff --git
This patch enables an EGL extension, EGL_KHR_reusable_sync.
This new extension basically provides a way for multiple APIs or
threads to be executed synchronously via a "reusable sync"
primitive shared by those threads/API calls.
This was implemented based on the specification at
Patches 1-4 and 6-9 are:
Reviewed-by: Timothy Arceri
Patch 10 will give the correct result as discussed however I wouldn't
mind seeing if I can come up with a patch that avoids the extra
processing at query time. We really need to clean up these structures.
Reviewed-by: Rob Clark
Cc: Kenneth Graunke
v2: Pull get_io_offset into nir_gather_info and add an assert that our
shader is for one of the supported stages. (Ken)
---
src/compiler/Makefile.sources | 1 +
src/compiler/nir/Makefile.sources
On Fri, Apr 1, 2016 at 6:50 PM, Bas Nieuwenhuizen
wrote:
> I will change that to
>
> TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH etc.
>
> since most other properties seem to use S instead of P,
> unless you have any objections.
Indeed they do - no objections from me.
>
> - Bas
I will change that to
TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH etc.
since most other properties seem to use S instead of P,
unless you have any objections.
- Bas
On Sat, Apr 2, 2016 at 12:37 AM, Ilia Mirkin wrote:
> On Fri, Apr 1, 2016 at 6:32 PM, Bas Nieuwenhuizen
>
Reviewed-by: Ilia Mirkin
On Fri, Apr 1, 2016 at 6:32 PM, Bas Nieuwenhuizen
wrote:
> Signed-off-by: Bas Nieuwenhuizen
> ---
> src/gallium/drivers/trace/tr_dump_state.c | 4 +++-
>
On Fri, Apr 1, 2016 at 6:37 PM, Ilia Mirkin wrote:
> On Fri, Apr 1, 2016 at 6:32 PM, Bas Nieuwenhuizen
> wrote:
>> The value 0 for unknown has been chosen so that
>> drivers using tgsi_scan_shader do not need to detect
>> missing properties if
On Fri, Apr 1, 2016 at 6:32 PM, Bas Nieuwenhuizen
wrote:
> For radeonsi, native and TGSI use different compilers and this results
> in different limits for different IR's.
>
> The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE
> and MAX_THREADS_PER_BLOCK
On Fri, Apr 1, 2016 at 6:32 PM, Bas Nieuwenhuizen
wrote:
> The value 0 for unknown has been chosen so that
> drivers using tgsi_scan_shader do not need to detect
> missing properties if they zero-initialize the struct.
>
> Signed-off-by: Bas Nieuwenhuizen
For radeonsi, native and TGSI use different compilers and this results
in different limits for different IR's.
The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE
and MAX_THREADS_PER_BLOCK params, but I added a few others as shader
related that seemed like they would also typically
Signed-off-by: Bas Nieuwenhuizen
---
src/gallium/drivers/trace/tr_dump_state.c | 4 +++-
src/gallium/include/pipe/p_state.h| 1 +
src/gallium/state_trackers/clover/core/kernel.cpp | 1 +
src/gallium/tests/trivial/compute.c | 1 +
The value 0 for unknown has been chosen so that
drivers using tgsi_scan_shader do not need to detect
missing properties if they zero-initialize the struct.
Signed-off-by: Bas Nieuwenhuizen
---
src/gallium/auxiliary/tgsi/tgsi_strings.c | 3 +++
Currently radeonsi synchronizes after every dispatch and Clover
does nothing to synchronize. This is overzealous, especially with
GL compute, so add a barrier for global buffers.
Signed-off-by: Bas Nieuwenhuizen
---
src/gallium/include/pipe/p_defines.h | 1
The XMesaVisual instances freed in the visuals table on display close
are being freed with a free() call, instead of XMesaDestroyVisual(),
causing a memory leak.
---
src/mesa/drivers/x11/fakeglx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/x11/fakeglx.c
Hi,
minor nitpicks wrt ordering below
Am 01.04.2016 um 02:04 schrieb Dylan Baker:
> Completely clean the imports:
> - Split so that one module is imported per line
> - Remove unused imports
> - Group stdlib imports, then 3rd party modules, and finally local
> modules
> - sort alphabetically
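The ordering rules in the list above (stdlib first, then 3rd party, then local, each group sorted alphabetically) can be sketched as a small helper. This is only an illustration: group_imports and every module name other than os and sys are hypothetical, the caller has to say which modules count as stdlib or third-party, and detecting unused imports is not modelled.

```python
def group_imports(modules, stdlib, third_party):
    """Return [stdlib, third-party, local] groups, each sorted by name."""
    unique = set(modules)  # drop repeated imports
    std = sorted(m for m in unique if m in stdlib)
    ext = sorted(m for m in unique if m in third_party)
    local = sorted(
        m for m in unique if m not in stdlib and m not in third_party
    )
    return [std, ext, local]


groups = group_imports(
    ["sys", "local_mod", "thirdparty_lib", "os", "sys"],
    stdlib={"os", "sys"},
    third_party={"thirdparty_lib"},
)

# One module per line, with a blank line between groups.
text = "\n\n".join(
    "\n".join("import " + m for m in group) for group in groups if group
)
```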
On Thu, Mar 31, 2016 at 2:59 AM, Samuel Iglesias Gonsálvez <
sigles...@igalia.com> wrote:
> Hello,
>
> This is the second version of this patch series [0].
>
> In case you prefer a repository, it is available here [1]:
>
> $ git clone -b nir-bit-size-fixes-2.0 https://github.com/Igalia/mesa.git
>
From: Rob Clark
This enables gallium support for EGL_ANDROID_native_fence_sync, for
drivers which support PIPE_CAP_NATIVE_FENCE_FD.
TODO: add PIPE_CAP_NATIVE_FENCE_FD to every driver's switch statement returning
false.. but I'll leave that until this patchset is ready to push
On Thu, Mar 31, 2016 at 3:00 AM, Samuel Iglesias Gonsálvez <
sigles...@igalia.com> wrote:
> From: Iago Toral Quiroga
>
> Found while testing UBO loads in scenarios like this:
>
> (assign (x) (var_ref vec_ctor)
> (expression float d2f
> (expression double ubo_load
On Fri, Mar 25, 2016 at 8:38 PM, Stéphane Marchesin wrote:
> On Wed, Mar 23, 2016 at 5:22 PM, Rob Herring wrote:
>> On Fri, Mar 4, 2016 at 12:07 PM, Rob Clark wrote:
>>> On Fri, Mar 4, 2016 at 12:59 PM, Rob Clark
From: Rob Clark
This will be needed for explicit synchronization with devices outside
the gpu, ie. EGL_ANDROID_native_fence_sync.
Signed-off-by: Rob Clark
---
src/gallium/include/pipe/p_context.h | 6 ++
From: Rob Clark
The current approach is kinda horrible for tilers, and that issue will
be even worse with EGL_ANDROID_native_fence_sync.
Not wired up yet for gl syncobj, which can come later. For now we just
need this with EGL.
Signed-off-by: Rob Clark
From: Rob Clark
Signed-off-by: Rob Clark
---
src/egl/drivers/dri2/egl_dri2.c | 48 +
src/egl/main/eglapi.c | 36 ---
src/egl/main/eglapi.h | 2 ++
From: Rob Clark
Required to implement EGL_ANDROID_native_fence_sync.
Signed-off-by: Rob Clark
---
include/GL/internal/dri_interface.h | 44 -
1 file changed, 43 insertions(+), 1 deletion(-)
diff --git
From: Rob Clark
Reduce the noise in the next patch. For EGL_SYNC_NATIVE_FENCE_ANDROID
the sync condition is conditional on EGL_SYNC_NATIVE_FENCE_FD_ANDROID
attribute.
Signed-off-by: Rob Clark
---
src/egl/main/eglsync.c | 10 +-
1
From: Rob Clark
This patchset implements support for EGL_ANDROID_native_fence_sync[1]
for egl and gallium. This extension provides support for native fence
fd's (file descriptors) for the GPU. In a similar way to dma-buf fd's,
which provide a reference-counted
Reviewed-by: Ilia Mirkin
On Fri, Apr 1, 2016 at 4:16 PM, Samuel Pitoiset
wrote:
> The grid size is stored as three 32-bit integers in the indirect
> buffer but the launch descriptor uses a 32-bit integer for both
> griddim_y and griddim_z like
The grid size is stored as three 32-bit integers in the indirect
buffer but the launch descriptor uses a 32-bit integer for both
griddim_y and griddim_z like this (z << 16) | y. To make it work,
the 16 high bits of griddim_y are overwritten by griddim_z.
Changes from v4:
- move
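A small sketch of the (z << 16) | y packing described in the message above. The helper names are hypothetical, and masking each dimension to 16 bits is an assumption made here so that the two halves cannot overlap; the actual driver does this when it fills in the launch descriptor, not in Python.

```python
MASK16 = 0xFFFF  # assumption: each of y and z fits in 16 bits


def pack_griddim_yz(y, z):
    # Low 16 bits carry y; the high 16 bits are overwritten with z.
    return ((z & MASK16) << 16) | (y & MASK16)


def unpack_griddim_yz(word):
    # Inverse of pack_griddim_yz: returns (y, z).
    return word & MASK16, (word >> 16) & MASK16


packed = pack_griddim_yz(3, 2)
```

Here a grid with y = 3 and z = 2 packs to 0x00020003, and unpacking that word gives back (3, 2).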
On 01/04/16 19:01, Brian Paul wrote:
---
src/gallium/drivers/svga/svga_shader.c | 1 -
src/gallium/drivers/svga/svga_shader.h | 1 -
2 files changed, 2 deletions(-)
diff --git a/src/gallium/drivers/svga/svga_shader.c
b/src/gallium/drivers/svga/svga_shader.c
index 78eb3f6..d56cce4 100644
Unless I'm missing something, this series doesn't contain anything that
uses this patch. Let's drop it for now and put it in with whatever adds
the actual nir_opt_algebraic changes.
Another option would be to silently bail if nir_search tries to create an
expression where the opcode has an
On Thu, Mar 31, 2016 at 3:00 AM, Samuel Iglesias Gonsálvez <
sigles...@igalia.com> wrote:
> From: Connor Abbott
>
> v2: Undo unintended change to the signature of
> nir_normalize_cubemap_coords (Iago).
>
> v3: Move to compiler/nir (Iago)
>
> v4: Remove Authors from
On Thu, Mar 31, 2016 at 2:59 AM, Samuel Iglesias Gonsálvez <
sigles...@igalia.com> wrote:
> From: Connor Abbott
>
> ---
> src/compiler/nir/nir.c | 5 +
> 1 file changed, 5 insertions(+)
>
> diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
> index
On Thu, Mar 31, 2016 at 2:59 AM, Samuel Iglesias Gonsálvez <
sigles...@igalia.com> wrote:
> From: Connor Abbott
>
> v2:
> - Squash the printing doubles related patches into one patch (Sam).
> ---
> src/compiler/nir/nir_print.c | 17 ++---
> 1 file changed,
On Thu, Mar 31, 2016 at 2:59 AM, Samuel Iglesias Gonsálvez <
sigles...@igalia.com> wrote:
> From: Iago Toral Quiroga
>
> ---
> src/compiler/nir/nir_lower_load_const_to_scalar.c | 7 +--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git
On Thu, Mar 31, 2016 at 2:59 AM, Samuel Iglesias Gonsálvez <
sigles...@igalia.com> wrote:
> Signed-off-by: Samuel Iglesias Gonsálvez
> ---
> src/compiler/nir/glsl_to_nir.cpp | 2 +-
> src/compiler/nir/nir.c | 6 --
> src/compiler/nir/nir.h
Quoting Ilia Mirkin (2016-04-01 11:47:49)
> On Fri, Apr 1, 2016 at 2:41 PM, Dylan Baker wrote:
> > Quoting Ilia Mirkin (2016-04-01 08:46:19)
> >
> > Something like this(?):
> >
> > temp = [(f.offset, f) for f in self.functions_by_name.itervalues()
> > if f.offset
On Fri, Apr 1, 2016 at 2:41 PM, Dylan Baker wrote:
> Quoting Ilia Mirkin (2016-04-01 08:46:19)
>> IMHO this is still quite fancy and unnecessarily complicated.
>>
>> How about
>>
>> temp = dict((f.offset, f) for f in self.functions_by_name.itervalues())
>> return
Quoting Ilia Mirkin (2016-04-01 08:46:19)
> IMHO this is still quite fancy and unnecessarily complicated.
>
> How about
>
> temp = dict((f.offset, f) for f in self.functions_by_name.itervalues())
> return (temp[offset] for offset in sorted(temp))
>
> That's all that's happening there,
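For reference, a self-contained version of the suggestion quoted above. The Function tuple and the sample entries are made up for the example, and it uses values() rather than itervalues() on the assumption that it runs under Python 3.

```python
from collections import namedtuple

# Hypothetical stand-in for the real function objects.
Function = namedtuple("Function", ["name", "offset"])


def functions_by_offset(functions_by_name):
    # Key every function by its offset, then walk the offsets in sorted order.
    temp = dict((f.offset, f) for f in functions_by_name.values())
    return (temp[offset] for offset in sorted(temp))


functions_by_name = {
    "Vertex3f": Function("Vertex3f", 7),
    "Begin": Function("Begin", 2),
    "End": Function("End", 5),
}

ordered_names = [f.name for f in functions_by_offset(functions_by_name)]
```

Note that if two functions shared an offset, the dict would keep only one of them; the sketch assumes offsets are unique.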
On Fri, Apr 1, 2016 at 11:24 AM, Rob Clark wrote:
> On Fri, Apr 1, 2016 at 2:04 PM, Jason Ekstrand
> wrote:
> >
> >
> > On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark wrote:
> >>
> >> From: Rob Clark
> >>
>
On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark wrote:
> From: Rob Clark
>
> Signed-off-by: Rob Clark
> ---
> src/mesa/Makefile.sources | 2 +
> src/mesa/state_tracker/st_nir.h | 28 +++
>
On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark wrote:
> From: Rob Clark
>
> We'll need this for a nir pass to lower builtin-uniform access.
>
> Signed-off-by: Rob Clark
> ---
> src/compiler/glsl/builtin_variables.cpp | 24
Rather than the currently bound texture. This goes along with the
earlier patch to get away from examining bound textures and sampler
views during shader translation.
Fixes VMware bug 1632739.
---
src/gallium/drivers/svga/svga_tgsi_vgpu10.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
src/gallium/auxiliary/tgsi/tgsi_util.h | 8
1 file changed, 8 insertions(+)
diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.h
b/src/gallium/auxiliary/tgsi/tgsi_util.h
index 3a049ee..ca07bfd 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_util.h
+++
---
src/gallium/drivers/svga/svga_shader.c | 1 -
src/gallium/drivers/svga/svga_shader.h | 1 -
2 files changed, 2 deletions(-)
diff --git a/src/gallium/drivers/svga/svga_shader.c
b/src/gallium/drivers/svga/svga_shader.c
index 78eb3f6..d56cce4 100644
--- a/src/gallium/drivers/svga/svga_shader.c
Assuming we get the previous patch sorted out,
Reviewed-by: Jason Ekstrand
On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark wrote:
> From: Rob Clark
>
> Signed-off-by: Rob Clark
> ---
>
Quoting Michael Schellenberger Costa (2016-04-01 01:30:53)
> Hi,
>
> minor nitpicks wrt ordering below
>
> Am 01.04.2016 um 02:04 schrieb Dylan Baker:
> > Completely clean the imports:
> > - Split so that one module is imported per line
> > - Remove unused imports
> > - Group stdlib imports,
I don't know that I like the lower-io prefix. Maybe nir/io-to-temp?
Doesn't really matter
On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark wrote:
> From: Rob Clark
>
> Prep work to reduce the noise in the next patch.
>
> Signed-off-by: Rob Clark
On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark wrote:
> From: Rob Clark
>
> Since it will gain support to lower inputs, give it a more generic name.
>
> Signed-off-by: Rob Clark
> ---
> src/compiler/Makefile.sources
Reviewed-by: Jason Ekstrand
On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark wrote:
> From: Rob Clark
>
> Going to convert this pass to parameterized lower_io_to_temporaries, and
> we want the user to be able to specify whether to
On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark wrote:
> From: Rob Clark
>
> A pass to lower complex (struct/array/mat) inputs/outputs to primitive
> types. This allows, for example, linking that removes unused components
> of a larger type which is
On 01/04/16 20:28, Ilia Mirkin wrote:
On Fri, Apr 1, 2016 at 1:26 PM, Martin Peres wrote:
On 01/04/16 19:56, Samuel Pitoiset wrote:
Compute support on GK110 is still unstable for weird reasons, but
this can be fixed later as the NVF0_COMPUTE envvar prevents using
compute.
On Fri, Apr 1, 2016 at 12:56 PM, Samuel Pitoiset
wrote:
> Compute support on GK110 is still unstable for weird reasons, but
> this can be fixed later as the NVF0_COMPUTE envvar prevents using
> compute.
>
> Note that GL3.txt is not updated yet because
On Fri, Apr 1, 2016 at 1:26 PM, Martin Peres wrote:
> On 01/04/16 19:56, Samuel Pitoiset wrote:
>>
>> Compute support on GK110 is still unstable for weird reasons, but
>> this can be fixed later as the NVF0_COMPUTE envvar prevents using
>> compute.
>>
>> Note that GL3.txt is
On 01/04/16 19:56, Samuel Pitoiset wrote:
Compute support on GK110 is still unstable for weird reasons, but
this can be fixed later as the NVF0_COMPUTE envvar prevents using
compute.
Note that GL3.txt is not updated yet because GL_ARB_compute_shader is
a bit useless without
Reviewed-by: Ilia Mirkin
On Fri, Apr 1, 2016 at 12:56 PM, Samuel Pitoiset
wrote:
> The maximum number of uniform blocks (MAX_COMPUTE_UNIFORM_BLOCKS)
> per compute program must be at least 12.
>
> Signed-off-by: Samuel Pitoiset
Reviewed-by: Ilia Mirkin
On Fri, Apr 1, 2016 at 12:56 PM, Samuel Pitoiset
wrote:
> For Maxwell, the ATOMS instruction can be used to perform atomic
> operations on shared memory instead of this load/store lowering pass.
>
> Changes from v2:
> -
On Fri, Apr 1, 2016 at 12:56 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 114
> -
> .../nouveau/codegen/nv50_ir_lowering_nvc0.h| 1 +
Reviewed-by: Ilia Mirkin
On Fri, Apr 1, 2016 at 12:56 PM, Samuel Pitoiset
wrote:
> This fixes 84b9b8f (nvc0/ir: add missing emission of locked load
> predicate).
>
> Signed-off-by: Samuel Pitoiset
> ---
>
On Fri, Apr 1, 2016 at 12:56 PM, Samuel Pitoiset
wrote:
> Make sure to avoid out of bounds access in presence of indirect
> array indexing by loading the size from the driver constant buffer.
>
> Changes from v2:
> - add a todo for clamping the offset to the max
On Fri, Apr 1, 2016 at 12:56 PM, Samuel Pitoiset
wrote:
> The grid size is stored as three 32-bit integers in the indirect
> buffer but the launch descriptor uses a 32-bit integer for both
> griddim_y and griddim_z like this (z << 16) | y. To make it work,
> the 16
Reviewed-by: Ilia Mirkin
On Fri, Apr 1, 2016 at 12:56 PM, Samuel Pitoiset
wrote:
> Reduce likelihood of collision with real buffers by placing the
> hole at the top of the 4G area. This fixes some indirect draw+compute
> tests with large buffers.
Reviewed-by: Ilia Mirkin
On Fri, Apr 1, 2016 at 12:56 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> .../drivers/nouveau/codegen/nv50_ir_driver.h | 1 +
>
On Fri, Apr 1, 2016 at 12:55 PM, Samuel Pitoiset
wrote:
> Uniform buffer objects will be stuck to the driver constant buffer
> like buffers because the launch descriptor only allows 8 CBs.
>
> Input kernel parameters for OpenCL are still uploaded to screen->parm
>
Reviewed-by: Ilia Mirkin
On Fri, Apr 1, 2016 at 12:55 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 4 +--
>
On Fri, Apr 1, 2016 at 12:55 PM, Samuel Pitoiset
wrote:
> Instead of using the screen->parm buffer object which will be removed,
> upload auxiliary constants to uniform_bo to be consistent regarding
> what we already do for Fermi.
>
> This breaks surfaces support (for