Re: [Mesa-dev] [PATCH] anv: destroy descriptor sets when pool gets reset
Makes sense, sorry I missed this one;

Reviewed-by: Tapani Pälli

On 3/11/19 7:33 PM, Juan A. Suarez Romero wrote:
> As stated in Vulkan spec:
>    "Resetting a descriptor pool recycles all of the resources from all
>     of the descriptor sets allocated from the descriptor pool back to
>     the descriptor pool, and the descriptor sets are implicitly freed."
>
> This fixes dEQP-VK.api.descriptor_pool.*
>
> Fixes: 14f6275c92f1 ("anv/descriptor_set: add reference counting for descriptor set layouts")
> CC: Tapani Pälli
> CC: Lionel Landwerlin
> CC: Jason Ekstrand
> ---
>  src/intel/vulkan/anv_descriptor_set.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/intel/vulkan/anv_descriptor_set.c b/src/intel/vulkan/anv_descriptor_set.c
> index f293cf469ee..f34a44aefd7 100644
> --- a/src/intel/vulkan/anv_descriptor_set.c
> +++ b/src/intel/vulkan/anv_descriptor_set.c
> @@ -636,6 +636,12 @@ VkResult anv_ResetDescriptorPool(
>     }
>
>     anv_state_stream_finish(&pool->surface_state_stream);
> +
> +   list_for_each_entry_safe(struct anv_descriptor_set, set,
> +                            &pool->desc_sets, pool_link) {
> +      anv_descriptor_set_destroy(device, pool, set);
> +   }
> +
>     anv_state_stream_init(&pool->surface_state_stream,
>                           &device->surface_state_pool, 4096);
>     pool->surface_state_free_list = NULL;

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
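For readers following the patch above, here is a minimal sketch of the idea that resetting a pool implicitly frees every set allocated from it. It is not the anv code: `demo_pool`, `demo_set` and the helper functions are hypothetical stand-ins, and a plain singly linked list replaces the driver's list macros. The point it illustrates is reading the next link before destroying the current node, which is what a "safe" list iteration provides.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's pool and set objects. */
struct demo_set {
   struct demo_set *next;   /* link in the pool's list of live sets */
   int id;
};

struct demo_pool {
   struct demo_set *sets;   /* head of the list of live sets */
   int live_sets;           /* number of sets currently allocated */
};

static struct demo_set *
demo_pool_alloc_set(struct demo_pool *pool, int id)
{
   struct demo_set *set = malloc(sizeof(*set));
   if (set == NULL)
      return NULL;
   set->id = id;
   set->next = pool->sets;
   pool->sets = set;
   pool->live_sets++;
   return set;
}

/* Reset frees every set still owned by the pool. The next pointer is
 * saved before the node is freed, so the walk never touches freed memory.
 */
static void
demo_pool_reset(struct demo_pool *pool)
{
   struct demo_set *set = pool->sets;
   while (set != NULL) {
      struct demo_set *next = set->next;
      free(set);
      pool->live_sets--;
      set = next;
   }
   pool->sets = NULL;
   assert(pool->live_sets == 0);
}
```

After `demo_pool_reset()` the pool is empty and can be reused, which mirrors the behaviour the quoted spec text asks for.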
Re: [Mesa-dev] [PATCH 2/2] nir: Remove unused variable unroll_loc
> The first patch was fine. Just not the second.

First patch is a duplicate of the linked patch.
Re: [Mesa-dev] [PATCH 2/2] nir: Remove unused variable unroll_loc
On March 11, 2019 22:17:38 Alyssa Rosenzweig wrote:

>> [1] https://patchwork.freedesktop.org/patch/291616/
>
> Ah-ha, somebody who knows what they're doing. That's good; ignore this
> series then :)

The first patch was fine. Just not the second.

> ---
>
> So is this a "no" for the deleting ~/mesa/src idea?
Re: [Mesa-dev] [PATCH 2/2] nir: Remove unused variable unroll_loc
> [1] https://patchwork.freedesktop.org/patch/291616/

Ah-ha, somebody who knows what they're doing. That's good; ignore this
series then :)

---

So is this a "no" for the deleting ~/mesa/src idea?
Re: [Mesa-dev] [PATCH 2/2] nir: Remove unused variable unroll_loc
On 12/3/19 2:09 pm, Alyssa Rosenzweig wrote:
>> A better fix might be to delete the Mesa src tree, that should fix up any
>> warnings :P
>
> Huh, so it does! No regressions on dEQP, possibly since all the tests
> were failing to begin with on Panfrost.
>
>> I can only assume this was sent out by mistake?
>
> This was sent out by "I don't know what I'm doing but there was a
> warning so I thought I'd nudge someone who knows".

:)

> I guess really the RHS should still be evaluated and just not stored
> anywhere?

The value is used, but only in debug builds. The correct fix is to use
MAYBE_UNUSED; see [1].

[1] https://patchwork.freedesktop.org/patch/291616/
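To make the MAYBE_UNUSED suggestion above concrete, here is a small sketch under stated assumptions. The macro is defined locally as a stand-in (the GCC/Clang `unused` attribute is an assumption about how such a macro is typically spelled), and `clone_body` and `unroll_demo` are hypothetical names. It shows a call whose side effect is always needed while its result is only checked by an `assert`, so the variable looks unused once `NDEBUG` removes the check.

```c
#include <assert.h>

/* Local stand-in for the MAYBE_UNUSED macro named in the thread; this
 * definition is an assumption for GCC/Clang-style compilers. */
#if defined(__GNUC__)
#define MAYBE_UNUSED __attribute__((unused))
#else
#define MAYBE_UNUSED
#endif

static int clone_calls = 0;

/* Hypothetical function with a side effect and a return value. */
static int
clone_body(int times)
{
   clone_calls++;
   return times;
}

static int
unroll_demo(int times)
{
   /* The call always runs. Only the check on its result disappears when
    * NDEBUG is defined, which is when an unused-variable warning would
    * otherwise appear. */
   MAYBE_UNUSED int cloned = clone_body(times);
   assert(cloned == times);
   return clone_calls;
}
```

Deleting the declaration together with the call, as the 2/2 patch did, would also drop the side effect, which is why the annotation is the safer change here.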
Re: [Mesa-dev] [PATCH 2/2] nir: Remove unused variable unroll_loc
> A better fix might be to delete the Mesa src tree, that should fix up any
> warnings :P

Huh, so it does! No regressions on dEQP, possibly since all the tests
were failing to begin with on Panfrost.

> I can only assume this was sent out by mistake?

This was sent out by "I don't know what I'm doing but there was a
warning so I thought I'd nudge someone who knows".

I guess really the RHS should still be evaluated and just not stored
anywhere?
Re: [Mesa-dev] [PATCH] nir: silence a couple new compiler warnings
On 12/3/19 1:50 pm, Brian Paul wrote:
> Thanks, but it looks like Alyssa's patches may be better.
>
> -Brian

The second patch looks really wrong. Seems like a mistake to me.

> On 03/11/2019 08:33 PM, Timothy Arceri wrote:
>> Thanks
>>
>> Reviewed-by: Timothy Arceri
>>
>> On 12/3/19 1:12 pm, Brian Paul wrote:
>>> [33/630] Compiling C object 'src/compiler/nir/nir@sta/nir_loop_analyze.c.o'.
>>> ../src/compiler/nir/nir_loop_analyze.c: In function ‘try_find_trip_count_vars_in_iand’:
>>> ../src/compiler/nir/nir_loop_analyze.c:846:29: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
>>>     if (*ind == NULL || *ind && (*ind)->type != basic_induction ||
>>>                              ^
>>> [85/630] Compiling C object 'src/compiler/nir/nir@sta/nir_opt_loop_unroll.c.o'.
>>> ../src/compiler/nir/nir_opt_loop_unroll.c: In function ‘complex_unroll_single_terminator’:
>>> ../src/compiler/nir/nir_opt_loop_unroll.c:494:17: warning: unused variable ‘unroll_loc’ [-Wunused-variable]
>>>     nir_cf_node *unroll_loc =
>>>                  ^
>>> ---
>>>  src/compiler/nir/nir_loop_analyze.c    | 2 +-
>>>  src/compiler/nir/nir_opt_loop_unroll.c | 2 +-
>>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/compiler/nir/nir_loop_analyze.c b/src/compiler/nir/nir_loop_analyze.c
>>> index bc116f4..c830461 100644
>>> --- a/src/compiler/nir/nir_loop_analyze.c
>>> +++ b/src/compiler/nir/nir_loop_analyze.c
>>> @@ -843,7 +843,7 @@ try_find_trip_count_vars_in_iand(nir_alu_instr **alu,
>>>     }
>>>
>>>     /* Try the other iand src if needed */
>>> -   if (*ind == NULL || *ind && (*ind)->type != basic_induction ||
>>> +   if (*ind == NULL || (*ind && (*ind)->type != basic_induction) ||
>>>         !is_var_constant(*limit)) {
>>>        src = iand->src[1].src.ssa;
>>>        if (src->parent_instr->type == nir_instr_type_alu) {
>>> diff --git a/src/compiler/nir/nir_opt_loop_unroll.c b/src/compiler/nir/nir_opt_loop_unroll.c
>>> index 9ab0a92..06ec78b 100644
>>> --- a/src/compiler/nir/nir_opt_loop_unroll.c
>>> +++ b/src/compiler/nir/nir_opt_loop_unroll.c
>>> @@ -491,7 +491,7 @@ complex_unroll_single_terminator(nir_loop *loop)
>>>     unsigned num_times_to_clone = loop->info->max_trip_count + 1;
>>>
>>>     nir_cf_list lp_body;
>>> -   nir_cf_node *unroll_loc =
>>> +   MAYBE_UNUSED nir_cf_node *unroll_loc =
>>>        complex_unroll_loop_body(loop, terminator, &lp_header, &lp_body,
>>>                                 remap_table, num_times_to_clone);
Re: [Mesa-dev] [PATCH 2/2] nir: Remove unused variable unroll_loc
On 12/3/19 1:30 pm, Alyssa Rosenzweig wrote:
> Fixes a gcc warning.
>
> Signed-off-by: Alyssa Rosenzweig
> ---
>  src/compiler/nir/nir_opt_loop_unroll.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/src/compiler/nir/nir_opt_loop_unroll.c b/src/compiler/nir/nir_opt_loop_unroll.c
> index 9ab0a924c82..41f7a834164 100644
> --- a/src/compiler/nir/nir_opt_loop_unroll.c
> +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> @@ -488,12 +488,8 @@ complex_unroll_single_terminator(nir_loop *loop)
>      * vars for the last iteration (they are inside the following ifs break
>      * branch). We leave other passes to clean up this redundant if.
>      */
> -   unsigned num_times_to_clone = loop->info->max_trip_count + 1;
>
>     nir_cf_list lp_body;
> -   nir_cf_node *unroll_loc =
> -      complex_unroll_loop_body(loop, terminator, &lp_header, &lp_body,
> -                               remap_table, num_times_to_clone);

A better fix might be to delete the Mesa src tree, that should fix up any
warnings :P

I can only assume this was sent out by mistake?

>
>     /* Delete the original loop header and body */
>     nir_cf_delete(&lp_header);
Re: [Mesa-dev] [PATCH] nir: silence a couple new compiler warnings
Thanks, but it looks like Alyssa's patches may be better.

-Brian

On 03/11/2019 08:33 PM, Timothy Arceri wrote:
> Thanks
>
> Reviewed-by: Timothy Arceri
>
> On 12/3/19 1:12 pm, Brian Paul wrote:
>> [33/630] Compiling C object 'src/compiler/nir/nir@sta/nir_loop_analyze.c.o'.
>> ../src/compiler/nir/nir_loop_analyze.c: In function ‘try_find_trip_count_vars_in_iand’:
>> ../src/compiler/nir/nir_loop_analyze.c:846:29: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
>>     if (*ind == NULL || *ind && (*ind)->type != basic_induction ||
>>                              ^
>> [85/630] Compiling C object 'src/compiler/nir/nir@sta/nir_opt_loop_unroll.c.o'.
>> ../src/compiler/nir/nir_opt_loop_unroll.c: In function ‘complex_unroll_single_terminator’:
>> ../src/compiler/nir/nir_opt_loop_unroll.c:494:17: warning: unused variable ‘unroll_loc’ [-Wunused-variable]
>>     nir_cf_node *unroll_loc =
>>                  ^
>> ---
>>  src/compiler/nir/nir_loop_analyze.c    | 2 +-
>>  src/compiler/nir/nir_opt_loop_unroll.c | 2 +-
>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/compiler/nir/nir_loop_analyze.c b/src/compiler/nir/nir_loop_analyze.c
>> index bc116f4..c830461 100644
>> --- a/src/compiler/nir/nir_loop_analyze.c
>> +++ b/src/compiler/nir/nir_loop_analyze.c
>> @@ -843,7 +843,7 @@ try_find_trip_count_vars_in_iand(nir_alu_instr **alu,
>>     }
>>
>>     /* Try the other iand src if needed */
>> -   if (*ind == NULL || *ind && (*ind)->type != basic_induction ||
>> +   if (*ind == NULL || (*ind && (*ind)->type != basic_induction) ||
>>         !is_var_constant(*limit)) {
>>        src = iand->src[1].src.ssa;
>>        if (src->parent_instr->type == nir_instr_type_alu) {
>> diff --git a/src/compiler/nir/nir_opt_loop_unroll.c b/src/compiler/nir/nir_opt_loop_unroll.c
>> index 9ab0a92..06ec78b 100644
>> --- a/src/compiler/nir/nir_opt_loop_unroll.c
>> +++ b/src/compiler/nir/nir_opt_loop_unroll.c
>> @@ -491,7 +491,7 @@ complex_unroll_single_terminator(nir_loop *loop)
>>     unsigned num_times_to_clone = loop->info->max_trip_count + 1;
>>
>>     nir_cf_list lp_body;
>> -   nir_cf_node *unroll_loc =
>> +   MAYBE_UNUSED nir_cf_node *unroll_loc =
>>        complex_unroll_loop_body(loop, terminator, &lp_header, &lp_body,
>>                                 remap_table, num_times_to_clone);
Re: [Mesa-dev] [PATCH 1/2] nir: Add missing parentheses
This patch is

Reviewed-by: Jason Ekstrand

On March 11, 2019 21:31:03 Alyssa Rosenzweig wrote:
> Fixes a gcc warning (and a theoretical NULL dereference error, though I
> suppose shortcircuiting avoids that).
>
> Signed-off-by: Alyssa Rosenzweig
> ---
>  src/compiler/nir/nir_loop_analyze.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/nir/nir_loop_analyze.c b/src/compiler/nir/nir_loop_analyze.c
> index bc116f4d1d7..c8304611b28 100644
> --- a/src/compiler/nir/nir_loop_analyze.c
> +++ b/src/compiler/nir/nir_loop_analyze.c
> @@ -843,7 +843,7 @@ try_find_trip_count_vars_in_iand(nir_alu_instr **alu,
>     }
>
>     /* Try the other iand src if needed */
> -   if (*ind == NULL || *ind && (*ind)->type != basic_induction ||
> +   if (*ind == NULL || (*ind && (*ind)->type != basic_induction) ||
>         !is_var_constant(*limit)) {
>        src = iand->src[1].src.ssa;
>        if (src->parent_instr->type == nir_instr_type_alu) {
> --
> 2.20.1
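As a side note on the commit message above, the following sketch checks that the added parentheses do not change behaviour, since `&&` already binds tighter than `||` in C, and that the leading NULL test short-circuits before any dereference. All names here (`demo_var`, `needs_other_src`, and so on) are hypothetical, and the pragma is only there to keep the deliberately unparenthesized form from warning on GCC-compatible compilers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_type { demo_basic_induction, demo_other };
struct demo_var { enum demo_type type; };

/* The unparenthesized form, kept only for comparison. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wparentheses"
static bool without_parens(bool a, bool b, bool c) { return a || b && c; }
#pragma GCC diagnostic pop

static bool with_parens(bool a, bool b, bool c) { return a || (b && c); }

/* Walk all eight input combinations and count disagreements. */
static int
count_mismatches(void)
{
   int mismatches = 0;
   for (int bits = 0; bits < 8; bits++) {
      bool a = bits & 1, b = bits & 2, c = bits & 4;
      if (without_parens(a, b, c) != with_parens(a, b, c))
         mismatches++;
   }
   return mismatches;
}

/* Same shape as the patched condition; limit_is_constant stands in for
 * the is_var_constant() call. */
static bool
needs_other_src(const struct demo_var *ind, bool limit_is_constant)
{
   return ind == NULL ||
          (ind && ind->type != demo_basic_induction) ||
          !limit_is_constant;
}

static void
demo_checks(void)
{
   struct demo_var basic = { demo_basic_induction };

   assert(count_mismatches() == 0);
   assert(needs_other_src(NULL, true));    /* NULL is never dereferenced */
   assert(!needs_other_src(&basic, true));
}
```

So the patch is about silencing the warning and making the grouping explicit for readers, not about changing what the condition computes.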
Re: [Mesa-dev] [PATCH] nir: silence a couple new compiler warnings
Thanks

Reviewed-by: Timothy Arceri

On 12/3/19 1:12 pm, Brian Paul wrote:
> [33/630] Compiling C object 'src/compiler/nir/nir@sta/nir_loop_analyze.c.o'.
> ../src/compiler/nir/nir_loop_analyze.c: In function ‘try_find_trip_count_vars_in_iand’:
> ../src/compiler/nir/nir_loop_analyze.c:846:29: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
>     if (*ind == NULL || *ind && (*ind)->type != basic_induction ||
>                              ^
> [85/630] Compiling C object 'src/compiler/nir/nir@sta/nir_opt_loop_unroll.c.o'.
> ../src/compiler/nir/nir_opt_loop_unroll.c: In function ‘complex_unroll_single_terminator’:
> ../src/compiler/nir/nir_opt_loop_unroll.c:494:17: warning: unused variable ‘unroll_loc’ [-Wunused-variable]
>     nir_cf_node *unroll_loc =
>                  ^
> ---
>  src/compiler/nir/nir_loop_analyze.c    | 2 +-
>  src/compiler/nir/nir_opt_loop_unroll.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/nir/nir_loop_analyze.c b/src/compiler/nir/nir_loop_analyze.c
> index bc116f4..c830461 100644
> --- a/src/compiler/nir/nir_loop_analyze.c
> +++ b/src/compiler/nir/nir_loop_analyze.c
> @@ -843,7 +843,7 @@ try_find_trip_count_vars_in_iand(nir_alu_instr **alu,
>     }
>
>     /* Try the other iand src if needed */
> -   if (*ind == NULL || *ind && (*ind)->type != basic_induction ||
> +   if (*ind == NULL || (*ind && (*ind)->type != basic_induction) ||
>         !is_var_constant(*limit)) {
>        src = iand->src[1].src.ssa;
>        if (src->parent_instr->type == nir_instr_type_alu) {
> diff --git a/src/compiler/nir/nir_opt_loop_unroll.c b/src/compiler/nir/nir_opt_loop_unroll.c
> index 9ab0a92..06ec78b 100644
> --- a/src/compiler/nir/nir_opt_loop_unroll.c
> +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> @@ -491,7 +491,7 @@ complex_unroll_single_terminator(nir_loop *loop)
>     unsigned num_times_to_clone = loop->info->max_trip_count + 1;
>
>     nir_cf_list lp_body;
> -   nir_cf_node *unroll_loc =
> +   MAYBE_UNUSED nir_cf_node *unroll_loc =
>        complex_unroll_loop_body(loop, terminator, &lp_header, &lp_body,
>                                 remap_table, num_times_to_clone);
[Mesa-dev] [PATCH 2/2] nir: Remove unused variable unroll_loc
Fixes a gcc warning.

Signed-off-by: Alyssa Rosenzweig
---
 src/compiler/nir/nir_opt_loop_unroll.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/src/compiler/nir/nir_opt_loop_unroll.c b/src/compiler/nir/nir_opt_loop_unroll.c
index 9ab0a924c82..41f7a834164 100644
--- a/src/compiler/nir/nir_opt_loop_unroll.c
+++ b/src/compiler/nir/nir_opt_loop_unroll.c
@@ -488,12 +488,8 @@ complex_unroll_single_terminator(nir_loop *loop)
     * vars for the last iteration (they are inside the following ifs break
     * branch). We leave other passes to clean up this redundant if.
     */
-   unsigned num_times_to_clone = loop->info->max_trip_count + 1;
 
    nir_cf_list lp_body;
-   nir_cf_node *unroll_loc =
-      complex_unroll_loop_body(loop, terminator, &lp_header, &lp_body,
-                               remap_table, num_times_to_clone);
 
    /* Delete the original loop header and body */
    nir_cf_delete(&lp_header);
-- 
2.20.1
[Mesa-dev] [PATCH 1/2] nir: Add missing parentheses
Fixes a gcc warning (and a theoretical NULL dereference error, though I
suppose shortcircuiting avoids that).

Signed-off-by: Alyssa Rosenzweig
---
 src/compiler/nir/nir_loop_analyze.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_loop_analyze.c b/src/compiler/nir/nir_loop_analyze.c
index bc116f4d1d7..c8304611b28 100644
--- a/src/compiler/nir/nir_loop_analyze.c
+++ b/src/compiler/nir/nir_loop_analyze.c
@@ -843,7 +843,7 @@ try_find_trip_count_vars_in_iand(nir_alu_instr **alu,
    }
 
    /* Try the other iand src if needed */
-   if (*ind == NULL || *ind && (*ind)->type != basic_induction ||
+   if (*ind == NULL || (*ind && (*ind)->type != basic_induction) ||
        !is_var_constant(*limit)) {
       src = iand->src[1].src.ssa;
       if (src->parent_instr->type == nir_instr_type_alu) {
-- 
2.20.1
[Mesa-dev] [PATCH] nir: silence a couple new compiler warnings
[33/630] Compiling C object 'src/compiler/nir/nir@sta/nir_loop_analyze.c.o'.
../src/compiler/nir/nir_loop_analyze.c: In function ‘try_find_trip_count_vars_in_iand’:
../src/compiler/nir/nir_loop_analyze.c:846:29: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
    if (*ind == NULL || *ind && (*ind)->type != basic_induction ||
                             ^
[85/630] Compiling C object 'src/compiler/nir/nir@sta/nir_opt_loop_unroll.c.o'.
../src/compiler/nir/nir_opt_loop_unroll.c: In function ‘complex_unroll_single_terminator’:
../src/compiler/nir/nir_opt_loop_unroll.c:494:17: warning: unused variable ‘unroll_loc’ [-Wunused-variable]
    nir_cf_node *unroll_loc =
                 ^
---
 src/compiler/nir/nir_loop_analyze.c    | 2 +-
 src/compiler/nir/nir_opt_loop_unroll.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_loop_analyze.c b/src/compiler/nir/nir_loop_analyze.c
index bc116f4..c830461 100644
--- a/src/compiler/nir/nir_loop_analyze.c
+++ b/src/compiler/nir/nir_loop_analyze.c
@@ -843,7 +843,7 @@ try_find_trip_count_vars_in_iand(nir_alu_instr **alu,
    }
 
    /* Try the other iand src if needed */
-   if (*ind == NULL || *ind && (*ind)->type != basic_induction ||
+   if (*ind == NULL || (*ind && (*ind)->type != basic_induction) ||
        !is_var_constant(*limit)) {
       src = iand->src[1].src.ssa;
       if (src->parent_instr->type == nir_instr_type_alu) {
diff --git a/src/compiler/nir/nir_opt_loop_unroll.c b/src/compiler/nir/nir_opt_loop_unroll.c
index 9ab0a92..06ec78b 100644
--- a/src/compiler/nir/nir_opt_loop_unroll.c
+++ b/src/compiler/nir/nir_opt_loop_unroll.c
@@ -491,7 +491,7 @@ complex_unroll_single_terminator(nir_loop *loop)
    unsigned num_times_to_clone = loop->info->max_trip_count + 1;
 
    nir_cf_list lp_body;
-   nir_cf_node *unroll_loc =
+   MAYBE_UNUSED nir_cf_node *unroll_loc =
       complex_unroll_loop_body(loop, terminator, &lp_header, &lp_body,
                                remap_table, num_times_to_clone);
 
-- 
1.8.5.6
[Mesa-dev] [PATCH] docs: link to the meson_options.txt file gitlab.freedesktop.org
---
 docs/meson.html | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/meson.html b/docs/meson.html
index 09b45be..08e4d1a 100644
--- a/docs/meson.html
+++ b/docs/meson.html
@@ -103,7 +103,8 @@ running "meson build/" but this feature is being discussed upstream.
 For now, we have a bin/meson-options.py script that prints the options for you.
 If that script doesn't work for some reason, you can always look in the
-meson_options.txt file at the root of the project.
+<a href="https://gitlab.freedesktop.org/mesa/mesa/blob/master/meson_options.txt">
+meson_options.txt</a> file at the root of the project.

-- 
1.8.5.6
[Mesa-dev] [PATCH] docs: separate information for compiler selection and compiler options
Split up the "Environment Variables" section into "Compiler Options" and
"Compiler Specification". I think this makes the information easier to
find and understand.
---
 docs/meson.html | 58 ++---
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/docs/meson.html b/docs/meson.html
index 7ffef81..09b45be 100644
--- a/docs/meson.html
+++ b/docs/meson.html
@@ -169,47 +169,67 @@ Developers will often want to install Mesa to a testing directory rather than
 the system library directory. This can be done with the --prefix option. For
 example:
-
+
 meson --prefix="${PWD}/build/install" build/
-
+
 will put the final libraries and drivers into the build/install/ directory.
 Then you can set LD_LIBRARY_PATH and LIBGL_DRIVERS_PATH to that location to
 run/test the driver.
+
+Meson also honors DESTDIR for installs.
+
-Environment Variables
-Meson supports the standard CC and CXX environment variables for
-changing the default compiler. Meson does support CFLAGS, CXXFLAGS, etc. But
-their use is discouraged because of the many caveats in using them. Instead it
-is recomended to use -D${lang}_args and
--D${lang}_link_args instead. Among the benefits of these options
+Compiler Options
+
+Meson supports the common CFLAGS, CXXFLAGS, etc. environment
+variables but their use is discouraged because of the many caveats
+in using them.
+
+Instead, it is recomended to use -D${lang}_args and
+-D${lang}_link_args. Among the benefits of these options
 is that they are guaranteed to persist across rebuilds and reconfigurations.
+
+This example sets -fmax-errors for compiling C sources and -DMAGIC=123
+for C++ sources:
+
+
+
+meson builddir/ -Dc_args=-fmax-errors=10 -Dcpp_args=-DMAGIC=123
+
+
+
+
-Meson does not allow changing compiler in a configured builddir, you will need
+Compiler Specification
+
+
+Meson supports the standard CC and CXX environment variables for
+changing the default compiler. Note that Meson does not allow
+changing the compilers in a configured builddir so you will need
 to create a new build dir for a different compiler.
-
+
+This is an example of specifying the clang compilers and cleaning
+the build directory before reconfiguring with an extra C option:
+
-CC=clang CXX=clang++ meson build-clang
-ninja -C build-clang
-ninja -C build-clang clean
-meson configure build -Dc_args="-Wno-typedef-redefinition"
-ninja -C build-clang
+CC=clang CXX=clang++ meson build-clang
+ninja -C build-clang
+ninja -C build-clang clean
+meson configure build -Dc_args="-Wno-typedef-redefinition"
+ninja -C build-clang
-
 The default compilers depends on your operating system. Meson supports most of
 the popular compilers, a complete list is available
 <a href="http://mesonbuild.com/Reference-tables.html#compiler-ids">here</a>.
-
-Meson also honors DESTDIR for installs
-
 LLVM
 Meson includes upstream logic to wrap llvm-config using its standard
 dependency interface.
-- 
1.8.5.6
Re: [Mesa-dev] [PATCH] anv: destroy descriptor sets when pool gets reset
pushed.

On Mon, Mar 11, 2019 at 3:15 PM Clayton Craft wrote:
> On Mon, Mar 11, 2019 at 06:33:54PM +0100, Juan A. Suarez Romero wrote:
> >As stated in Vulkan spec:
> >   "Resetting a descriptor pool recycles all of the resources from all
> >    of the descriptor sets allocated from the descriptor pool back to
> >    the descriptor pool, and the descriptor sets are implicitly freed."
> >
> >This fixes dEQP-VK.api.descriptor_pool.*
> >
> >Fixes: 14f6275c92f1 ("anv/descriptor_set: add reference counting for descriptor set layouts")
>
> I ran this through CI and these tests are no longer failing. I didn't see any
> regressions either.
>
> >CC: Tapani Pälli
> >CC: Lionel Landwerlin
> >CC: Jason Ekstrand
> >---
> > src/intel/vulkan/anv_descriptor_set.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> >diff --git a/src/intel/vulkan/anv_descriptor_set.c b/src/intel/vulkan/anv_descriptor_set.c
> >index f293cf469ee..f34a44aefd7 100644
> >--- a/src/intel/vulkan/anv_descriptor_set.c
> >+++ b/src/intel/vulkan/anv_descriptor_set.c
> >@@ -636,6 +636,12 @@ VkResult anv_ResetDescriptorPool(
> >    }
> >
> >    anv_state_stream_finish(&pool->surface_state_stream);
> >+
> >+   list_for_each_entry_safe(struct anv_descriptor_set, set,
> >+                            &pool->desc_sets, pool_link) {
> >+      anv_descriptor_set_destroy(device, pool, set);
> >+   }
> >+
> >    anv_state_stream_init(&pool->surface_state_stream,
> >                          &device->surface_state_pool, 4096);
> >    pool->surface_state_free_list = NULL;
> >--
> >2.20.1
Re: [Mesa-dev] [PATCH] st/mesa: init hash keys with memset(), not designated initializers
On 03/11/2019 04:47 PM, Eric Anholt wrote:
> Brian Paul writes:
>
>> Since the compiler may not zero-out padding in the object.
>> Add a couple comments about this to prevent misunderstandings in
>> the future.
>>
>> Fixes: 67d96816ff5 ("st/mesa: move, clean-up shader variant key decls/inits")
>> ---
>>  src/mesa/state_tracker/st_atom_shader.c | 9 +++--
>>  src/mesa/state_tracker/st_program.c | 13 ++---
>>  2 files changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/state_tracker/st_atom_shader.c b/src/mesa/state_tracker/st_atom_shader.c
>> index ac7a1a5..a4475e2 100644
>> --- a/src/mesa/state_tracker/st_atom_shader.c
>> +++ b/src/mesa/state_tracker/st_atom_shader.c
>> @@ -112,7 +112,10 @@ st_update_fp( struct st_context *st )
>>         !stfp->variants->key.bitmap) {
>>        shader = stfp->variants->driver_shader;
>>     } else {
>> -      struct st_fp_variant_key key = {0};
>> +      struct st_fp_variant_key key;
>> +
>> +      /* use memset, not an initializer to be sure all memory is zeroed */
>> +      memset(&key, 0, sizeof(key));
>
> Wait, what? We rely on this form of initialization all over, what's
> changed?

The question is do all compilers, when presented with

   struct st_fp_variant_key key = {0};

initialize the entire object to zero, or just the individual fields
(skipping padding). This matters for hash keys but shouldn't matter
otherwise.

-Brian
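To illustrate why padding matters for hash keys, as discussed above, here is a minimal sketch. The struct, the helper names and the choice of FNV-1a are all hypothetical and are not taken from the state tracker. It assumes the common compiler behaviour that storing to a member leaves padding bytes alone once `memset()` has cleared them; the C standard does not strictly promise that.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key; most ABIs insert padding bytes after `flag`. */
struct demo_key {
   uint8_t flag;
   uint32_t id;
};

static void
demo_key_init(struct demo_key *key, uint8_t flag, uint32_t id)
{
   memset(key, 0, sizeof(*key));   /* zero every byte, padding included */
   key->flag = flag;
   key->id = id;
}

/* FNV-1a over the raw bytes of the key, as a byte-wise hash would do. */
static uint32_t
demo_key_hash(const struct demo_key *key)
{
   const unsigned char *bytes = (const unsigned char *)key;
   uint32_t hash = 2166136261u;
   for (size_t i = 0; i < sizeof(*key); i++) {
      hash ^= bytes[i];
      hash *= 16777619u;
   }
   return hash;
}
```

Two keys built through `demo_key_init()` with the same field values should then compare equal with `memcmp()` and hash identically, whatever the memory held before; with a field-wise initializer that outcome depends on what the compiler does with the padding.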
[Mesa-dev] [Bug 109641] GLX swrast driver leaks shared memory segments
https://bugs.freedesktop.org/show_bug.cgi?id=109641

--- Comment #1 from Brian Paul ---
I believe this recent patch should help:

commit b344e32cdf7064a1f2ff7ef37027edda6589404f
Author: Ray Zhang
Date:   Wed Feb 27 06:54:05 2019 +

    glx: fix shared memory leak in X11

    call XShmDetach to allow X server to free shared memory

    Fixes: bcd80be49a8260c2233d "drisw/glx: use XShm if possible"
    Signed-off-by: Ray Zhang
    Reviewed-by: Dave Airlie

Can you try top-of-tree Mesa?
Re: [Mesa-dev] [PATCH 33/34] nv50/ir/nir: handle user clip planes for each emitted vertex
On Mon, Mar 11, 2019 at 8:18 PM Karol Herbst wrote:
>
> On Tue, Mar 12, 2019 at 1:09 AM Ilia Mirkin wrote:
> >
> > On Mon, Mar 11, 2019 at 8:05 PM Karol Herbst wrote:
> > >
> > > v9: convert to C++ style comments
> > > Signed-off-by: Karol Herbst
> > > ---
> > >  src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> > > index 627848a457f..fdc6eaf759a 100644
> > > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> > > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> > > @@ -1561,7 +1561,7 @@ Converter::visit(nir_function *function)
> > >     bb->cfg.attach(&exit->cfg, Graph::Edge::TREE);
> > >     setPosition(exit, true);
> > >
> > > -   if (info->io.genUserClip > 0)
> > > +   if (prog->getType() == Program::TYPE_VERTEX && info->io.genUserClip > 0)
> >
> > What about TES? Did you mean && !TYPE_GEOMETRY perhaps?
> >
>
> yeah, that's missing. Thanks for pointing it out! Apparently we have
> no piglit test testing that.

Looks like we have them for geom shaders
(tests/spec/glsl-1.50/execution/compatibility/clipping) but not for
tess (the other compat tests are in
tests/spec/arb_tessellation_shader/execution/compatibility). Shouldn't
be difficult to add _something_, although extensive ones will be ...
confusing as always.

  -ilia
[Mesa-dev] 2019 X.Org Foundation Election Candidates
To all X.Org Foundation Members:

The election for the X.Org Foundation Board of Directors will begin on
14 March 2019. We have 6 candidates who are running for 4 seats. They
are (in alphabetical order):

    Samuel Iglesias Gonsálvez
    Arkadiusz Hiler
    Manasi Navare
    Lyude Paul
    Daniel Vetter
    Trevor Woerner

Attached below are the Personal Statements each candidate submitted for
your consideration along with their Statements of Contribution that they
submitted with the membership application. Please review each of the
candidates' statements to help you decide whom to vote for during the
upcoming election.

If you have questions of the candidates, you should feel free to ask
them here on the mailing list.

The election committee will provide detailed instructions on how the
voting system will work when the voting period begins.

Harry, on behalf of the X.Org elections committee

# Nominees

## Samuel Iglesias Gonsálvez

__Current Affiliation:__ Igalia

__Personal Statement:__

I have been contributing to Mesa for 5 years, specifically to the Intel
drivers for OpenGL and Vulkan. I was one of the organizers of XDC 2018
in A Coruña, Spain.

If I am elected, I will use my experience from XDC 2018 to improve the
organization of future XDCs, help to spread X.Org technologies and help
the X.Org project as much as possible.

## Arkadiusz Hiler

__Current Affiliation:__ Intel

__Personal Statement:__

My main interest is in quality and automated testing throughout the
graphics stack, especially helping the areas that are lagging behind:
testing the kernel and KMS plumbing.

I would like to focus on building an open source testing toolbox so that
anyone can mix and match bits of it to build their testing
infrastructure.

There are a lot of exotic setups and edge cases that are hard for
developers to exercise locally due to lack of hardware. Automating those
cases, by faking them in software or by granting access to the testing
infrastructures, benefits the whole community. It is also a good way of
lowering the bar for new contributors and enables refactoring with
confidence.

## Manasi Navare

__Current Affiliation:__ Intel

__Statement of Contribution:__

I am a lead contributor to Intel's open source graphics kernel driver
i915 as well as to the Linux kernel DRM subsystem. One of my most widely
used contributions is the Display Port Compliance code in i915, DRM as
well as in Xserver and IGT to make the entire graphics stack Display
Port compliant and reward the end users with black screen free displays.
Most recently I have been involved in upstreaming the Display Stream
Compression feature across DRM and i915 to enable high resolutions like
5K@120. I also have commit rights to several upstream projects like
drm-intel, drm-misc and Intel GPU Tools.

__Personal Statement:__

I have been a Linux open source contributor for the last 4 years, since
I joined Intel's Open Source Technology Center. I have presented several
talks at Linux graphics conferences like Embedded Linux Conference, XDC
and FOSDEM on several graphics display features like Display Port
compliance and Display Stream Compression. I have already been actively
involved in IRC discussions with DRM and i915 maintainers to constantly
provide solutions to Display Port questions and work on improving the
kernel documentation and code quality.

I have proactively started attending X.Org board meetings on IRC to
better understand the working of X.Org, and at XDC 2018 I also
volunteered in the Code of Conduct committee during the conference and
was the point of contact for this. I am also currently a mentor for the
KMS project in the Outreachy winter program and committed to mentor in
the Google Summer of Code program as well.

If I get elected, I would like to contribute by helping organize the
X.Org Foundation conferences, screening the papers and providing any
help needed in terms of public relations, working with the sponsors or
code of conduct on the day of the conference to make the events a huge
success. I would also like to leverage my open source working knowledge
for any technical help required for the X.Org events.

## Lyude Paul

__Current Affiliation:__ Red Hat

__Personal Statement:__

One of the people who helped start Panfrost! Also a contributor to
nouveau, i915, amdgpu, radeon, weston, Xorg, multiple X DDXs, libinput,
the wayland protocol, various other non-graphics related bits in the
kernel, and probably more!

__Statement of Contribution:__

I originally found out about Linux through a rather unexpected place: an
Ubuntu booth at an Anime convention. I was in awe of the beauty of the
all-mighty Compiz workspace switching cube and ended up deciding to give
it a shot on my own computer. I ended up loving Linux, and quickly found
I couldn't go back to other operating systems. I also wanted to become
involved, but didn't really know how to at first.

After years of being a user throughout high school and the start of my
college career, I ended up taking on the challenge of trying
Re: [Mesa-dev] [PATCH 33/34] nv50/ir/nir: handle user clip planes for each emitted vertex
On Tue, Mar 12, 2019 at 1:09 AM Ilia Mirkin wrote:
>
> On Mon, Mar 11, 2019 at 8:05 PM Karol Herbst wrote:
> >
> > v9: convert to C++ style comments
> >
> > Signed-off-by: Karol Herbst
> > ---
> >  src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> > index 627848a457f..fdc6eaf759a 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> > @@ -1561,7 +1561,7 @@ Converter::visit(nir_function *function)
> >     bb->cfg.attach(&exit->cfg, Graph::Edge::TREE);
> >     setPosition(exit, true);
> >
> > -   if (info->io.genUserClip > 0)
> > +   if (prog->getType() == Program::TYPE_VERTEX && info->io.genUserClip > 0)
>
> What about TES? Did you mean && !TYPE_GEOMETRY perhaps?
>

yeah, that's missing. Thanks for pointing it out! Apparently we have no piglit test testing that.

> >        handleUserClipPlanes();
> >
> >     // TODO: for non main function this needs to be a OP_RETURN
> > @@ -1889,6 +1889,7 @@ Converter::visit(nir_intrinsic_instr *insn)
> >              }
> >              break;
> >           }
> > +         case Program::TYPE_GEOMETRY:
> >           case Program::TYPE_VERTEX: {
> >              if (info->io.genUserClip > 0 && idx == clipVertexOutput) {
> >                 mkMov(clipVtx[i], src);
> > @@ -2187,6 +2188,9 @@ Converter::visit(nir_intrinsic_instr *insn)
> >        break;
> >     }
> >     case nir_intrinsic_emit_vertex:
> > +      if (info->io.genUserClip > 0)
> > +         handleUserClipPlanes();
> > +      // fallthrough
> >     case nir_intrinsic_end_primitive: {
> >        uint32_t idx = nir_intrinsic_stream_id(insn);
> >        mkOp1(getOperation(op), TYPE_U32, NULL, mkImm(idx))->fixed = 1;
> > --
> > 2.20.1
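The review question in this thread is which stages should emit the user clip plane code once at shader exit and which should emit it per vertex. The following is only a sketch of the rule Ilia suggests (everything except geometry at exit, geometry at each emitted vertex). The `Stage` enum and both helper names are hypothetical stand-ins and do not exist in the nouveau code base.

```cpp
#include <cassert>

// Hypothetical stand-in for Program::Type; names are illustrative only.
enum class Stage { Vertex, TessControl, TessEval, Geometry, Fragment, Compute };

// Suggested rule from the review: emit clip plane code once at shader exit
// for every stage except geometry, and only if clip planes are requested.
constexpr bool clipPlanesAtExit(Stage stage, int genUserClip)
{
   return stage != Stage::Geometry && genUserClip > 0;
}

// Geometry shaders instead emit the clip plane code before each vertex
// they emit, because one invocation can produce many vertices.
constexpr bool clipPlanesAtEmitVertex(Stage stage, int genUserClip)
{
   return stage == Stage::Geometry && genUserClip > 0;
}

static_assert(clipPlanesAtExit(Stage::TessEval, 1), "TES handles clip planes at exit");
static_assert(!clipPlanesAtExit(Stage::Geometry, 1), "GS does not handle them at exit");
```

With the condition from the posted patch (`TYPE_VERTEX` only), the tessellation evaluation case above would have been skipped, which is the gap the review points out.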
Re: [Mesa-dev] [PATCH 33/34] nv50/ir/nir: handle user clip planes for each emitted vertex
On Mon, Mar 11, 2019 at 8:05 PM Karol Herbst wrote:
>
> v9: convert to C++ style comments
>
> Signed-off-by: Karol Herbst
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> index 627848a457f..fdc6eaf759a 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> @@ -1561,7 +1561,7 @@ Converter::visit(nir_function *function)
>     bb->cfg.attach(&exit->cfg, Graph::Edge::TREE);
>     setPosition(exit, true);
>
> -   if (info->io.genUserClip > 0)
> +   if (prog->getType() == Program::TYPE_VERTEX && info->io.genUserClip > 0)

What about TES? Did you mean && !TYPE_GEOMETRY perhaps?

>        handleUserClipPlanes();
>
>     // TODO: for non main function this needs to be a OP_RETURN
> @@ -1889,6 +1889,7 @@ Converter::visit(nir_intrinsic_instr *insn)
>              }
>              break;
>           }
> +         case Program::TYPE_GEOMETRY:
>           case Program::TYPE_VERTEX: {
>              if (info->io.genUserClip > 0 && idx == clipVertexOutput) {
>                 mkMov(clipVtx[i], src);
> @@ -2187,6 +2188,9 @@ Converter::visit(nir_intrinsic_instr *insn)
>        break;
>     }
>     case nir_intrinsic_emit_vertex:
> +      if (info->io.genUserClip > 0)
> +         handleUserClipPlanes();
> +      // fallthrough
>     case nir_intrinsic_end_primitive: {
>        uint32_t idx = nir_intrinsic_stream_id(insn);
>        mkOp1(getOperation(op), TYPE_U32, NULL, mkImm(idx))->fixed = 1;
> --
> 2.20.1
[Mesa-dev] [PATCH 17/34] nv50/ir/nir: implement nir_intrinsic_store_(per_vertex_)output
v3: add workaround for RA issues
    indirects have to be multiplied by 0x10
    fix indirect access
v4: use smarter getIndirect helper
    use storeTo helper
v5: don't use const_offset directly
v8: don't require C++11 features
v9: convert to C++ style comments
    handle clip planes correctly

Signed-off-by: Karol Herbst
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp | 57 ++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index dc8dbcfb48b..6e26e00d91f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -145,6 +145,8 @@ private:
    BasicBlock *exit;
    Value *zero;
 
+   int clipVertexOutput;
+
    union {
       struct {
          Value *position;
@@ -155,7 +157,8 @@ private:
 Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info)
    : ConverterCommon(prog, info),
      nir(nir),
-     curLoopDepth(0)
+     curLoopDepth(0),
+     clipVertexOutput(-1)
 {
    zero = mkImm((uint32_t)0);
 }
@@ -1082,9 +1085,16 @@ bool Converter::assignSlots() {
          case TGSI_SEMANTIC_CLIPDIST:
             info->io.genUserClip = -1;
             break;
+         case TGSI_SEMANTIC_CLIPVERTEX:
+            clipVertexOutput = vary;
+            break;
          case TGSI_SEMANTIC_EDGEFLAG:
             info->io.edgeFlagOut = vary;
             break;
+         case TGSI_SEMANTIC_POSITION:
+            if (clipVertexOutput < 0)
+               clipVertexOutput = vary;
+            break;
          default:
             break;
          }
@@ -1346,6 +1356,11 @@ Converter::visit(nir_function *function)
 
    setPosition(entry, true);
 
+   if (info->io.genUserClip > 0) {
+      for (int c = 0; c < 4; ++c)
+         clipVtx[c] = getScratch();
+   }
+
    switch (prog->getType()) {
    case Program::TYPE_TESSELLATION_CONTROL:
       outBase = mkOp2v(
@@ -1372,6 +1387,9 @@ Converter::visit(nir_function *function)
    bb->cfg.attach(&exit->cfg, Graph::Edge::TREE);
    setPosition(exit, true);
 
+   if (info->io.genUserClip > 0)
+      handleUserClipPlanes();
+
    // TODO: for non main function this needs to be a OP_RETURN
    mkOp(OP_EXIT, TYPE_NONE, NULL)->terminator = 1;
    return true;
@@ -1542,6 +1560,43 @@ Converter::visit(nir_intrinsic_instr *insn)
       }
       break;
    }
+   case nir_intrinsic_store_output:
+   case nir_intrinsic_store_per_vertex_output: {
+      Value *indirect;
+      DataType dType = getSType(insn->src[0], false, false);
+      uint32_t idx = getIndirect(insn, op == nir_intrinsic_store_output ? 1 : 2, 0, indirect);
+
+      for (uint8_t i = 0u; i < insn->num_components; ++i) {
+         if (!((1u << i) & nir_intrinsic_write_mask(insn)))
+            continue;
+
+         uint8_t offset = 0;
+         Value *src = getSrc(&insn->src[0], i);
+         switch (prog->getType()) {
+         case Program::TYPE_FRAGMENT: {
+            if (info->out[idx].sn == TGSI_SEMANTIC_POSITION) {
+               // TGSI uses a different interface than NIR, TGSI stores that
+               // value in the z component, NIR in X
+               offset += 2;
+               src = mkOp1v(OP_SAT, TYPE_F32, getScratch(), src);
+            }
+            break;
+         }
+         case Program::TYPE_VERTEX: {
+            if (info->io.genUserClip > 0 && idx == clipVertexOutput) {
+               mkMov(clipVtx[i], src);
+               src = clipVtx[i];
+            }
+            break;
+         }
+         default:
+            break;
+         }
+
+         storeTo(insn, FILE_SHADER_OUTPUT, OP_EXPORT, dType, src, idx, i + offset, indirect);
+      }
+      break;
+   }
    default:
       ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
       return false;
-- 
2.20.1
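The `assignSlots()` hunk in this patch picks the output that feeds the user clip planes: an explicit clip vertex always wins, and the position output is only a fallback. Below is a minimal standalone sketch of that selection rule. The `Semantic` enum and `pickClipVertexOutput` are hypothetical names standing in for the TGSI semantics and the loop over outputs; they are not part of the driver.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the TGSI_SEMANTIC_* values used in the patch.
enum class Semantic { Position, ClipVertex, ClipDist, Generic };

// Returns the index of the output used for user clip planes, or -1 if the
// shader writes neither a clip vertex nor a position.
int pickClipVertexOutput(const std::vector<Semantic> &outputs)
{
   int clipVertexOutput = -1;
   for (int vary = 0; vary < static_cast<int>(outputs.size()); ++vary) {
      switch (outputs[vary]) {
      case Semantic::ClipVertex:
         // An explicit clip vertex overrides any earlier choice.
         clipVertexOutput = vary;
         break;
      case Semantic::Position:
         // Position is used only if nothing has been chosen yet.
         if (clipVertexOutput < 0)
            clipVertexOutput = vary;
         break;
      default:
         break;
      }
   }
   return clipVertexOutput;
}
```

The result does not depend on whether the clip vertex is declared before or after the position, which is why the patch initializes `clipVertexOutput` to -1 and guards only the position case.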
[Mesa-dev] [PATCH 23/34] nv50/ir/nir: add skeleton getOperation for intrinsics
v7: don't assert in default case for getSubOp

Signed-off-by: Karol Herbst
Reviewed-by: Pierre Moreau
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp | 22 +++
 1 file changed, 22 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 2c4513aad02..ab3bf7f843a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -116,10 +116,12 @@ private:
    std::vector<DataType> getSTypes(nir_alu_instr *);
    DataType getSType(nir_src &, bool isFloat, bool isSigned);
 
+   operation getOperation(nir_intrinsic_op);
    operation getOperation(nir_op);
    operation getOperation(nir_texop);
    operation preOperationNeeded(nir_op);
 
+   int getSubOp(nir_intrinsic_op);
    int getSubOp(nir_op);
 
    CondCode getCondCode(nir_op);
@@ -457,6 +459,17 @@ Converter::getOperation(nir_texop op)
    }
 }
 
+operation
+Converter::getOperation(nir_intrinsic_op op)
+{
+   switch (op) {
+   default:
+      ERROR("couldn't get operation for nir_intrinsic_op %u\n", op);
+      assert(false);
+      return OP_NOP;
+   }
+}
+
 operation
 Converter::preOperationNeeded(nir_op op)
 {
@@ -481,6 +494,15 @@ Converter::getSubOp(nir_op op)
    }
 }
 
+int
+Converter::getSubOp(nir_intrinsic_op op)
+{
+   switch (op) {
+   default:
+      return 0;
+   }
+}
+
 CondCode
 Converter::getCondCode(nir_op op)
 {
-- 
2.20.1
[Mesa-dev] [PATCH 20/34] nv50/ir/nir: implement loading system values
v2: support more sys values fixed a bug where for multi component reads all values ended up in x v3: add load_patch_vertices_in v4: add subgroup stuff v5: add helper invocation v6: fix loading 64 bit system values v8: don't require C++11 features v9: convert to C++ style comments Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 122 ++ 1 file changed, 122 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 5c372794e02..43c9a468f5a 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -70,6 +70,7 @@ private: LValues& convert(nir_alu_dest *); BasicBlock* convert(nir_block *); LValues& convert(nir_dest *); + SVSemantic convert(nir_intrinsic_op); LValues& convert(nir_register *); LValues& convert(nir_ssa_def *); @@ -1544,6 +1545,70 @@ Converter::visit(nir_instr *insn) return true; } +SVSemantic +Converter::convert(nir_intrinsic_op intr) +{ + switch (intr) { + case nir_intrinsic_load_base_vertex: + return SV_BASEVERTEX; + case nir_intrinsic_load_base_instance: + return SV_BASEINSTANCE; + case nir_intrinsic_load_draw_id: + return SV_DRAWID; + case nir_intrinsic_load_front_face: + return SV_FACE; + case nir_intrinsic_load_helper_invocation: + return SV_THREAD_KILL; + case nir_intrinsic_load_instance_id: + return SV_INSTANCE_ID; + case nir_intrinsic_load_invocation_id: + return SV_INVOCATION_ID; + case nir_intrinsic_load_local_group_size: + return SV_NTID; + case nir_intrinsic_load_local_invocation_id: + return SV_TID; + case nir_intrinsic_load_num_work_groups: + return SV_NCTAID; + case nir_intrinsic_load_patch_vertices_in: + return SV_VERTEX_COUNT; + case nir_intrinsic_load_primitive_id: + return SV_PRIMITIVE_ID; + case nir_intrinsic_load_sample_id: + return SV_SAMPLE_INDEX; + case nir_intrinsic_load_sample_mask_in: + return SV_SAMPLE_MASK; + case 
nir_intrinsic_load_sample_pos: + return SV_SAMPLE_POS; + case nir_intrinsic_load_subgroup_eq_mask: + return SV_LANEMASK_EQ; + case nir_intrinsic_load_subgroup_ge_mask: + return SV_LANEMASK_GE; + case nir_intrinsic_load_subgroup_gt_mask: + return SV_LANEMASK_GT; + case nir_intrinsic_load_subgroup_le_mask: + return SV_LANEMASK_LE; + case nir_intrinsic_load_subgroup_lt_mask: + return SV_LANEMASK_LT; + case nir_intrinsic_load_subgroup_invocation: + return SV_LANEID; + case nir_intrinsic_load_tess_coord: + return SV_TESS_COORD; + case nir_intrinsic_load_tess_level_inner: + return SV_TESS_INNER; + case nir_intrinsic_load_tess_level_outer: + return SV_TESS_OUTER; + case nir_intrinsic_load_vertex_id: + return SV_VERTEX_ID; + case nir_intrinsic_load_work_group_id: + return SV_CTAID; + default: + ERROR("unknown SVSemantic for nir_intrinsic_op %s\n", +nir_intrinsic_infos[intr].name); + assert(false); + return SV_LAST; + } +} + bool Converter::visit(nir_intrinsic_instr *insn) { @@ -1746,6 +1811,63 @@ Converter::visit(nir_intrinsic_instr *insn) mkOp(OP_DISCARD, TYPE_NONE, NULL)->setPredicate(CC_P, pred); break; } + case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_base_instance: + case nir_intrinsic_load_draw_id: + case nir_intrinsic_load_front_face: + case nir_intrinsic_load_helper_invocation: + case nir_intrinsic_load_instance_id: + case nir_intrinsic_load_invocation_id: + case nir_intrinsic_load_local_group_size: + case nir_intrinsic_load_local_invocation_id: + case nir_intrinsic_load_num_work_groups: + case nir_intrinsic_load_patch_vertices_in: + case nir_intrinsic_load_primitive_id: + case nir_intrinsic_load_sample_id: + case nir_intrinsic_load_sample_mask_in: + case nir_intrinsic_load_sample_pos: + case nir_intrinsic_load_subgroup_eq_mask: + case nir_intrinsic_load_subgroup_ge_mask: + case nir_intrinsic_load_subgroup_gt_mask: + case nir_intrinsic_load_subgroup_le_mask: + case nir_intrinsic_load_subgroup_lt_mask: + case nir_intrinsic_load_subgroup_invocation: 
+ case nir_intrinsic_load_tess_coord: + case nir_intrinsic_load_tess_level_inner: + case nir_intrinsic_load_tess_level_outer: + case nir_intrinsic_load_vertex_id: + case nir_intrinsic_load_work_group_id: { + const DataType dType = getDType(insn); + SVSemantic sv = convert(op); + LValues = convert(>dest); + + for (uint8_t i = 0u; i < insn->num_components; ++i) { + Value *def; + if (typeSizeof(dType) == 8) +def = getSSA(); + else +def = newDefs[i]; + + if (sv == SV_TID && info->prop.cp.numThreads[i] == 1) { +loadImm(def, 0u); + } else { +Symbol *sym =
[Mesa-dev] [PATCH 29/34] nv50/ir/nir: implement images
v3: fix compiler warnings v4: use loadFrom helper v5: fix signed min/max v6: set tex mask add support for indirect image access set cache mode v7: make compatible with 884d27bcf688d36c3bbe01bceca525595add3b33 rework the whole deref thing to prepare for bindless v8: port to deref instructions don't require C++11 features v9: implement MS images rebase on master (image modifiers) fix regressions due to variable src compnents replace '(*it).' with 'it->' convert to C++ style comments Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 390 +- 1 file changed, 380 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 320f90329ef..ecdc667b25a 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -36,6 +36,7 @@ #else #include #endif +#include #include namespace { @@ -76,6 +77,8 @@ private: LValues& convert(nir_register *); LValues& convert(nir_ssa_def *); + ImgFormat convertGLImgFormat(GLuint); + Value* getSrc(nir_alu_src *, uint8_t component = 0); Value* getSrc(nir_register *, uint8_t); Value* getSrc(nir_src *, uint8_t, bool indirect = false); @@ -112,6 +115,7 @@ private: DataType getDType(nir_alu_instr *); DataType getDType(nir_intrinsic_instr *); + DataType getDType(nir_intrinsic_instr *, bool isSigned); DataType getDType(nir_op, uint8_t); std::vector getSTypes(nir_alu_instr *); @@ -133,6 +137,7 @@ private: bool visit(nir_alu_instr *); bool visit(nir_block *); bool visit(nir_cf_node *); + bool visit(nir_deref_instr *); bool visit(nir_function *); bool visit(nir_if *); bool visit(nir_instr *); @@ -145,6 +150,11 @@ private: // tex stuff Value* applyProjection(Value *src, Value *proj); + unsigned int getNIRArgCount(TexInstruction::Target&); + + // image stuff + uint16_t handleDeref(nir_deref_instr *, Value * & indirect, const nir_variable * &); + CacheMode 
getCacheModeFromVar(const nir_variable *); nir_shader *nir; @@ -240,11 +250,30 @@ Converter::getDType(nir_alu_instr *insn) DataType Converter::getDType(nir_intrinsic_instr *insn) +{ + bool isSigned; + switch (insn->intrinsic) { + case nir_intrinsic_shared_atomic_imax: + case nir_intrinsic_shared_atomic_imin: + case nir_intrinsic_ssbo_atomic_imax: + case nir_intrinsic_ssbo_atomic_imin: + isSigned = true; + break; + default: + isSigned = false; + break; + } + + return getDType(insn, isSigned); +} + +DataType +Converter::getDType(nir_intrinsic_instr *insn, bool isSigned) { if (insn->dest.is_ssa) - return typeOfSize(insn->dest.ssa.bit_size / 8, false, false); + return typeOfSize(insn->dest.ssa.bit_size / 8, false, isSigned); else - return typeOfSize(insn->dest.reg.reg->bit_size / 8, false, false); + return typeOfSize(insn->dest.reg.reg->bit_size / 8, false, isSigned); } DataType @@ -469,6 +498,22 @@ Converter::getOperation(nir_intrinsic_op op) return OP_EMIT; case nir_intrinsic_end_primitive: return OP_RESTART; + case nir_intrinsic_image_deref_atomic_add: + case nir_intrinsic_image_deref_atomic_and: + case nir_intrinsic_image_deref_atomic_comp_swap: + case nir_intrinsic_image_deref_atomic_exchange: + case nir_intrinsic_image_deref_atomic_max: + case nir_intrinsic_image_deref_atomic_min: + case nir_intrinsic_image_deref_atomic_or: + case nir_intrinsic_image_deref_atomic_xor: + return OP_SUREDP; + case nir_intrinsic_image_deref_load: + return OP_SULDP; + case nir_intrinsic_image_deref_samples: + case nir_intrinsic_image_deref_size: + return OP_SUQ; + case nir_intrinsic_image_deref_store: + return OP_SUSTP; default: ERROR("couldn't get operation for nir_intrinsic_op %u\n", op); assert(false); @@ -504,24 +549,42 @@ int Converter::getSubOp(nir_intrinsic_op op) { switch (op) { + case nir_intrinsic_image_deref_atomic_add: + case nir_intrinsic_shared_atomic_add: case nir_intrinsic_ssbo_atomic_add: - return NV50_IR_SUBOP_ATOM_ADD; + return NV50_IR_SUBOP_ATOM_ADD; + case 
nir_intrinsic_image_deref_atomic_and: + case nir_intrinsic_shared_atomic_and: case nir_intrinsic_ssbo_atomic_and: - return NV50_IR_SUBOP_ATOM_AND; + return NV50_IR_SUBOP_ATOM_AND; + case nir_intrinsic_image_deref_atomic_comp_swap: + case nir_intrinsic_shared_atomic_comp_swap: case nir_intrinsic_ssbo_atomic_comp_swap: - return NV50_IR_SUBOP_ATOM_CAS; + return NV50_IR_SUBOP_ATOM_CAS; + case nir_intrinsic_image_deref_atomic_exchange: + case nir_intrinsic_shared_atomic_exchange: case nir_intrinsic_ssbo_atomic_exchange: - return NV50_IR_SUBOP_ATOM_EXCH; + return NV50_IR_SUBOP_ATOM_EXCH; + case
[Mesa-dev] [PATCH 34/34] nv50ir/nir: move immediates before use
Signed-off-by: Karol Herbst
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp | 59 +--
 1 file changed, 41 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index fdc6eaf759a..a16c014c01c 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -66,6 +66,7 @@ public:
 private:
    typedef std::vector<LValue*> LValues;
    typedef unordered_map<unsigned, LValues> NirDefMap;
+   typedef unordered_map<unsigned, nir_load_const_instr*> ImmediateMap;
    typedef unordered_map<unsigned, uint32_t> NirArrayLMemOffsets;
    typedef unordered_map<unsigned, BasicBlock*> NirBlockMap;
 
@@ -74,6 +75,7 @@ private:
    BasicBlock* convert(nir_block *);
    LValues& convert(nir_dest *);
    SVSemantic convert(nir_intrinsic_op);
+   Value* convert(nir_load_const_instr*, uint8_t);
    LValues& convert(nir_register *);
    LValues& convert(nir_ssa_def *);
 
@@ -160,12 +162,14 @@ private:
 
    NirDefMap ssaDefs;
    NirDefMap regDefs;
+   ImmediateMap immediates;
    NirArrayLMemOffsets regToLmemOffset;
    NirBlockMap blocks;
    unsigned int curLoopDepth;
 
    BasicBlock *exit;
    Value *zero;
+   Instruction *immInsertPos;
 
    int clipVertexOutput;
 
@@ -715,6 +719,10 @@ Converter::getSrc(nir_src *src, uint8_t idx, bool indirect)
 Value*
 Converter::getSrc(nir_ssa_def *src, uint8_t idx)
 {
+   ImmediateMap::iterator iit = immediates.find(src->index);
+   if (iit != immediates.end())
+      return convert((*iit).second, idx);
+
    NirDefMap::iterator it = ssaDefs.find(src->index);
    if (it == ssaDefs.end()) {
       ERROR("SSA value %u not found\n", src->index);
@@ -1702,6 +1710,8 @@ Converter::visit(nir_loop *loop)
 bool
 Converter::visit(nir_instr *insn)
 {
+   // we need an insertion point for on the fly generated immediate loads
+   immInsertPos = bb->getExit();
    switch (insn->type) {
    case nir_instr_type_alu:
       return visit(nir_instr_as_alu(insn));
@@ -2491,28 +2501,41 @@ Converter::visit(nir_jump_instr *insn)
    return true;
 }
 
+Value*
+Converter::convert(nir_load_const_instr *insn, uint8_t idx)
+{
+   Value *val;
+
+   if (immInsertPos)
+      setPosition(immInsertPos, true);
+   else
+      setPosition(bb, false);
+
+   switch (insn->def.bit_size) {
+   case 64:
+      val = loadImm(getSSA(8), insn->value.u64[idx]);
+      break;
+   case 32:
+      val = loadImm(getSSA(4), insn->value.u32[idx]);
+      break;
+   case 16:
+      val = loadImm(getSSA(2), insn->value.u16[idx]);
+      break;
+   case 8:
+      val = loadImm(getSSA(1), insn->value.u8[idx]);
+      break;
+   default:
+      unreachable("unhandled bit size!\n");
+   }
+   setPosition(bb, true);
+   return val;
+}
+
 bool
 Converter::visit(nir_load_const_instr *insn)
 {
    assert(insn->def.bit_size <= 64);
-
-   LValues &newDefs = convert(&insn->def);
-   for (int i = 0; i < insn->def.num_components; i++) {
-      switch (insn->def.bit_size) {
-      case 64:
-         loadImm(newDefs[i], insn->value.u64[i]);
-         break;
-      case 32:
-         loadImm(newDefs[i], insn->value.u32[i]);
-         break;
-      case 16:
-         loadImm(newDefs[i], insn->value.u16[i]);
-         break;
-      case 8:
-         loadImm(newDefs[i], insn->value.u8[i]);
-         break;
-      }
-   }
+   immediates[insn->def.index] = insn;
    return true;
 }
 
-- 
2.20.1
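The idea of this patch is to stop emitting a load when a constant is defined and instead remember it, then emit a fresh immediate load right before each use. The toy model below illustrates that pattern only. `LazyImmediates`, its string "instructions", and the `emitAdd` helper are all hypothetical and have nothing to do with the real nv50 IR classes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Toy converter: constants are recorded at definition time and only turned
// into a "mov" instruction when something actually reads them.
struct LazyImmediates {
   std::unordered_map<unsigned, uint64_t> constants; // SSA index -> value
   std::vector<std::string> code;                    // emitted instructions
   unsigned nextTemp = 0;

   // Counterpart of visit(nir_load_const_instr *): remember, emit nothing.
   void visitLoadConst(unsigned ssaIndex, uint64_t value)
   {
      constants[ssaIndex] = value;
   }

   // Counterpart of getSrc(): constants are materialized just before use,
   // every other SSA value is referenced by name.
   std::string getSrc(unsigned ssaIndex)
   {
      auto it = constants.find(ssaIndex);
      if (it == constants.end())
         return "ssa" + std::to_string(ssaIndex);
      std::string tmp = "imm" + std::to_string(nextTemp++);
      code.push_back("mov " + tmp + ", " + std::to_string(it->second));
      return tmp;
   }

   void emitAdd(unsigned dst, unsigned a, unsigned b)
   {
      std::string srcA = getSrc(a);
      std::string srcB = getSrc(b);
      code.push_back("add ssa" + std::to_string(dst) + ", " + srcA + ", " + srcB);
   }
};
```

The trade-off visible even in this toy version is that a constant used twice is loaded twice. In exchange, each load sits next to its user, which keeps the live range of the temporary short.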
[Mesa-dev] [PATCH 12/34] nv50/ir/nir: parse NIR shader info
v2: parse a few more fields v3: add special handling for GL_ISOLINES v8: set info->prop.fp.readsSampleLocations don't require C++11 features v9: replace '(*it).' with 'it->' convert to C++ style comments Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 323 +- 1 file changed, 320 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index d3cba9a63c3..3c5eac17cf9 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -63,10 +63,12 @@ public: bool run(); private: - typedef std::vector LValues; + typedef std::vector LValues; typedef unordered_map NirDefMap; + typedef unordered_map NirBlockMap; LValues& convert(nir_alu_dest *); + BasicBlock* convert(nir_block *); LValues& convert(nir_dest *); LValues& convert(nir_register *); LValues& convert(nir_ssa_def *); @@ -113,16 +115,48 @@ private: DataType getSType(nir_src &, bool isFloat, bool isSigned); bool assignSlots(); + bool parseNIR(); + + bool visit(nir_block *); + bool visit(nir_cf_node *); + bool visit(nir_function *); + bool visit(nir_if *); + bool visit(nir_instr *); + bool visit(nir_jump_instr *); + bool visit(nir_loop *); nir_shader *nir; NirDefMap ssaDefs; NirDefMap regDefs; + NirBlockMap blocks; + unsigned int curLoopDepth; + + BasicBlock *exit; + + union { + struct { + Value *position; + } fp; + }; }; Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info) : ConverterCommon(prog, info), - nir(nir) {} + nir(nir), + curLoopDepth(0) {} + +BasicBlock * +Converter::convert(nir_block *block) +{ + NirBlockMap::iterator it = blocks.find(block->index); + if (it != blocks.end()) + return it->second; + + BasicBlock *bb = new BasicBlock(func); + blocks[block->index] = bb; + return bb; +} bool Converter::isFloatType(nir_alu_type type) @@ -1041,6 +1075,279 @@ 
Converter::storeTo(nir_intrinsic_instr *insn, DataFile file, operation op, } } +bool +Converter::parseNIR() +{ + info->io.clipDistances = nir->info.clip_distance_array_size; + info->io.cullDistances = nir->info.cull_distance_array_size; + + switch(prog->getType()) { + case Program::TYPE_COMPUTE: + info->prop.cp.numThreads[0] = nir->info.cs.local_size[0]; + info->prop.cp.numThreads[1] = nir->info.cs.local_size[1]; + info->prop.cp.numThreads[2] = nir->info.cs.local_size[2]; + info->bin.smemSize = nir->info.cs.shared_size; + break; + case Program::TYPE_FRAGMENT: + info->prop.fp.earlyFragTests = nir->info.fs.early_fragment_tests; + info->prop.fp.persampleInvocation = + (nir->info.system_values_read & SYSTEM_BIT_SAMPLE_ID) || + (nir->info.system_values_read & SYSTEM_BIT_SAMPLE_POS); + info->prop.fp.postDepthCoverage = nir->info.fs.post_depth_coverage; + info->prop.fp.readsSampleLocations = + (nir->info.system_values_read & SYSTEM_BIT_SAMPLE_POS); + info->prop.fp.usesDiscard = nir->info.fs.uses_discard; + info->prop.fp.usesSampleMaskIn = + !!(nir->info.system_values_read & SYSTEM_BIT_SAMPLE_MASK_IN); + break; + case Program::TYPE_GEOMETRY: + info->prop.gp.inputPrim = nir->info.gs.input_primitive; + info->prop.gp.instanceCount = nir->info.gs.invocations; + info->prop.gp.maxVertices = nir->info.gs.vertices_out; + info->prop.gp.outputPrim = nir->info.gs.output_primitive; + break; + case Program::TYPE_TESSELLATION_CONTROL: + case Program::TYPE_TESSELLATION_EVAL: + if (nir->info.tess.primitive_mode == GL_ISOLINES) + info->prop.tp.domain = GL_LINES; + else + info->prop.tp.domain = nir->info.tess.primitive_mode; + info->prop.tp.outputPatchSize = nir->info.tess.tcs_vertices_out; + info->prop.tp.outputPrim = + nir->info.tess.point_mode ? 
PIPE_PRIM_POINTS : PIPE_PRIM_TRIANGLES; + info->prop.tp.partitioning = (nir->info.tess.spacing + 1) % 3; + info->prop.tp.winding = !nir->info.tess.ccw; + break; + case Program::TYPE_VERTEX: + info->prop.vp.usesDrawParameters = + (nir->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX)) || + (nir->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE)) || + (nir->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)); + break; + default: + break; + } + + return true; +} + +bool +Converter::visit(nir_function *function) +{ + // we only support emiting the main function for now + assert(!strcmp(function->name, "main")); + assert(function->impl); + + // usually the blocks will set everything up, but main is special + BasicBlock *entry = new BasicBlock(prog->main); + exit = new
[Mesa-dev] [PATCH 32/34] nv50/ir/nir: implement intrinsic shader_clock
v9: mark as fixed

Signed-off-by: Karol Herbst
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 8
 1 file changed, 8 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index c379eb72c1e..627848a457f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -2444,6 +2444,14 @@ Converter::visit(nir_intrinsic_instr *insn)
       bar->subOp = getSubOp(op);
       break;
    }
+   case nir_intrinsic_shader_clock: {
+      const DataType dType = getDType(insn);
+      LValues &newDefs = convert(&insn->dest);
+
+      loadImm(newDefs[0], 0u);
+      mkOp1(OP_RDSV, dType, newDefs[1], mkSysVal(SV_CLOCK, 0))->fixed = 1;
+      break;
+   }
    default:
       ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
       return false;
-- 
2.20.1
[Mesa-dev] [PATCH 25/34] nv50/ir/nir: implement variable indexing
We store those arrays in local memory and reserve some space for each of the arrays. With NIR we could store those arrays packed, but we don't do that yet as it causes MemoryOpt to generate unaligned memory accesses. v3: use fixed size vec4 arrays until we fix MemoryOpt v4: fix for 64 bit types v5: use loadFrom helper v8: don't require C++11 features v9: convert to C++ style comments Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 58 +++ 1 file changed, 58 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 7a10a408b70..5b7a3303e78 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -65,6 +65,7 @@ public: private: typedef std::vector LValues; typedef unordered_map NirDefMap; + typedef unordered_map NirArrayLMemOffsets; typedef unordered_map NirBlockMap; TexTarget convert(glsl_sampler_dim, bool isArray, bool isShadow); @@ -149,6 +150,7 @@ private: NirDefMap ssaDefs; NirDefMap regDefs; + NirArrayLMemOffsets regToLmemOffset; NirBlockMap blocks; unsigned int curLoopDepth; @@ -1353,6 +1355,7 @@ Converter::storeTo(nir_intrinsic_instr *insn, DataFile file, operation op, bool Converter::parseNIR() { + info->bin.tlsSpace = 0; info->io.clipDistances = nir->info.clip_distance_array_size; info->io.cullDistances = nir->info.cull_distance_array_size; @@ -1444,6 +1447,16 @@ Converter::visit(nir_function *function) break; } + nir_foreach_register(reg, >impl->registers) { + if (reg->num_array_elems) { + // TODO: packed variables would be nice, but MemoryOpt fails + // replace 4 with reg->num_components + uint32_t size = 4 * reg->num_array_elems * (reg->bit_size / 8); + regToLmemOffset[reg->index] = info->bin.tlsSpace; + info->bin.tlsSpace += size; + } + } + nir_index_ssa_defs(function->impl); foreach_list_typed(nir_cf_node, node, node, >impl->body) { if (!visit(node)) @@ 
-2199,6 +2212,51 @@ Converter::visit(nir_alu_instr *insn) // 2. they basically just merge multiple values into one data type case nir_op_imov: case nir_op_fmov: + if (!insn->dest.dest.is_ssa && insn->dest.dest.reg.reg->num_array_elems) { + nir_reg_dest& reg = insn->dest.dest.reg; + uint32_t goffset = regToLmemOffset[reg.reg->index]; + uint8_t comps = reg.reg->num_components; + uint8_t size = reg.reg->bit_size / 8; + uint8_t csize = 4 * size; // TODO after fixing MemoryOpts: comps * size; + uint32_t aoffset = csize * reg.base_offset; + Value *indirect = NULL; + + if (reg.indirect) +indirect = mkOp2v(OP_MUL, TYPE_U32, getSSA(4, FILE_ADDRESS), + getSrc(reg.indirect, 0), mkImm(csize)); + + for (uint8_t i = 0u; i < comps; ++i) { +if (!((1u << i) & insn->dest.write_mask)) + continue; + +Symbol *sym = mkSymbol(FILE_MEMORY_LOCAL, 0, dType, goffset + aoffset + i * size); +mkStore(OP_STORE, dType, sym, indirect, getSrc(>src[0], i)); + } + break; + } else if (!insn->src[0].src.is_ssa && insn->src[0].src.reg.reg->num_array_elems) { + LValues = convert(>dest); + nir_reg_src& reg = insn->src[0].src.reg; + uint32_t goffset = regToLmemOffset[reg.reg->index]; + // uint8_t comps = reg.reg->num_components; + uint8_t size = reg.reg->bit_size / 8; + uint8_t csize = 4 * size; // TODO after fixing MemoryOpts: comps * size; + uint32_t aoffset = csize * reg.base_offset; + Value *indirect = NULL; + + if (reg.indirect) +indirect = mkOp2v(OP_MUL, TYPE_U32, getSSA(4, FILE_ADDRESS), getSrc(reg.indirect, 0), mkImm(csize)); + + for (uint8_t i = 0u; i < newDefs.size(); ++i) +loadFrom(FILE_MEMORY_LOCAL, 0, dType, newDefs[i], goffset + aoffset, i, indirect); + + break; + } else { + LValues = convert(>dest); + for (LValues::size_type c = 0u; c < newDefs.size(); ++c) { +mkMov(newDefs[c], getSrc(>src[0], c), dType); + } + } + break; case nir_op_vec2: case nir_op_vec3: case nir_op_vec4: { -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org 
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
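The local-memory layout described in the commit message can be sketched on its own: each array register reserves `4 * num_array_elems * (bit_size / 8)` bytes (a fixed vec4 per element), and a component is addressed as the register's base offset plus the element offset plus the component offset. The struct and function names below are hypothetical; only the arithmetic is taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, reduced view of a NIR register.
struct ArrayReg {
   unsigned index;
   uint32_t numArrayElems; // 0 means "not an array", no local memory needed
   uint32_t bitSize;       // 8, 16, 32 or 64
};

struct LmemLayout {
   std::unordered_map<unsigned, uint32_t> regToLmemOffset;
   uint32_t tlsSpace = 0; // total bytes of local memory reserved
};

// Mirrors the loop added to visit(nir_function *): every array register
// gets a slice of local memory sized for vec4 elements.
LmemLayout assignLmemOffsets(const std::vector<ArrayReg> &regs)
{
   LmemLayout layout;
   for (const ArrayReg &reg : regs) {
      if (!reg.numArrayElems)
         continue;
      uint32_t size = 4 * reg.numArrayElems * (reg.bitSize / 8);
      layout.regToLmemOffset[reg.index] = layout.tlsSpace;
      layout.tlsSpace += size;
   }
   return layout;
}

// Byte address of component `comp` of array element `baseOffset`, matching
// goffset + aoffset + i * size in the patch.
uint32_t componentAddress(uint32_t goffset, uint32_t baseOffset,
                          uint32_t bitSize, uint32_t comp)
{
   uint32_t size = bitSize / 8; // bytes per component
   uint32_t csize = 4 * size;   // bytes per (vec4) array element
   return goffset + csize * baseOffset + comp * size;
}
```

For example, a 4-element 32-bit array followed by a 2-element 64-bit array occupies 64 bytes each, so the second array starts at offset 64 and the total reservation is 128 bytes.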
[Mesa-dev] [PATCH 24/34] nv50/ir/nir: implement vote and ballot
v2: add vote_eq support use the new subop intrinsic helper add ballot v3: add read_(first_)invocation v8: handle vectorized intrinsics don't require C++11 features v9: lower_subgroups to 32 bit (produces less instructions) use getSSA and getScratch instead of new_LValue Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 48 +++ 1 file changed, 48 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index ab3bf7f843a..7a10a408b70 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -498,6 +498,12 @@ int Converter::getSubOp(nir_intrinsic_op op) { switch (op) { + case nir_intrinsic_vote_all: + return NV50_IR_SUBOP_VOTE_ALL; + case nir_intrinsic_vote_any: + return NV50_IR_SUBOP_VOTE_ANY; + case nir_intrinsic_vote_ieq: + return NV50_IR_SUBOP_VOTE_UNI; default: return 0; } @@ -1931,6 +1937,42 @@ Converter::visit(nir_intrinsic_instr *insn) loadImm(newDefs[0], 32u); break; } + case nir_intrinsic_vote_all: + case nir_intrinsic_vote_any: + case nir_intrinsic_vote_ieq: { + LValues = convert(>dest); + Value *pred = getScratch(1, FILE_PREDICATE); + mkCmp(OP_SET, CC_NE, TYPE_U32, pred, TYPE_U32, getSrc(>src[0], 0), zero); + mkOp1(OP_VOTE, TYPE_U32, pred, pred)->subOp = getSubOp(op); + mkCvt(OP_CVT, TYPE_U32, newDefs[0], TYPE_U8, pred); + break; + } + case nir_intrinsic_ballot: { + LValues = convert(>dest); + Value *pred = getSSA(1, FILE_PREDICATE); + mkCmp(OP_SET, CC_NE, TYPE_U32, pred, TYPE_U32, getSrc(>src[0], 0), zero); + mkOp1(OP_VOTE, TYPE_U32, newDefs[0], pred)->subOp = NV50_IR_SUBOP_VOTE_ANY; + break; + } + case nir_intrinsic_read_first_invocation: + case nir_intrinsic_read_invocation: { + LValues = convert(>dest); + const DataType dType = getDType(insn); + Value *tmp = getScratch(); + + if (op == nir_intrinsic_read_first_invocation) { + mkOp1(OP_VOTE, TYPE_U32, tmp, 
mkImm(1))->subOp = NV50_IR_SUBOP_VOTE_ANY; + mkOp2(OP_EXTBF, TYPE_U32, tmp, tmp, mkImm(0x2000))->subOp = NV50_IR_SUBOP_EXTBF_REV; + mkOp1(OP_BFIND, TYPE_U32, tmp, tmp)->subOp = NV50_IR_SUBOP_BFIND_SAMT; + } else + tmp = getSrc(>src[1], 0); + + for (uint8_t i = 0; i < insn->num_components; ++i) { + mkOp3(OP_SHFL, dType, newDefs[i], getSrc(>src[0], i), tmp, mkImm(0x1f)) +->subOp = NV50_IR_SUBOP_SHFL_IDX; + } + break; + } default: ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name); return false; @@ -2566,7 +2608,13 @@ Converter::run() if (prog->dbgFlags & NV50_IR_DEBUG_VERBOSE) nir_print_shader(nir, stderr); + struct nir_lower_subgroups_options subgroup_options = { + .subgroup_size = 32, + .ballot_bit_size = 32, + }; + NIR_PASS_V(nir, nir_lower_io, nir_var_all, type_size, (nir_lower_io_options)0); + NIR_PASS_V(nir, nir_lower_subgroups, _options); NIR_PASS_V(nir, nir_lower_regs_to_ssa); NIR_PASS_V(nir, nir_lower_load_const_to_scalar); NIR_PASS_V(nir, nir_lower_vars_to_ssa); -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
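For readers following the read_first_invocation path in the vote/ballot patch: the VOTE_ANY / EXTBF_REV / BFIND_SAMT sequence appears to select the lowest active lane. Below is a minimal host-side sketch of that idea, assuming EXTBF_REV with 0x2000 reverses a full 32-bit field and BFIND with SAMT yields the leading-zero count. The helper names (reverse_bits32, leading_zeros32, first_active_lane) are hypothetical and are not part of nv50_ir.

```cpp
#include <cassert>
#include <cstdint>

// Host-side model only. These helpers are hypothetical stand-ins for the
// hardware ops named in the patch; they are not nv50_ir API.

// Assumed model of EXTBF with the REV subop over all 32 bits: reverse bit order.
static uint32_t reverse_bits32(uint32_t v)
{
   uint32_t r = 0;
   for (int i = 0; i < 32; ++i) {
      r = (r << 1) | (v & 1u);
      v >>= 1;
   }
   return r;
}

// Assumed model of BFIND with the SAMT subop: count of leading zero bits.
static uint32_t leading_zeros32(uint32_t v)
{
   uint32_t n = 0;
   for (uint32_t bit = 0x80000000u; bit != 0 && !(v & bit); bit >>= 1)
      ++n;
   return n;
}

// Leading zeros of the reversed mask equal trailing zeros of the original
// mask, which is the index of the lowest set bit (the first active lane).
static uint32_t first_active_lane(uint32_t active_mask)
{
   return leading_zeros32(reverse_bits32(active_mask));
}
```

With a mask where only lane 3 is active, first_active_lane(0x8) evaluates to 3; the result is then what the patch feeds to SHFL as the source lane.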
[Mesa-dev] [PATCH 26/34] nv50/ir/nir: implement geometry shader nir_intrinsics
v4: use smarter getIndirect helper use new getSlotAddress helper use loadFrom helper v8: don't require C++11 features Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 27 +++ 1 file changed, 27 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 5b7a3303e78..991c1283a0f 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -465,6 +465,10 @@ operation Converter::getOperation(nir_intrinsic_op op) { switch (op) { + case nir_intrinsic_emit_vertex: + return OP_EMIT; + case nir_intrinsic_end_primitive: + return OP_RESTART; default: ERROR("couldn't get operation for nir_intrinsic_op %u\n", op); assert(false); @@ -1986,6 +1990,29 @@ Converter::visit(nir_intrinsic_instr *insn) } break; } + case nir_intrinsic_load_per_vertex_input: { + const DataType dType = getDType(insn); + LValues = convert(>dest); + Value *indirectVertex; + Value *indirectOffset; + uint32_t baseVertex = getIndirect(>src[0], 0, indirectVertex); + uint32_t idx = getIndirect(insn, 1, 0, indirectOffset); + + Value *vtxBase = mkOp2v(OP_PFETCH, TYPE_U32, getSSA(4, FILE_ADDRESS), + mkImm(baseVertex), indirectVertex); + for (uint8_t i = 0u; i < insn->num_components; ++i) { + uint32_t address = getSlotAddress(insn, idx, i); + loadFrom(FILE_SHADER_INPUT, 0, dType, newDefs[i], address, 0, + indirectOffset, vtxBase, info->in[idx].patch); + } + break; + } + case nir_intrinsic_emit_vertex: + case nir_intrinsic_end_primitive: { + uint32_t idx = nir_intrinsic_stream_id(insn); + mkOp1(getOperation(op), TYPE_U32, NULL, mkImm(idx))->fixed = 1; + break; + } default: ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name); return false; -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 21/34] nv50/ir/nir: implement nir_ssa_undef_instr
v2: use mkOp
v8: don't require C++11 features

Signed-off-by: Karol Herbst
Reviewed-by: Pierre Moreau
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 43c9a468f5a..2ed508bbc2d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -135,6 +135,7 @@ private:
    bool visit(nir_jump_instr *);
    bool visit(nir_load_const_instr*);
    bool visit(nir_loop *);
+   bool visit(nir_ssa_undef_instr *);
 
    nir_shader *nir;
 
@@ -1538,6 +1539,8 @@ Converter::visit(nir_instr *insn)
       return visit(nir_instr_as_jump(insn));
    case nir_instr_type_load_const:
       return visit(nir_instr_as_load_const(insn));
+   case nir_instr_type_ssa_undef:
+      return visit(nir_instr_as_ssa_undef(insn));
    default:
       ERROR("unknown nir_instr type %u\n", insn->type);
       return false;
@@ -2289,6 +2292,16 @@ Converter::visit(nir_alu_instr *insn)
 }
 #undef DEFAULT_CHECKS
 
+bool
+Converter::visit(nir_ssa_undef_instr *insn)
+{
+   LValues &newDefs = convert(&insn->def);
+   for (uint8_t i = 0u; i < insn->def.num_components; ++i) {
+      mkOp(OP_NOP, TYPE_NONE, newDefs[i]);
+   }
+   return true;
+}
+
 bool
 Converter::run()
 {
-- 
2.20.1
[Mesa-dev] [PATCH 13/34] nv50/ir/nir: implement nir_load_const_instr
v8: fix loading 8/16 bit constants

Signed-off-by: Karol Herbst
Reviewed-by: Pierre Moreau
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp | 28 +++
 1 file changed, 28 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 3c5eac17cf9..3fa590a4655 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -123,6 +123,7 @@ private:
    bool visit(nir_if *);
    bool visit(nir_instr *);
    bool visit(nir_jump_instr *);
+   bool visit(nir_load_const_instr*);
    bool visit(nir_loop *);
 
    nir_shader *nir;
@@ -1314,6 +1315,8 @@ Converter::visit(nir_instr *insn)
    switch (insn->type) {
    case nir_instr_type_jump:
       return visit(nir_instr_as_jump(insn));
+   case nir_instr_type_load_const:
+      return visit(nir_instr_as_load_const(insn));
    default:
       ERROR("unknown nir_instr type %u\n", insn->type);
       return false;
@@ -1348,6 +1351,31 @@ Converter::visit(nir_jump_instr *insn)
    return true;
 }
 
+bool
+Converter::visit(nir_load_const_instr *insn)
+{
+   assert(insn->def.bit_size <= 64);
+
+   LValues &newDefs = convert(&insn->def);
+   for (int i = 0; i < insn->def.num_components; i++) {
+      switch (insn->def.bit_size) {
+      case 64:
+         loadImm(newDefs[i], insn->value.u64[i]);
+         break;
+      case 32:
+         loadImm(newDefs[i], insn->value.u32[i]);
+         break;
+      case 16:
+         loadImm(newDefs[i], insn->value.u16[i]);
+         break;
+      case 8:
+         loadImm(newDefs[i], insn->value.u8[i]);
+         break;
+      }
+   }
+   return true;
+}
+
 bool
 Converter::run()
 {
-- 
2.20.1
[Mesa-dev] [PATCH 22/34] nv50/ir/nir: implement nir_instr_type_tex
a lot of those fields are not valid for a lot of tex ops. Not quite sure if it's worth the effort to check for those or just keep it like that. It seems to kind of work. v2: reworked offset handling add tex support with indirect R/S arguments handle GLSL_SAMPLER_DIM_EXTERNAL drop reference in convert(glsl_sampler_dim&, bool, bool) fix tg4 component selection v5: fill up coords args with scratch values if coords provided is less than TexTarget.getArgCount() v7: prepare for bindless_texture support v8: don't require C++11 features v9: convert to C++ style comments fix txf with a uniform constant 0 lod Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 234 ++ 1 file changed, 234 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 2ed508bbc2d..2c4513aad02 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -67,6 +67,7 @@ private: typedef unordered_map NirDefMap; typedef unordered_map NirBlockMap; + TexTarget convert(glsl_sampler_dim, bool isArray, bool isShadow); LValues& convert(nir_alu_dest *); BasicBlock* convert(nir_block *); LValues& convert(nir_dest *); @@ -116,6 +117,7 @@ private: DataType getSType(nir_src &, bool isFloat, bool isSigned); operation getOperation(nir_op); + operation getOperation(nir_texop); operation preOperationNeeded(nir_op); int getSubOp(nir_op); @@ -136,6 +138,10 @@ private: bool visit(nir_load_const_instr*); bool visit(nir_loop *); bool visit(nir_ssa_undef_instr *); + bool visit(nir_tex_instr *); + + // tex stuff + Value* applyProjection(Value *src, Value *proj); nir_shader *nir; @@ -421,6 +427,36 @@ Converter::getOperation(nir_op op) } } +operation +Converter::getOperation(nir_texop op) +{ + switch (op) { + case nir_texop_tex: + return OP_TEX; + case nir_texop_lod: + return OP_TXLQ; + case nir_texop_txb: + return OP_TXB; + case 
nir_texop_txd: + return OP_TXD; + case nir_texop_txf: + case nir_texop_txf_ms: + return OP_TXF; + case nir_texop_tg4: + return OP_TXG; + case nir_texop_txl: + return OP_TXL; + case nir_texop_query_levels: + case nir_texop_texture_samples: + case nir_texop_txs: + return OP_TXQ; + default: + ERROR("couldn't get operation for nir_texop %u\n", op); + assert(false); + return OP_NOP; + } +} + operation Converter::preOperationNeeded(nir_op op) { @@ -1541,6 +1577,8 @@ Converter::visit(nir_instr *insn) return visit(nir_instr_as_load_const(insn)); case nir_instr_type_ssa_undef: return visit(nir_instr_as_ssa_undef(insn)); + case nir_instr_type_tex: + return visit(nir_instr_as_tex(insn)); default: ERROR("unknown nir_instr type %u\n", insn->type); return false; @@ -2302,6 +2340,202 @@ Converter::visit(nir_ssa_undef_instr *insn) return true; } +#define CASE_SAMPLER(ty) \ + case GLSL_SAMPLER_DIM_ ## ty : \ + if (isArray && !isShadow) \ + return TEX_TARGET_ ## ty ## _ARRAY; \ + else if (!isArray && isShadow) \ + return TEX_TARGET_## ty ## _SHADOW; \ + else if (isArray && isShadow) \ + return TEX_TARGET_## ty ## _ARRAY_SHADOW; \ + else \ + return TEX_TARGET_ ## ty + +TexTarget +Converter::convert(glsl_sampler_dim dim, bool isArray, bool isShadow) +{ + switch (dim) { + CASE_SAMPLER(1D); + CASE_SAMPLER(2D); + CASE_SAMPLER(CUBE); + case GLSL_SAMPLER_DIM_3D: + return TEX_TARGET_3D; + case GLSL_SAMPLER_DIM_MS: + if (isArray) + return TEX_TARGET_2D_MS_ARRAY; + return TEX_TARGET_2D_MS; + case GLSL_SAMPLER_DIM_RECT: + if (isShadow) + return TEX_TARGET_RECT_SHADOW; + return TEX_TARGET_RECT; + case GLSL_SAMPLER_DIM_BUF: + return TEX_TARGET_BUFFER; + case GLSL_SAMPLER_DIM_EXTERNAL: + return TEX_TARGET_2D; + default: + ERROR("unknown glsl_sampler_dim %u\n", dim); + assert(false); + return TEX_TARGET_COUNT; + } +} +#undef CASE_SAMPLER + +Value* +Converter::applyProjection(Value *src, Value *proj) +{ + if (!proj) + return src; + return mkOp2v(OP_MUL, TYPE_F32, getScratch(), src, proj); +} + 
+bool +Converter::visit(nir_tex_instr *insn) +{ + switch (insn->op) { + case nir_texop_lod: + case nir_texop_query_levels: + case nir_texop_tex: + case nir_texop_texture_samples: + case nir_texop_tg4: + case nir_texop_txb: + case nir_texop_txd: + case nir_texop_txf: + case nir_texop_txf_ms: + case nir_texop_txl: + case nir_texop_txs: { + LValues = convert(>dest); + std::vector srcs; + std::vector defs; + std::vector offsets; + uint8_t mask = 0; + bool lz = false; + Value *proj = NULL; + TexInstruction::Target target =
[Mesa-dev] [PATCH 09/34] nv50/ir/nir: add nir type helper functions
v4: treat imul as unsigned v5: remove pointless !! v7: inot is unsigned as well v8: don't require C++11 features v9: convert to C++ style comments improve formatting print error in all cases where codegen doesn't support a given type Signed-off-by: Karol Herbst Acked-by: Pierre Moreau --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 127 ++ 1 file changed, 127 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index f7908876e96..2ac6d8c1d07 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -86,6 +86,18 @@ private: uint32_t getIndirect(nir_src *, uint8_t, Value *&); uint32_t getIndirect(nir_intrinsic_instr *, uint8_t s, uint8_t c, Value *&); + bool isFloatType(nir_alu_type); + bool isSignedType(nir_alu_type); + bool isResultFloat(nir_op); + bool isResultSigned(nir_op); + + DataType getDType(nir_alu_instr *); + DataType getDType(nir_intrinsic_instr *); + DataType getDType(nir_op, uint8_t); + + std::vector getSTypes(nir_alu_instr *); + DataType getSType(nir_src &, bool isFloat, bool isSigned); + nir_shader *nir; NirDefMap ssaDefs; @@ -96,6 +108,121 @@ Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info) : ConverterCommon(prog, info), nir(nir) {} +bool +Converter::isFloatType(nir_alu_type type) +{ + return nir_alu_type_get_base_type(type) == nir_type_float; +} + +bool +Converter::isSignedType(nir_alu_type type) +{ + return nir_alu_type_get_base_type(type) == nir_type_int; +} + +bool +Converter::isResultFloat(nir_op op) +{ + const nir_op_info = nir_op_infos[op]; + if (info.output_type != nir_type_invalid) + return isFloatType(info.output_type); + + ERROR("isResultFloat not implemented for %s\n", nir_op_infos[op].name); + assert(false); + return true; +} + +bool +Converter::isResultSigned(nir_op op) +{ + switch (op) { + // there is no umul and we get wrong results if we 
treat all muls as signed + case nir_op_imul: + case nir_op_inot: + return false; + default: + const nir_op_info = nir_op_infos[op]; + if (info.output_type != nir_type_invalid) + return isSignedType(info.output_type); + ERROR("isResultSigned not implemented for %s\n", nir_op_infos[op].name); + assert(false); + return true; + } +} + +DataType +Converter::getDType(nir_alu_instr *insn) +{ + if (insn->dest.dest.is_ssa) + return getDType(insn->op, insn->dest.dest.ssa.bit_size); + else + return getDType(insn->op, insn->dest.dest.reg.reg->bit_size); +} + +DataType +Converter::getDType(nir_intrinsic_instr *insn) +{ + if (insn->dest.is_ssa) + return typeOfSize(insn->dest.ssa.bit_size / 8, false, false); + else + return typeOfSize(insn->dest.reg.reg->bit_size / 8, false, false); +} + +DataType +Converter::getDType(nir_op op, uint8_t bitSize) +{ + DataType ty = typeOfSize(bitSize / 8, isResultFloat(op), isResultSigned(op)); + if (ty == TYPE_NONE) { + ERROR("couldn't get Type for op %s with bitSize %u\n", nir_op_infos[op].name, bitSize); + assert(false); + } + return ty; +} + +std::vector +Converter::getSTypes(nir_alu_instr *insn) +{ + const nir_op_info = nir_op_infos[insn->op]; + std::vector res(info.num_inputs); + + for (uint8_t i = 0; i < info.num_inputs; ++i) { + if (info.input_types[i] != nir_type_invalid) { + res[i] = getSType(insn->src[i].src, isFloatType(info.input_types[i]), isSignedType(info.input_types[i])); + } else { + ERROR("getSType not implemented for %s idx %u\n", info.name, i); + assert(false); + res[i] = TYPE_NONE; + break; + } + } + + return res; +} + +DataType +Converter::getSType(nir_src , bool isFloat, bool isSigned) +{ + uint8_t bitSize; + if (src.is_ssa) + bitSize = src.ssa->bit_size; + else + bitSize = src.reg.reg->bit_size; + + DataType ty = typeOfSize(bitSize / 8, isFloat, isSigned); + if (ty == TYPE_NONE) { + const char *str; + if (isFloat) + str = "float"; + else if (isSigned) + str = "int"; + else + str = "uint"; + ERROR("couldn't get Type for %s 
with bitSize %u\n", str, bitSize); + assert(false); + } + return ty; +} + Converter::LValues& Converter::convert(nir_dest *dest) { -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
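As a reading aid for the type helpers in this patch: getDType() and getSType() both reduce to a lookup on (size in bytes, isFloat, isSigned), with TYPE_NONE as the failure value that triggers the ERROR path. The sketch below models that lookup on the host. ModelType, model_type_of_size and model_dtype are hypothetical names, and the exact table of the real typeOfSize helper is an assumption here; it may differ in detail.

```cpp
#include <cassert>

// Hypothetical stand-in for the codegen DataType enum; names are illustrative.
enum ModelType {
   M_NONE,
   M_U8, M_S8,
   M_U16, M_S16, M_F16,
   M_U32, M_S32, M_F32,
   M_U64, M_S64, M_F64
};

// Assumed behaviour of a typeOfSize-like helper: pick a type from the size in
// bytes plus float/signed flags, and return M_NONE for sizes it does not know.
static ModelType model_type_of_size(unsigned bytes, bool isFloat, bool isSigned)
{
   switch (bytes) {
   case 1: return isSigned ? M_S8 : M_U8;
   case 2: return isFloat ? M_F16 : (isSigned ? M_S16 : M_U16);
   case 4: return isFloat ? M_F32 : (isSigned ? M_S32 : M_U32);
   case 8: return isFloat ? M_F64 : (isSigned ? M_S64 : M_U64);
   default: return M_NONE;
   }
}

// Mirrors the shape of getDType(nir_op, bitSize): the bit size is converted
// to bytes before the lookup.
static ModelType model_dtype(unsigned bitSize, bool isFloat, bool isSigned)
{
   return model_type_of_size(bitSize / 8, isFloat, isSigned);
}
```

Under this model a 32 bit imul, which the patch deliberately reports as unsigned, maps to the unsigned 32 bit type, and an unsupported width such as 24 bits falls through to the "none" value.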
[Mesa-dev] [PATCH 19/34] nv50/ir/nir: implement intrinsic_discard(_if)
v9: use getSSA instead of new_LValue

Signed-off-by: Karol Herbst
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 70c4aecd699..5c372794e02 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1732,6 +1732,20 @@ Converter::visit(nir_intrinsic_instr *insn)
       loadImm(newDefs[1], mode);
       break;
    }
+   case nir_intrinsic_discard:
+      mkOp(OP_DISCARD, TYPE_NONE, NULL);
+      break;
+   case nir_intrinsic_discard_if: {
+      Value *pred = getSSA(1, FILE_PREDICATE);
+      if (insn->num_components > 1) {
+         ERROR("nir_intrinsic_discard_if only with 1 component supported!\n");
+         assert(false);
+         return false;
+      }
+      mkCmp(OP_SET, CC_NE, TYPE_U8, pred, TYPE_U32, getSrc(&insn->src[0], 0), zero);
+      mkOp(OP_DISCARD, TYPE_NONE, NULL)->setPredicate(CC_P, pred);
+      break;
+   }
    default:
       ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
       return false;
-- 
2.20.1
[Mesa-dev] [PATCH 18/34] nv50/ir/nir: implement load_(interpolated_)input/output
v3: and load_output v4: use smarter getIndirect helper use new getSlotAddress helper v5: don't use const_offset directly fix for indirects v6: add support for interpolateAt v7: fix compiler warnings add load_barycentric_sample handle load_output for fragment shaders v8: set info->prop.fp.readsSampleLocations for at_sample interpolation don't require C++11 features v9: convert to C++ style comments Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 135 ++ 1 file changed, 135 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 6e26e00d91f..70c4aecd699 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -1597,6 +1597,141 @@ Converter::visit(nir_intrinsic_instr *insn) } break; } + case nir_intrinsic_load_input: + case nir_intrinsic_load_interpolated_input: + case nir_intrinsic_load_output: { + LValues = convert(>dest); + + // FBFetch + if (prog->getType() == Program::TYPE_FRAGMENT && + op == nir_intrinsic_load_output) { + std::vector defs, srcs; + uint8_t mask = 0; + + srcs.push_back(getSSA()); + srcs.push_back(getSSA()); + Value *x = mkOp1v(OP_RDSV, TYPE_F32, getSSA(), mkSysVal(SV_POSITION, 0)); + Value *y = mkOp1v(OP_RDSV, TYPE_F32, getSSA(), mkSysVal(SV_POSITION, 1)); + mkCvt(OP_CVT, TYPE_U32, srcs[0], TYPE_F32, x)->rnd = ROUND_Z; + mkCvt(OP_CVT, TYPE_U32, srcs[1], TYPE_F32, y)->rnd = ROUND_Z; + + srcs.push_back(mkOp1v(OP_RDSV, TYPE_U32, getSSA(), mkSysVal(SV_LAYER, 0))); + srcs.push_back(mkOp1v(OP_RDSV, TYPE_U32, getSSA(), mkSysVal(SV_SAMPLE_INDEX, 0))); + + for (uint8_t i = 0u; i < insn->num_components; ++i) { +defs.push_back(newDefs[i]); +mask |= 1 << i; + } + + TexInstruction *texi = mkTex(OP_TXF, TEX_TARGET_2D_MS_ARRAY, 0, 0, defs, srcs); + texi->tex.levelZero = 1; + texi->tex.mask = mask; + texi->tex.useOffsets = 0; + texi->tex.r = 0x; + texi->tex.s = 0x; + + 
info->prop.fp.readsFramebuffer = true; + break; + } + + const DataType dType = getDType(insn); + Value *indirect; + bool input = op != nir_intrinsic_load_output; + operation nvirOp; + uint32_t mode = 0; + + uint32_t idx = getIndirect(insn, op == nir_intrinsic_load_interpolated_input ? 1 : 0, 0, indirect); + nv50_ir_varying& vary = input ? info->in[idx] : info->out[idx]; + + // see load_barycentric_* handling + if (prog->getType() == Program::TYPE_FRAGMENT) { + mode = translateInterpMode(, nvirOp); + if (op == nir_intrinsic_load_interpolated_input) { +ImmediateValue immMode; +if (getSrc(>src[0], 1)->getUniqueInsn()->src(0).getImmediate(immMode)) + mode |= immMode.reg.data.u32; + } + } + + for (uint8_t i = 0u; i < insn->num_components; ++i) { + uint32_t address = getSlotAddress(insn, idx, i); + Symbol *sym = mkSymbol(input ? FILE_SHADER_INPUT : FILE_SHADER_OUTPUT, 0, dType, address); + if (prog->getType() == Program::TYPE_FRAGMENT) { +int s = 1; +if (typeSizeof(dType) == 8) { + Value *lo = getSSA(); + Value *hi = getSSA(); + Instruction *interp; + + interp = mkOp1(nvirOp, TYPE_U32, lo, sym); + if (nvirOp == OP_PINTERP) + interp->setSrc(s++, fp.position); + if (mode & NV50_IR_INTERP_OFFSET) + interp->setSrc(s++, getSrc(>src[0], 0)); + interp->setInterpolate(mode); + interp->setIndirect(0, 0, indirect); + + Symbol *sym1 = mkSymbol(input ? 
FILE_SHADER_INPUT : FILE_SHADER_OUTPUT, 0, dType, address + 4); + interp = mkOp1(nvirOp, TYPE_U32, hi, sym1); + if (nvirOp == OP_PINTERP) + interp->setSrc(s++, fp.position); + if (mode & NV50_IR_INTERP_OFFSET) + interp->setSrc(s++, getSrc(>src[0], 0)); + interp->setInterpolate(mode); + interp->setIndirect(0, 0, indirect); + + mkOp2(OP_MERGE, dType, newDefs[i], lo, hi); +} else { + Instruction *interp = mkOp1(nvirOp, dType, newDefs[i], sym); + if (nvirOp == OP_PINTERP) + interp->setSrc(s++, fp.position); + if (mode & NV50_IR_INTERP_OFFSET) + interp->setSrc(s++, getSrc(>src[0], 0)); + interp->setInterpolate(mode); + interp->setIndirect(0, 0, indirect); +} + } else { +mkLoad(dType, newDefs[i], sym, indirect)->perPatch = vary.patch; +
[Mesa-dev] [PATCH 16/34] nv50/ir/nir: implement nir_intrinsic_load_uniform
v2: use new getIndirect helper
    fixes symbols for 64 bit types
v4: use smarter getIndirect helper
    simplify address calculation
    use loadFrom helper
v8: don't require C++11 features

Signed-off-by: Karol Herbst
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index a553e42e08a..dc8dbcfb48b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1532,6 +1532,16 @@ Converter::visit(nir_intrinsic_instr *insn)
    nir_intrinsic_op op = insn->intrinsic;
 
    switch (op) {
+   case nir_intrinsic_load_uniform: {
+      LValues &newDefs = convert(&insn->dest);
+      const DataType dType = getDType(insn);
+      Value *indirect;
+      uint32_t coffset = getIndirect(insn, 0, 0, indirect);
+      for (uint8_t i = 0; i < insn->num_components; ++i) {
+         loadFrom(FILE_MEMORY_CONST, 0, dType, newDefs[i], 16 * coffset, i, indirect);
+      }
+      break;
+   }
    default:
       ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
       return false;
-- 
2.20.1
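To make the address calculation in the load_uniform case concrete: the patch passes 16 * coffset as the base and the component index as c, and loadFrom() then adds c * typeSizeof(ty). A minimal sketch of that arithmetic follows; uniform_byte_offset is a hypothetical name used only for illustration, and it covers the constant part of the address only, not the indirect register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: constant byte offset of one component of a uniform,
// following base = 16 * slot and address = base + c * typeSize as in the patch.
static uint32_t uniform_byte_offset(uint32_t slot, uint8_t component,
                                    uint32_t typeSize)
{
   return 16u * slot + component * typeSize;
}
```

For example, component 1 of a 32 bit uniform in slot 2 lands at byte 36, and component 1 of a 64 bit uniform in slot 1 lands at byte 24.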
[Mesa-dev] [PATCH 14/34] nv50/ir/nir: add skeleton for nir_intrinsic_instr
Signed-off-by: Karol Herbst
Reviewed-by: Pierre Moreau
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 3fa590a4655..a99f3bbbc05 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -122,6 +122,7 @@ private:
    bool visit(nir_function *);
    bool visit(nir_if *);
    bool visit(nir_instr *);
+   bool visit(nir_intrinsic_instr *);
    bool visit(nir_jump_instr *);
    bool visit(nir_load_const_instr*);
    bool visit(nir_loop *);
@@ -1313,6 +1314,8 @@ bool
 Converter::visit(nir_instr *insn)
 {
    switch (insn->type) {
+   case nir_instr_type_intrinsic:
+      return visit(nir_instr_as_intrinsic(insn));
    case nir_instr_type_jump:
       return visit(nir_instr_as_jump(insn));
    case nir_instr_type_load_const:
@@ -1324,6 +1327,20 @@ Converter::visit(nir_instr *insn)
    return true;
 }
 
+bool
+Converter::visit(nir_intrinsic_instr *insn)
+{
+   nir_intrinsic_op op = insn->intrinsic;
+
+   switch (op) {
+   default:
+      ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
+      return false;
+   }
+
+   return true;
+}
+
 bool
 Converter::visit(nir_jump_instr *insn)
 {
-- 
2.20.1
[Mesa-dev] [PATCH 30/34] nv50/ir/nir: add memory barriers
v5: add more barrier intrinsics

Signed-off-by: Karol Herbst
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp | 21 +++
 1 file changed, 21 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index ecdc667b25a..ad68fb4505f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -585,6 +585,16 @@ Converter::getSubOp(nir_intrinsic_op op)
    case nir_intrinsic_shared_atomic_xor:
    case nir_intrinsic_ssbo_atomic_xor:
       return NV50_IR_SUBOP_ATOM_XOR;
+
+   case nir_intrinsic_group_memory_barrier:
+   case nir_intrinsic_memory_barrier:
+   case nir_intrinsic_memory_barrier_atomic_counter:
+   case nir_intrinsic_memory_barrier_buffer:
+   case nir_intrinsic_memory_barrier_image:
+      return NV50_IR_SUBOP_MEMBAR(M, GL);
+   case nir_intrinsic_memory_barrier_shared:
+      return NV50_IR_SUBOP_MEMBAR(M, CTA);
+
    case nir_intrinsic_vote_all:
       return NV50_IR_SUBOP_VOTE_ALL;
    case nir_intrinsic_vote_any:
@@ -2400,6 +2410,17 @@ Converter::visit(nir_intrinsic_instr *insn)
       bar->subOp = NV50_IR_SUBOP_BAR_SYNC;
       break;
    }
+   case nir_intrinsic_group_memory_barrier:
+   case nir_intrinsic_memory_barrier:
+   case nir_intrinsic_memory_barrier_atomic_counter:
+   case nir_intrinsic_memory_barrier_buffer:
+   case nir_intrinsic_memory_barrier_image:
+   case nir_intrinsic_memory_barrier_shared: {
+      Instruction *bar = mkOp(OP_MEMBAR, TYPE_NONE, NULL);
+      bar->fixed = 1;
+      bar->subOp = getSubOp(op);
+      break;
+   }
    default:
       ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
       return false;
-- 
2.20.1
[Mesa-dev] [PATCH 33/34] nv50/ir/nir: handle user clip planes for each emitted vertex
v9: convert to C++ style comments Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 627848a457f..fdc6eaf759a 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -1561,7 +1561,7 @@ Converter::visit(nir_function *function) bb->cfg.attach(>cfg, Graph::Edge::TREE); setPosition(exit, true); - if (info->io.genUserClip > 0) + if (prog->getType() == Program::TYPE_VERTEX && info->io.genUserClip > 0) handleUserClipPlanes(); // TODO: for non main function this needs to be a OP_RETURN @@ -1889,6 +1889,7 @@ Converter::visit(nir_intrinsic_instr *insn) } break; } + case Program::TYPE_GEOMETRY: case Program::TYPE_VERTEX: { if (info->io.genUserClip > 0 && idx == clipVertexOutput) { mkMov(clipVtx[i], src); @@ -2187,6 +2188,9 @@ Converter::visit(nir_intrinsic_instr *insn) break; } case nir_intrinsic_emit_vertex: + if (info->io.genUserClip > 0) + handleUserClipPlanes(); + // fallthrough case nir_intrinsic_end_primitive: { uint32_t idx = nir_intrinsic_stream_id(insn); mkOp1(getOperation(op), TYPE_U32, NULL, mkImm(idx))->fixed = 1; -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 28/34] nv50/ir/nir: implement ssbo intrinsics
v4: use loadFrom helper v5: support indirect buffer access v8: don't require C++11 features Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 90 +++ 1 file changed, 90 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 11403bea674..320f90329ef 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -504,6 +504,24 @@ int Converter::getSubOp(nir_intrinsic_op op) { switch (op) { + case nir_intrinsic_ssbo_atomic_add: + return NV50_IR_SUBOP_ATOM_ADD; + case nir_intrinsic_ssbo_atomic_and: + return NV50_IR_SUBOP_ATOM_AND; + case nir_intrinsic_ssbo_atomic_comp_swap: + return NV50_IR_SUBOP_ATOM_CAS; + case nir_intrinsic_ssbo_atomic_exchange: + return NV50_IR_SUBOP_ATOM_EXCH; + case nir_intrinsic_ssbo_atomic_or: + return NV50_IR_SUBOP_ATOM_OR; + case nir_intrinsic_ssbo_atomic_imax: + case nir_intrinsic_ssbo_atomic_umax: + return NV50_IR_SUBOP_ATOM_MAX; + case nir_intrinsic_ssbo_atomic_imin: + case nir_intrinsic_ssbo_atomic_umin: + return NV50_IR_SUBOP_ATOM_MIN; + case nir_intrinsic_ssbo_atomic_xor: + return NV50_IR_SUBOP_ATOM_XOR; case nir_intrinsic_vote_all: return NV50_IR_SUBOP_VOTE_ALL; case nir_intrinsic_vote_any: @@ -2027,6 +2045,78 @@ Converter::visit(nir_intrinsic_instr *insn) } break; } + case nir_intrinsic_get_buffer_size: { + LValues = convert(>dest); + const DataType dType = getDType(insn); + Value *indirectBuffer; + uint32_t buffer = getIndirect(>src[0], 0, indirectBuffer); + + Symbol *sym = mkSymbol(FILE_MEMORY_BUFFER, buffer, dType, 0); + mkOp1(OP_BUFQ, dType, newDefs[0], sym)->setIndirect(0, 0, indirectBuffer); + break; + } + case nir_intrinsic_store_ssbo: { + DataType sType = getSType(insn->src[0], false, false); + Value *indirectBuffer; + Value *indirectOffset; + uint32_t buffer = getIndirect(>src[1], 0, indirectBuffer); + uint32_t offset = 
getIndirect(>src[2], 0, indirectOffset); + + for (uint8_t i = 0u; i < insn->num_components; ++i) { + if (!((1u << i) & nir_intrinsic_write_mask(insn))) +continue; + Symbol *sym = mkSymbol(FILE_MEMORY_BUFFER, buffer, sType, +offset + i * typeSizeof(sType)); + mkStore(OP_STORE, sType, sym, indirectOffset, getSrc(>src[0], i)) +->setIndirect(0, 1, indirectBuffer); + } + info->io.globalAccess |= 0x2; + break; + } + case nir_intrinsic_load_ssbo: { + const DataType dType = getDType(insn); + LValues = convert(>dest); + Value *indirectBuffer; + Value *indirectOffset; + uint32_t buffer = getIndirect(>src[0], 0, indirectBuffer); + uint32_t offset = getIndirect(>src[1], 0, indirectOffset); + + for (uint8_t i = 0u; i < insn->num_components; ++i) + loadFrom(FILE_MEMORY_BUFFER, buffer, dType, newDefs[i], offset, i, + indirectOffset, indirectBuffer); + + info->io.globalAccess |= 0x1; + break; + } + case nir_intrinsic_ssbo_atomic_add: + case nir_intrinsic_ssbo_atomic_and: + case nir_intrinsic_ssbo_atomic_comp_swap: + case nir_intrinsic_ssbo_atomic_exchange: + case nir_intrinsic_ssbo_atomic_or: + case nir_intrinsic_ssbo_atomic_imax: + case nir_intrinsic_ssbo_atomic_imin: + case nir_intrinsic_ssbo_atomic_umax: + case nir_intrinsic_ssbo_atomic_umin: + case nir_intrinsic_ssbo_atomic_xor: { + const DataType dType = getDType(insn); + LValues = convert(>dest); + Value *indirectBuffer; + Value *indirectOffset; + uint32_t buffer = getIndirect(>src[0], 0, indirectBuffer); + uint32_t offset = getIndirect(>src[1], 0, indirectOffset); + + Symbol *sym = mkSymbol(FILE_MEMORY_BUFFER, buffer, dType, offset); + Instruction *atom = mkOp2(OP_ATOM, dType, newDefs[0], sym, +getSrc(>src[2], 0)); + if (op == nir_intrinsic_ssbo_atomic_comp_swap) + atom->setSrc(2, getSrc(>src[3], 0)); + atom->setIndirect(0, 0, indirectOffset); + atom->setIndirect(0, 1, indirectBuffer); + atom->subOp = getSubOp(op); + + info->io.globalAccess |= 0x2; + break; + } default: ERROR("unknown nir_intrinsic_op %s\n", 
nir_intrinsic_infos[op].name); return false; -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
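A note on the store_ssbo loop in this patch: components whose bit is clear in the write mask are skipped, and each remaining component is stored at offset + i * typeSizeof(sType). The sketch below reproduces only that offset selection on the host; ssbo_store_offsets is a hypothetical helper and it ignores the indirect buffer and offset registers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: constant byte offsets that the store_ssbo loop would
// emit stores for, given the NIR write mask and the per-component type size.
static std::vector<uint32_t> ssbo_store_offsets(uint32_t offset,
                                                uint8_t numComponents,
                                                uint32_t writeMask,
                                                uint32_t typeSize)
{
   std::vector<uint32_t> out;
   for (uint8_t i = 0; i < numComponents; ++i) {
      // Same test as in the patch: skip components not in the write mask.
      if (!((1u << i) & writeMask))
         continue;
      out.push_back(offset + i * typeSize);
   }
   return out;
}
```

With offset 16, four 32 bit components and a write mask of 0x5, only components 0 and 2 are stored, at byte offsets 16 and 24.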
[Mesa-dev] [PATCH 11/34] nv50/ir/nir: add loadFrom and storeTo helper
v8: don't require C++11 features Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 72 +++ 1 file changed, 72 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index a0a36d95b41..d3cba9a63c3 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -93,6 +93,13 @@ private: bool centroid, unsigned semantics); + Instruction *loadFrom(DataFile, uint8_t, DataType, Value *def, uint32_t base, + uint8_t c, Value *indirect0 = NULL, + Value *indirect1 = NULL, bool patch = false); + void storeTo(nir_intrinsic_instr *, DataFile, operation, DataType, +Value *src, uint8_t idx, uint8_t c, Value *indirect0 = NULL, +Value *indirect1 = NULL); + bool isFloatType(nir_alu_type); bool isSignedType(nir_alu_type); bool isResultFloat(nir_op); @@ -969,6 +976,71 @@ Converter::getSlotAddress(nir_intrinsic_instr *insn, uint8_t idx, uint8_t slot) return vary[idx].slot[slot] * 4; } +Instruction * +Converter::loadFrom(DataFile file, uint8_t i, DataType ty, Value *def, +uint32_t base, uint8_t c, Value *indirect0, +Value *indirect1, bool patch) +{ + unsigned int tySize = typeSizeof(ty); + + if (tySize == 8 && + (file == FILE_MEMORY_CONST || file == FILE_MEMORY_BUFFER || indirect0)) { + Value *lo = getSSA(); + Value *hi = getSSA(); + + Instruction *loi = + mkLoad(TYPE_U32, lo, +mkSymbol(file, i, TYPE_U32, base + c * tySize), +indirect0); + loi->setIndirect(0, 1, indirect1); + loi->perPatch = patch; + + Instruction *hii = + mkLoad(TYPE_U32, hi, +mkSymbol(file, i, TYPE_U32, base + c * tySize + 4), +indirect0); + hii->setIndirect(0, 1, indirect1); + hii->perPatch = patch; + + return mkOp2(OP_MERGE, ty, def, lo, hi); + } else { + Instruction *ld = + mkLoad(ty, def, mkSymbol(file, i, ty, base + c * tySize), indirect0); + ld->setIndirect(0, 1, indirect1); + ld->perPatch = patch; + return ld; + } +} + +void 
+Converter::storeTo(nir_intrinsic_instr *insn, DataFile file, operation op, + DataType ty, Value *src, uint8_t idx, uint8_t c, + Value *indirect0, Value *indirect1) +{ + uint8_t size = typeSizeof(ty); + uint32_t address = getSlotAddress(insn, idx, c); + + if (size == 8 && indirect0) { + Value *split[2]; + mkSplit(split, 4, src); + + if (op == OP_EXPORT) { + split[0] = mkMov(getSSA(), split[0], ty)->getDef(0); + split[1] = mkMov(getSSA(), split[1], ty)->getDef(0); + } + + mkStore(op, TYPE_U32, mkSymbol(file, 0, TYPE_U32, address), indirect0, + split[0])->perPatch = info->out[idx].patch; + mkStore(op, TYPE_U32, mkSymbol(file, 0, TYPE_U32, address + 4), indirect0, + split[1])->perPatch = info->out[idx].patch; + } else { + if (op == OP_EXPORT) + src = mkMov(getSSA(size), src, ty)->getDef(0); + mkStore(op, ty, mkSymbol(file, 0, ty, address), indirect0, + src)->perPatch = info->out[idx].patch; + } +} + bool Converter::run() { -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
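Archive note on the 64-bit path of loadFrom() above: the helper issues two 32-bit loads, at base + c * tySize and 4 bytes later, and joins them with OP_MERGE. A minimal standalone sketch of that address arithmetic follows. It is not the nv50 IR API: ByteFile, load32 and load64Split are hypothetical names, and it assumes the first merge source ends up in the low half of the 64-bit result.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <vector>

// Hypothetical stand-in for a memory file: a flat byte buffer.
typedef std::vector<uint8_t> ByteFile;

// Read one 32-bit word, least significant byte first.
uint32_t load32(const ByteFile &file, uint32_t offset)
{
   assert(offset + 4 <= file.size());
   return (uint32_t)file[offset] |
          ((uint32_t)file[offset + 1] << 8) |
          ((uint32_t)file[offset + 2] << 16) |
          ((uint32_t)file[offset + 3] << 24);
}

// Same address arithmetic as the tySize == 8 branch of loadFrom():
// low word at base + c * 8, high word 4 bytes later, then merge.
// Assumption: the low word occupies the low half of the result.
uint64_t load64Split(const ByteFile &file, uint32_t base, uint8_t c)
{
   const uint32_t tySize = 8;
   uint32_t lo = load32(file, base + c * tySize);
   uint32_t hi = load32(file, base + c * tySize + 4);
   return ((uint64_t)hi << 32) | lo;
}
```

Calling load64Split(file, base, c) for component c reads the same two words the patch loads through mkLoad() before merging them.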
[Mesa-dev] [PATCH 15/34] nv50/ir/nir: implement nir_alu_instr handling
v2: use bitfield_insert instead of bfi rework switch helper macros remove some lowering code (LoweringHelper is now used for this) v3: add pack_half_2x16_split add unpack_half_2x16_split_x/y v5: replace first argument with nullptr in loadImm calls prefer getSSA over getScratch v8: fix setting precise modifier for first instruction inside a block add guard in case no instruction gets inserted into an empty block don't require C++11 features v9: use CC_NE for integer compares convert to C++ style comments fix b2f for doubles remove macros around nir ops to make it easier to grep them add handling for fpow Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 562 +- 1 file changed, 561 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index a99f3bbbc05..a553e42e08a 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -114,9 +114,17 @@ private: std::vector<DataType> getSTypes(nir_alu_instr *); DataType getSType(nir_src &, bool isFloat, bool isSigned); + operation getOperation(nir_op); + operation preOperationNeeded(nir_op); + + int getSubOp(nir_op); + + CondCode getCondCode(nir_op); + bool assignSlots(); bool parseNIR(); + bool visit(nir_alu_instr *); bool visit(nir_block *); bool visit(nir_cf_node *); bool visit(nir_function *); @@ -135,6 +143,7 @@ private: unsigned int curLoopDepth; BasicBlock *exit; + Value *zero; union { struct { @@ -146,7 +155,10 @@ private: Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info) : ConverterCommon(prog, info), nir(nir), - curLoopDepth(0) {} + curLoopDepth(0) +{ + zero = mkImm((uint32_t)0); +} BasicBlock * Converter::convert(nir_block *block) @@ -275,6 +287,191 @@ Converter::getSType(nir_src &src, bool isFloat, bool isSigned) return ty; } +operation +Converter::getOperation(nir_op op) +{ + switch (op) { + // basic
ops with float and int variants + case nir_op_fabs: + case nir_op_iabs: + return OP_ABS; + case nir_op_fadd: + case nir_op_iadd: + return OP_ADD; + case nir_op_fand: + case nir_op_iand: + return OP_AND; + case nir_op_ifind_msb: + case nir_op_ufind_msb: + return OP_BFIND; + case nir_op_fceil: + return OP_CEIL; + case nir_op_fcos: + return OP_COS; + case nir_op_f2f32: + case nir_op_f2f64: + case nir_op_f2i32: + case nir_op_f2i64: + case nir_op_f2u32: + case nir_op_f2u64: + case nir_op_i2f32: + case nir_op_i2f64: + case nir_op_i2i32: + case nir_op_i2i64: + case nir_op_u2f32: + case nir_op_u2f64: + case nir_op_u2u32: + case nir_op_u2u64: + return OP_CVT; + case nir_op_fddx: + case nir_op_fddx_coarse: + case nir_op_fddx_fine: + return OP_DFDX; + case nir_op_fddy: + case nir_op_fddy_coarse: + case nir_op_fddy_fine: + return OP_DFDY; + case nir_op_fdiv: + case nir_op_idiv: + case nir_op_udiv: + return OP_DIV; + case nir_op_fexp2: + return OP_EX2; + case nir_op_ffloor: + return OP_FLOOR; + case nir_op_ffma: + return OP_FMA; + case nir_op_flog2: + return OP_LG2; + case nir_op_fmax: + case nir_op_imax: + case nir_op_umax: + return OP_MAX; + case nir_op_pack_64_2x32_split: + return OP_MERGE; + case nir_op_fmin: + case nir_op_imin: + case nir_op_umin: + return OP_MIN; + case nir_op_fmod: + case nir_op_imod: + case nir_op_umod: + case nir_op_frem: + case nir_op_irem: + return OP_MOD; + case nir_op_fmul: + case nir_op_imul: + case nir_op_imul_high: + case nir_op_umul_high: + return OP_MUL; + case nir_op_fneg: + case nir_op_ineg: + return OP_NEG; + case nir_op_fnot: + case nir_op_inot: + return OP_NOT; + case nir_op_for: + case nir_op_ior: + return OP_OR; + case nir_op_fpow: + return OP_POW; + case nir_op_frcp: + return OP_RCP; + case nir_op_frsq: + return OP_RSQ; + case nir_op_fsat: + return OP_SAT; + case nir_op_feq32: + case nir_op_ieq32: + case nir_op_fge32: + case nir_op_ige32: + case nir_op_uge32: + case nir_op_flt32: + case nir_op_ilt32: + case nir_op_ult32: + case 
nir_op_fne32: + case nir_op_ine32: + return OP_SET; + case nir_op_ishl: + return OP_SHL; + case nir_op_ishr: + case nir_op_ushr: + return OP_SHR; + case nir_op_fsin: + return OP_SIN; + case nir_op_fsqrt: + return OP_SQRT; + case nir_op_fsub: + case nir_op_isub: + return OP_SUB; + case nir_op_ftrunc: + return OP_TRUNC; + case nir_op_fxor: + case nir_op_ixor: + return OP_XOR; + default: + ERROR("couldn't get operation for op %s\n", nir_op_infos[op].name); + assert(false); + return
[Mesa-dev] [PATCH 10/34] nv50/ir/nir: run assignSlots
v2: add support for geometry shaders set idx add some missing mappings fix for 64bit inputs/outputs fix up some FP color output index mess-up parse centroid flag v3: fix arrays in outputs as well fix input/output size calculation for tessellation shaders v4: add getSlotAddress helper fix for 64 bit typed inputs v5: change getSlotAddress interface for easier use fix sample inputs fix slot counting for mat v7: fix driver_location of images v8: don't require C++11 features v9: convert to C++ style comments support VERT_ATTRIB_POINT_SIZE add more error checking to slots Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 642 ++ 1 file changed, 642 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 2ac6d8c1d07..a0a36d95b41 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -86,6 +86,13 @@ private: uint32_t getIndirect(nir_src *, uint8_t, Value *&); uint32_t getIndirect(nir_intrinsic_instr *, uint8_t s, uint8_t c, Value *&); + uint32_t getSlotAddress(nir_intrinsic_instr *, uint8_t idx, uint8_t slot); + + void setInterpolate(nv50_ir_varying *, + uint8_t, + bool centroid, + unsigned semantics); + bool isFloatType(nir_alu_type); bool isSignedType(nir_alu_type); bool isResultFloat(nir_op); @@ -98,6 +105,8 @@ private: std::vector<DataType> getSTypes(nir_alu_instr *); DataType getSType(nir_src &, bool isFloat, bool isSigned); + bool assignSlots(); + nir_shader *nir; NirDefMap ssaDefs; @@ -332,6 +341,634 @@ Converter::getIndirect(nir_intrinsic_instr *insn, uint8_t s, uint8_t c, Value *& return idx; } +static void +vert_attrib_to_tgsi_semantic(gl_vert_attrib slot, unsigned *name, unsigned *index) +{ + assert(name && index); + + if (slot >= VERT_ATTRIB_MAX) { + ERROR("invalid varying slot %u\n", slot); + assert(false); + return; + } + + if (slot >= VERT_ATTRIB_GENERIC0 && + slot <
VERT_ATTRIB_GENERIC0 + VERT_ATTRIB_GENERIC_MAX) { + *name = TGSI_SEMANTIC_GENERIC; + *index = slot - VERT_ATTRIB_GENERIC0; + return; + } + + if (slot >= VERT_ATTRIB_TEX0 && + slot < VERT_ATTRIB_TEX0 + VERT_ATTRIB_TEX_MAX) { + *name = TGSI_SEMANTIC_TEXCOORD; + *index = slot - VERT_ATTRIB_TEX0; + return; + } + + switch (slot) { + case VERT_ATTRIB_COLOR0: + *name = TGSI_SEMANTIC_COLOR; + *index = 0; + break; + case VERT_ATTRIB_COLOR1: + *name = TGSI_SEMANTIC_COLOR; + *index = 1; + break; + case VERT_ATTRIB_EDGEFLAG: + *name = TGSI_SEMANTIC_EDGEFLAG; + *index = 0; + break; + case VERT_ATTRIB_FOG: + *name = TGSI_SEMANTIC_FOG; + *index = 0; + break; + case VERT_ATTRIB_NORMAL: + *name = TGSI_SEMANTIC_NORMAL; + *index = 0; + break; + case VERT_ATTRIB_POS: + *name = TGSI_SEMANTIC_POSITION; + *index = 0; + break; + case VERT_ATTRIB_POINT_SIZE: + *name = TGSI_SEMANTIC_PSIZE; + *index = 0; + break; + default: + ERROR("unknown vert attrib slot %u\n", slot); + assert(false); + break; + } +} + +static void +varying_slot_to_tgsi_semantic(gl_varying_slot slot, unsigned *name, unsigned *index) +{ + assert(name && index); + + if (slot >= VARYING_SLOT_TESS_MAX) { + ERROR("invalid varying slot %u\n", slot); + assert(false); + return; + } + + if (slot >= VARYING_SLOT_PATCH0) { + *name = TGSI_SEMANTIC_PATCH; + *index = slot - VARYING_SLOT_PATCH0; + return; + } + + if (slot >= VARYING_SLOT_VAR0) { + *name = TGSI_SEMANTIC_GENERIC; + *index = slot - VARYING_SLOT_VAR0; + return; + } + + if (slot >= VARYING_SLOT_TEX0 && slot <= VARYING_SLOT_TEX7) { + *name = TGSI_SEMANTIC_TEXCOORD; + *index = slot - VARYING_SLOT_TEX0; + return; + } + + switch (slot) { + case VARYING_SLOT_BFC0: + *name = TGSI_SEMANTIC_BCOLOR; + *index = 0; + break; + case VARYING_SLOT_BFC1: + *name = TGSI_SEMANTIC_BCOLOR; + *index = 1; + break; + case VARYING_SLOT_CLIP_DIST0: + *name = TGSI_SEMANTIC_CLIPDIST; + *index = 0; + break; + case VARYING_SLOT_CLIP_DIST1: + *name = TGSI_SEMANTIC_CLIPDIST; + *index = 1; + break; + case 
VARYING_SLOT_CLIP_VERTEX: + *name = TGSI_SEMANTIC_CLIPVERTEX; + *index = 0; + break; + case VARYING_SLOT_COL0: + *name = TGSI_SEMANTIC_COLOR; + *index = 0; + break; + case VARYING_SLOT_COL1: + *name = TGSI_SEMANTIC_COLOR; + *index = 1; + break; + case VARYING_SLOT_EDGE: + *name = TGSI_SEMANTIC_EDGEFLAG; + *index = 0; + break; + case VARYING_SLOT_FACE: + *name =
[Mesa-dev] [PATCH 27/34] nv50/ir/nir: implement nir_intrinsic_load_ubo
v4: use loadFrom helper
v8: don't require C++11 features

Signed-off-by: Karol Herbst
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 991c1283a0f..11403bea674 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -2013,6 +2013,20 @@ Converter::visit(nir_intrinsic_instr *insn)
       mkOp1(getOperation(op), TYPE_U32, NULL, mkImm(idx))->fixed = 1;
       break;
    }
+   case nir_intrinsic_load_ubo: {
+      const DataType dType = getDType(insn);
+      LValues &newDefs = convert(&insn->dest);
+      Value *indirectIndex;
+      Value *indirectOffset;
+      uint32_t index = getIndirect(&insn->src[0], 0, indirectIndex) + 1;
+      uint32_t offset = getIndirect(&insn->src[1], 0, indirectOffset);
+
+      for (uint8_t i = 0u; i < insn->num_components; ++i) {
+         loadFrom(FILE_MEMORY_CONST, index, dType, newDefs[i], offset, i,
+                  indirectOffset, indirectIndex);
+      }
+      break;
+   }
    default:
       ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
       return false;
--
2.20.1
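Archive note on the load_ubo case above: it leans on the getIndirect() contract from patch 08/34. When the source is a compile-time constant, that constant is returned and the indirect Value is set to NULL. Otherwise 0 is returned and the indirect Value carries the run-time part. The sketch below restates the contract with hypothetical stand-in types (Value, Src, getIndirectSketch and uboFileIndex are illustrative names, not the nv50 IR or NIR API). The + 1 on the buffer index is copied from the patch without assuming its rationale.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

struct Value { int id; };   // hypothetical stand-in for nv50_ir::Value

// Hypothetical stand-in for a nir_src: either a known constant or a
// value that only exists at run time.
struct Src {
   bool isConst;
   uint32_t constant;
   Value *dynamic;
};

// Return the constant part of src. The run-time part, if any, is handed
// back through the reference parameter; NULL means "fully constant".
uint32_t getIndirectSketch(const Src &src, Value *&indirect)
{
   if (src.isConst) {
      indirect = NULL;
      return src.constant;
   }
   indirect = src.dynamic;
   return 0;
}

// load_ubo-style use: the constant part of the buffer index is biased by
// one, as in the patch.
uint32_t uboFileIndex(const Src &indexSrc, Value *&indirectIndex)
{
   return getIndirectSketch(indexSrc, indirectIndex) + 1;
}
```

With a constant index of 3 the file index becomes 4 and no indirect is needed. With a dynamic index the file index is 1 and the indirect Value supplies the rest.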
[Mesa-dev] [PATCH 31/34] nv50/ir/nir: implement load_per_vertex_output
v4: use smarter getIndirect helper
    use new getSlotAddress helper
v5: use loadFrom helper
v8: don't require C++11 features

Signed-off-by: Karol Herbst
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index ad68fb4505f..c379eb72c1e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -2163,6 +2163,29 @@ Converter::visit(nir_intrinsic_instr *insn)
       }
       break;
    }
+   case nir_intrinsic_load_per_vertex_output: {
+      const DataType dType = getDType(insn);
+      LValues &newDefs = convert(&insn->dest);
+      Value *indirectVertex;
+      Value *indirectOffset;
+      uint32_t baseVertex = getIndirect(&insn->src[0], 0, indirectVertex);
+      uint32_t idx = getIndirect(insn, 1, 0, indirectOffset);
+      Value *vtxBase = NULL;
+
+      if (indirectVertex)
+         vtxBase = indirectVertex;
+      else
+         vtxBase = loadImm(NULL, baseVertex);
+
+      vtxBase = mkOp2v(OP_ADD, TYPE_U32, getSSA(4, FILE_ADDRESS), outBase, vtxBase);
+
+      for (uint8_t i = 0u; i < insn->num_components; ++i) {
+         uint32_t address = getSlotAddress(insn, idx, i);
+         loadFrom(FILE_SHADER_OUTPUT, 0, dType, newDefs[i], address, 0,
+                  indirectOffset, vtxBase, info->in[idx].patch);
+      }
+      break;
+   }
    case nir_intrinsic_emit_vertex:
    case nir_intrinsic_end_primitive: {
       uint32_t idx = nir_intrinsic_stream_id(insn);
--
2.20.1
[Mesa-dev] [PATCH 02/34] nvc0: print the shader type when dumping headers
this makes debugging the shader header a little easier

Acked-by: Pierre Moreau
Signed-off-by: Karol Herbst
---
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
index 1bbfa4a9428..008b660b8c0 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
@@ -554,6 +554,7 @@ nvc0_program_dump(struct nvc0_program *prog)
    unsigned pos;

    if (prog->type != PIPE_SHADER_COMPUTE) {
+      debug_printf("dumping HDR for type %i\n", prog->type);
       for (pos = 0; pos < ARRAY_SIZE(prog->hdr); ++pos)
          debug_printf("HDR[%02"PRIxPTR"] = 0x%08x\n",
                       pos * sizeof(prog->hdr[0]), prog->hdr[pos]);
--
2.20.1
[Mesa-dev] [PATCH 06/34] nouveau: fix nir and TGSI shader cache collision
v9: rename variable to driver_flags
    use constants for shader cache flags

Signed-off-by: Karol Herbst
Reviewed-by: Pierre Moreau
---
 src/gallium/drivers/nouveau/nouveau_screen.c | 8 +++-
 src/gallium/drivers/nouveau/nouveau_screen.h | 3 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_screen.c b/src/gallium/drivers/nouveau/nouveau_screen.c
index 98b44b7df0b..cbd45a1dc35 100644
--- a/src/gallium/drivers/nouveau/nouveau_screen.c
+++ b/src/gallium/drivers/nouveau/nouveau_screen.c
@@ -151,6 +151,7 @@ nouveau_disk_cache_create(struct nouveau_screen *screen)
    struct mesa_sha1 ctx;
    unsigned char sha1[20];
    char cache_id[20 * 2 + 1];
+   uint64_t driver_flags = 0;

    _mesa_sha1_init(&ctx);
    if (!disk_cache_get_function_identifier(nouveau_disk_cache_create,
@@ -160,9 +161,14 @@ nouveau_disk_cache_create(struct nouveau_screen *screen)
    _mesa_sha1_final(&ctx, sha1);
    disk_cache_format_hex_id(cache_id, sha1, 20 * 2);

+   if (screen->prefer_nir)
+      driver_flags |= NOUVEAU_SHADER_CACHE_FLAGS_IR_NIR;
+   else
+      driver_flags |= NOUVEAU_SHADER_CACHE_FLAGS_IR_TGSI;
+
    screen->disk_shader_cache =
       disk_cache_create(nouveau_screen_get_name(&screen->base),
-                        cache_id, 0);
+                        cache_id, driver_flags);
 }

 int
diff --git a/src/gallium/drivers/nouveau/nouveau_screen.h b/src/gallium/drivers/nouveau/nouveau_screen.h
index 4598d6a60e3..1302c608bec 100644
--- a/src/gallium/drivers/nouveau/nouveau_screen.h
+++ b/src/gallium/drivers/nouveau/nouveau_screen.h
@@ -17,6 +17,9 @@ extern int nouveau_mesa_debug;

 struct nouveau_bo;

+#define NOUVEAU_SHADER_CACHE_FLAGS_IR_TGSI 0 << 0
+#define NOUVEAU_SHADER_CACHE_FLAGS_IR_NIR 1 << 0
+
 struct nouveau_screen {
    struct pipe_screen base;
    struct nouveau_drm *drm;
--
2.20.1
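Archive note on the two flag macros above: they expand to unparenthesized shift expressions. That is harmless in the `driver_flags |= ...` statements of this patch, but the expansion can regroup once a flag is combined with a higher-precedence operator. The sketch below shows the effect with hypothetical macro names (SKETCH_FLAG_AS_POSTED and SKETCH_FLAG_PARENTHESIZED are not part of the patch). It is offered as a possible hardening, not as a bug report against the code as posted.

```cpp
#include <cassert>
#include <stdint.h>

// Same shape as the macros in the patch (hypothetical name).
#define SKETCH_FLAG_AS_POSTED 1 << 0
// Possible hardened form (hypothetical name).
#define SKETCH_FLAG_PARENTHESIZED (1 << 0)

// Expands to 1 << 0 * 4, which groups as 1 << (0 * 4).
unsigned scaledAsPosted()
{
   return SKETCH_FLAG_AS_POSTED * 4;
}

// Expands to (1 << 0) * 4.
unsigned scaledParenthesized()
{
   return SKETCH_FLAG_PARENTHESIZED * 4;
}

// The usage pattern from the patch: OR-ing the flag into a flags word
// behaves the same with either macro form.
uint64_t orIntoFlags(uint64_t flags)
{
   flags |= SKETCH_FLAG_AS_POSTED;
   return flags;
}
```

The two scaled helpers return different values even though both macros denote the same flag, which is the usual argument for wrapping macro bodies in parentheses.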
[Mesa-dev] [PATCH 07/34] nv50/ir/nir: run some passes to make the conversion easier
v2: add constant_folding v6: print non final NIR only for verbose debugging v8: add passes we will need for OpenCL compute shaders v9: move type_size into anonymous namespace convert to C++ style comments lower bools to int32 Signed-off-by: Karol Herbst Acked-by: Pierre Moreau --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 42 +++ 1 file changed, 42 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index b22c62fd434..9d2d16b97f5 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -35,6 +35,12 @@ namespace { using namespace nv50_ir; +int +type_size(const struct glsl_type *type) +{ + return glsl_count_attribute_slots(type, false); +} + class Converter : public ConverterCommon { public: @@ -52,6 +58,42 @@ Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info) bool Converter::run() { + bool progress; + + if (prog->dbgFlags & NV50_IR_DEBUG_VERBOSE) + nir_print_shader(nir, stderr); + + NIR_PASS_V(nir, nir_lower_io, nir_var_all, type_size, (nir_lower_io_options)0); + NIR_PASS_V(nir, nir_lower_regs_to_ssa); + NIR_PASS_V(nir, nir_lower_load_const_to_scalar); + NIR_PASS_V(nir, nir_lower_vars_to_ssa); + NIR_PASS_V(nir, nir_lower_alu_to_scalar); + NIR_PASS_V(nir, nir_lower_phis_to_scalar); + + do { + progress = false; + NIR_PASS(progress, nir, nir_copy_prop); + NIR_PASS(progress, nir, nir_opt_remove_phis); + NIR_PASS(progress, nir, nir_opt_trivial_continues); + NIR_PASS(progress, nir, nir_opt_cse); + NIR_PASS(progress, nir, nir_opt_algebraic); + NIR_PASS(progress, nir, nir_opt_constant_folding); + NIR_PASS(progress, nir, nir_copy_prop); + NIR_PASS(progress, nir, nir_opt_dce); + NIR_PASS(progress, nir, nir_opt_dead_cf); + } while (progress); + + NIR_PASS_V(nir, nir_lower_bool_to_int32); + NIR_PASS_V(nir, nir_lower_locals_to_regs); + NIR_PASS_V(nir, nir_remove_dead_variables, 
nir_var_function_temp); + NIR_PASS_V(nir, nir_convert_from_ssa, true); + + // Garbage collect dead instructions + nir_sweep(nir); + + if (prog->dbgFlags & NV50_IR_DEBUG_BASIC) + nir_print_shader(nir, stderr); + return false; } -- 2.20.1
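Archive note on the do/while loop in Converter::run() above: it is a fixed-point iteration. Every pass reports whether it changed the shader, and the whole list is rerun until one full round reports no progress. The toy sketch below shows the same control flow outside of NIR. Program, Pass, foldFirstPair, removeZeros and runToFixedPoint are hypothetical names, and the two passes are simple stand-ins rather than real optimizations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> Program;   // hypothetical toy "shader"
typedef bool (*Pass)(Program &);    // a pass reports whether it changed anything

// Toy pass: merge the first adjacent pair of equal values into their sum.
bool foldFirstPair(Program &p)
{
   for (size_t i = 0; i + 1 < p.size(); ++i) {
      if (p[i] == p[i + 1]) {
         p[i] += p[i + 1];
         p.erase(p.begin() + i + 1);
         return true;
      }
   }
   return false;
}

// Toy pass: drop every zero, standing in for dead-code elimination.
bool removeZeros(Program &p)
{
   size_t before = p.size();
   Program kept;
   for (size_t i = 0; i < p.size(); ++i)
      if (p[i] != 0)
         kept.push_back(p[i]);
   p.swap(kept);
   return p.size() != before;
}

// Run all passes repeatedly until a full round makes no change.
// Returns the number of rounds, including the final no-progress round.
unsigned runToFixedPoint(Program &p, const Pass *passes, size_t count)
{
   unsigned rounds = 0;
   bool progress;
   do {
      progress = false;
      for (size_t i = 0; i < count; ++i)
         progress |= passes[i](p);
      ++rounds;
   } while (progress);
   return rounds;
}
```

The loop terminates here because every change shrinks the program. The real pass list relies on each NIR pass eventually reporting no progress.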
[Mesa-dev] [PATCH 08/34] nv50/ir/nir: track defs and provide easy access functions
v2: add helper function for indirects v4: add new getIndirect overload for easier use v5: use getSSA for ssa values we can just create the values for unassigned registers in getSrc v6: always create at least 32 bit values v8: don't require C++11 features v9: include unordered_map on supported stdlibs replace '(*it).' with 'it->' Signed-off-by: Karol Herbst --- .../nouveau/codegen/nv50_ir_from_nir.cpp | 150 ++ 1 file changed, 150 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 9d2d16b97f5..f7908876e96 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp @@ -31,8 +31,23 @@ #include "codegen/nv50_ir_lowering_helper.h" #include "codegen/nv50_ir_util.h" +#if __cplusplus >= 201103L +#include <unordered_map> +#else +#include <tr1/unordered_map> +#endif +#include <vector> + namespace { +#if __cplusplus >= 201103L +using std::hash; +using std::unordered_map; +#else +using std::tr1::hash; +using std::tr1::unordered_map; +#endif + using namespace nv50_ir; int @@ -48,13 +63,148 @@ public: bool run(); private: + typedef std::vector<LValue *> LValues; + typedef unordered_map<unsigned, LValues> NirDefMap; + + LValues& convert(nir_alu_dest *); + LValues& convert(nir_dest *); + LValues& convert(nir_register *); + LValues& convert(nir_ssa_def *); + + Value* getSrc(nir_alu_src *, uint8_t component = 0); + Value* getSrc(nir_register *, uint8_t); + Value* getSrc(nir_src *, uint8_t, bool indirect = false); + Value* getSrc(nir_ssa_def *, uint8_t); + + // returned value is the constant part of the given source (either the + // nir_src or the selected source component of an intrinsic). Even though + // this is mostly an optimization to be able to skip indirects in a few + // cases, sometimes we require immediate values or set some fields on + // instructions (e.g. tex) in order for codegen to consume those.
+ // If the found value has no constant part, the Value gets returned + // through the Value parameter. + uint32_t getIndirect(nir_src *, uint8_t, Value *&); + uint32_t getIndirect(nir_intrinsic_instr *, uint8_t s, uint8_t c, Value *&); + nir_shader *nir; + + NirDefMap ssaDefs; + NirDefMap regDefs; }; Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info) : ConverterCommon(prog, info), nir(nir) {} +Converter::LValues& +Converter::convert(nir_dest *dest) +{ + if (dest->is_ssa) + return convert(&dest->ssa); + if (dest->reg.indirect) { + ERROR("no support for indirects."); + assert(false); + } + return convert(dest->reg.reg); +} + +Converter::LValues& +Converter::convert(nir_register *reg) +{ + NirDefMap::iterator it = regDefs.find(reg->index); + if (it != regDefs.end()) + return it->second; + + LValues newDef(reg->num_components); + for (uint8_t i = 0; i < reg->num_components; i++) + newDef[i] = getScratch(std::max(4, reg->bit_size / 8)); + return regDefs[reg->index] = newDef; +} + +Converter::LValues& +Converter::convert(nir_ssa_def *def) +{ + NirDefMap::iterator it = ssaDefs.find(def->index); + if (it != ssaDefs.end()) + return it->second; + + LValues newDef(def->num_components); + for (uint8_t i = 0; i < def->num_components; i++) + newDef[i] = getSSA(std::max(4, def->bit_size / 8)); + return ssaDefs[def->index] = newDef; +} + +Value* +Converter::getSrc(nir_alu_src *src, uint8_t component) +{ + if (src->abs || src->negate) { + ERROR("modifiers currently not supported on nir_alu_src\n"); + assert(false); + } + return getSrc(&src->src, src->swizzle[component]); +} + +Value* +Converter::getSrc(nir_register *reg, uint8_t idx) +{ + NirDefMap::iterator it = regDefs.find(reg->index); + if (it == regDefs.end()) + return convert(reg)[idx]; + return it->second[idx]; +} + +Value* +Converter::getSrc(nir_src *src, uint8_t idx, bool indirect) +{ + if (src->is_ssa) + return getSrc(src->ssa, idx); + + if (src->reg.indirect) { + if (indirect) + return
getSrc(src->reg.indirect, idx); + ERROR("no support for indirects."); + assert(false); + return NULL; + } + + return getSrc(src->reg.reg, idx); +} + +Value* +Converter::getSrc(nir_ssa_def *src, uint8_t idx) +{ + NirDefMap::iterator it = ssaDefs.find(src->index); + if (it == ssaDefs.end()) { + ERROR("SSA value %u not found\n", src->index); + assert(false); + return NULL; + } + return it->second[idx]; +} + +uint32_t +Converter::getIndirect(nir_src *src, uint8_t idx, Value *&indirect) +{ + nir_const_value *offset = nir_src_as_const_value(*src); + + if (offset) { + indirect = NULL; + return offset->u32[0]; + } + + indirect = getSrc(src, idx, true); + return 0; +} + +uint32_t +Converter::getIndirect(nir_intrinsic_instr *insn, uint8_t s, uint8_t c, Value *&indirect) +{ + int32_t idx = nir_intrinsic_base(insn) +
[Mesa-dev] [PATCH 05/34] nouveau: add support for nir
not all those nir options are actually required, it just made the work a little easier. v2: fix asserts parse compute shaders don't lower bitfield_insert v3: fix memory leak v4: don't lower fmod32 v5: set lower_all_io_to_temps to false fix memory leak because we take over ownership of the nir shader merge: use the lowering helper v6: include TGSI debug header for proper assert call add nv50 support v7: fix Automake build v8: free shader only for the set shader type v9: check for IR type inside get_compiler_options squash "nouveau: add env var to make nir default" fix memory leak when creating compute shaders use debug_get_bool_option as it is available in non debug builds return failure if unsupported IR is encountered don't lower fpow in nir Signed-off-by: Karol Herbst Reviewed-by: Pierre Moreau --- src/gallium/drivers/nouveau/Automake.inc | 3 + src/gallium/drivers/nouveau/Makefile.am | 5 + src/gallium/drivers/nouveau/Makefile.sources | 1 + .../drivers/nouveau/codegen/nv50_ir.cpp | 3 + src/gallium/drivers/nouveau/codegen/nv50_ir.h | 1 + .../nouveau/codegen/nv50_ir_from_nir.cpp | 76 +++ src/gallium/drivers/nouveau/meson.build | 9 +- src/gallium/drivers/nouveau/nouveau_screen.c | 2 + src/gallium/drivers/nouveau/nouveau_screen.h | 2 + .../drivers/nouveau/nv50/nv50_program.c | 19 +++- .../drivers/nouveau/nv50/nv50_screen.c| 46 - src/gallium/drivers/nouveau/nv50/nv50_state.c | 35 ++- .../drivers/nouveau/nvc0/nvc0_program.c | 18 +++- .../drivers/nouveau/nvc0/nvc0_screen.c| 94 +-- src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 31 +- 15 files changed, 323 insertions(+), 22 deletions(-) create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp diff --git a/src/gallium/drivers/nouveau/Automake.inc b/src/gallium/drivers/nouveau/Automake.inc index 1d383fcb7b1..657790494dc 100644 --- a/src/gallium/drivers/nouveau/Automake.inc +++ b/src/gallium/drivers/nouveau/Automake.inc @@ -8,4 +8,7 @@ TARGET_LIB_DEPS += \ $(NOUVEAU_LIBS) \ $(LIBDRM_LIBS) 
+TARGET_COMPILER_LIB_DEPS = \ + $(top_builddir)/src/compiler/nir/libnir.la + endif diff --git a/src/gallium/drivers/nouveau/Makefile.am b/src/gallium/drivers/nouveau/Makefile.am index 48c0fdf512d..ee7191675cc 100644 --- a/src/gallium/drivers/nouveau/Makefile.am +++ b/src/gallium/drivers/nouveau/Makefile.am @@ -25,6 +25,10 @@ include $(top_srcdir)/src/gallium/Automake.inc AM_CPPFLAGS = \ -I$(top_srcdir)/include \ + -I$(top_builddir)/src/compiler/nir \ + -I$(top_srcdir)/src/compiler/nir \ + -I$(top_srcdir)/src/mapi \ + -I$(top_srcdir)/src/mesa \ $(GALLIUM_DRIVER_CFLAGS) \ $(LIBDRM_CFLAGS) \ $(NOUVEAU_CFLAGS) @@ -47,6 +51,7 @@ nouveau_compiler_SOURCES = \ nouveau_compiler_LDADD = \ libnouveau.la \ + $(top_builddir)/src/compiler/nir/libnir.la \ $(top_builddir)/src/gallium/auxiliary/libgallium.la \ $(top_builddir)/src/util/libmesautil.la \ $(GALLIUM_COMMON_LIB_DEPS) diff --git a/src/gallium/drivers/nouveau/Makefile.sources b/src/gallium/drivers/nouveau/Makefile.sources index ec344c63169..c6a1aff7110 100644 --- a/src/gallium/drivers/nouveau/Makefile.sources +++ b/src/gallium/drivers/nouveau/Makefile.sources @@ -117,6 +117,7 @@ NV50_CODEGEN_SOURCES := \ codegen/nv50_ir_emit_nv50.cpp \ codegen/nv50_ir_from_common.cpp \ codegen/nv50_ir_from_common.h \ + codegen/nv50_ir_from_nir.cpp \ codegen/nv50_ir_from_tgsi.cpp \ codegen/nv50_ir_graph.cpp \ codegen/nv50_ir_graph.h \ diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp index 993d01c1e44..a181a13a3b1 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp @@ -1241,6 +1241,9 @@ nv50_ir_generate_code(struct nv50_ir_prog_info *info) prog->optLevel = info->optLevel; switch (info->bin.sourceRep) { + case PIPE_SHADER_IR_NIR: + ret = prog->makeFromNIR(info) ? 0 : -2; + break; case PIPE_SHADER_IR_TGSI: ret = prog->makeFromTGSI(info) ? 
0 : -2; break; diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h b/src/gallium/drivers/nouveau/codegen/nv50_ir.h index 8d32a25ec23..b19751ab372 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h @@ -1284,6 +1284,7 @@ public: inline void del(Function *fn, int& id) { allFuncs.remove(id); } inline void add(Value *rval, int& id) { allRValues.insert(rval, id); } + bool makeFromNIR(struct nv50_ir_prog_info *); bool makeFromTGSI(struct nv50_ir_prog_info *); bool convertToSSA(); bool optimizeSSA(int level); diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
[Mesa-dev] [PATCH 04/34] nv50/ir: add lowering helper
If we start supporting multiple input IRs, we might want to move lowering code into a common place and keep the initial translation simpler. This will also allow us to react to ISA changes more easily. v5: also handle SAT v6: rename type variables fixed lowering of NEG add lowering of NOT v8: don't require C++11 features Signed-off-by: Karol Herbst Reviewed-by: Pierre Moreau --- src/gallium/drivers/nouveau/Makefile.sources | 2 + .../codegen/nv50_ir_lowering_helper.cpp | 275 ++ .../nouveau/codegen/nv50_ir_lowering_helper.h | 53 src/gallium/drivers/nouveau/meson.build | 2 + 4 files changed, 332 insertions(+) create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_helper.cpp create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_helper.h diff --git a/src/gallium/drivers/nouveau/Makefile.sources b/src/gallium/drivers/nouveau/Makefile.sources index fee5e59522e..ec344c63169 100644 --- a/src/gallium/drivers/nouveau/Makefile.sources +++ b/src/gallium/drivers/nouveau/Makefile.sources @@ -122,6 +122,8 @@ NV50_CODEGEN_SOURCES := \ codegen/nv50_ir_graph.h \ codegen/nv50_ir.h \ codegen/nv50_ir_inlines.h \ + codegen/nv50_ir_lowering_helper.cpp \ + codegen/nv50_ir_lowering_helper.h \ codegen/nv50_ir_lowering_nv50.cpp \ codegen/nv50_ir_peephole.cpp \ codegen/nv50_ir_print.cpp \ diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_helper.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_helper.cpp new file mode 100644 index 000..02380f12b9f --- /dev/null +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_helper.cpp @@ -0,0 +1,275 @@ +/* + * Copyright 2018 Red Hat Inc.
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + * Authors: Karol Herbst + */ + +#include "codegen/nv50_ir_lowering_helper.h" + +namespace nv50_ir { + +bool +LoweringHelper::visit(Instruction *insn) +{ + switch (insn->op) { + case OP_ABS: + return handleABS(insn); + case OP_CVT: + return handleCVT(insn); + case OP_MAX: + case OP_MIN: + return handleMAXMIN(insn); + case OP_MOV: + return handleMOV(insn); + case OP_NEG: + return handleNEG(insn); + case OP_SAT: + return handleSAT(insn); + case OP_SLCT: + return handleSLCT(insn->asCmp()); + case OP_AND: + case OP_NOT: + case OP_OR: + case OP_XOR: + return handleLogOp(insn); + default: + return true; + } +} + +bool +LoweringHelper::handleABS(Instruction *insn) +{ + DataType dTy = insn->dType; + if (!(dTy == TYPE_U64 || dTy == TYPE_S64)) + return true; + + bld.setPosition(insn, false); + + Value *neg = bld.getSSA(8); + Value *negComp[2], *srcComp[2]; + Value *lo = bld.getSSA(), *hi = bld.getSSA(); + bld.mkOp2(OP_SUB, dTy, neg, bld.mkImm((uint64_t)0), insn->getSrc(0)); + bld.mkSplit(negComp, 4, neg); + bld.mkSplit(srcComp, 4, insn->getSrc(0)); + bld.mkCmp(OP_SLCT, CC_LT, TYPE_S32, lo, TYPE_S32, negComp[0], srcComp[0], srcComp[1]); + bld.mkCmp(OP_SLCT, CC_LT, TYPE_S32, hi, TYPE_S32, negComp[1], srcComp[1], srcComp[1]); + insn->op = OP_MERGE; + insn->setSrc(0, lo); + insn->setSrc(1, hi); + + return true; +} + +bool +LoweringHelper::handleCVT(Instruction *insn) +{ + DataType dTy = insn->dType; + DataType sTy = insn->sType; + + if (typeSizeof(dTy) <= 4 && typeSizeof(sTy) <= 4) + return true; + + bld.setPosition(insn, false); + + if ((dTy == TYPE_S32 && sTy == TYPE_S64) || + (dTy == TYPE_U32 && sTy == TYPE_U64)) { + Value *src[2]; + bld.mkSplit(src, 4, insn->getSrc(0)); + insn->op = OP_MOV; + insn->setSrc(0, src[0]); + } else if (dTy == TYPE_S64 && sTy == TYPE_S32) { + Value *tmp = bld.getSSA(); + bld.mkOp2(OP_SHR, TYPE_S32, tmp, insn->getSrc(0), bld.loadImm(bld.getSSA(), 31)); + insn->op = OP_MERGE; + insn->setSrc(1, tmp);
[Mesa-dev] [PATCH 01/34] prog_to_nir: fix write from vps to PSIZ
Point size is a single component value and drivers might write the full
vec4, potentially overwriting other values.

Signed-off-by: Karol Herbst
---
 src/mesa/program/prog_to_nir.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/program/prog_to_nir.c b/src/mesa/program/prog_to_nir.c
index cb1c19e9dfa..7d17e1da48a 100644
--- a/src/mesa/program/prog_to_nir.c
+++ b/src/mesa/program/prog_to_nir.c
@@ -868,7 +868,8 @@ ptn_add_output_stores(struct ptn_compile *c)
          src = nir_channel(b, src, 2);
       }
       if (c->prog->Target == GL_VERTEX_PROGRAM_ARB &&
-          var->data.location == VARYING_SLOT_FOGC) {
+          (var->data.location == VARYING_SLOT_FOGC ||
+           var->data.location == VARYING_SLOT_PSIZ)) {
          /* result.fogcoord is a single component value */
          src = nir_channel(b, src, 0);
       }
@@ -956,7 +957,8 @@ setup_registers_and_variables(struct ptn_compile *c)
       nir_variable *var = rzalloc(shader, nir_variable);
       if ((c->prog->Target == GL_FRAGMENT_PROGRAM_ARB && i == FRAG_RESULT_DEPTH) ||
-          (c->prog->Target == GL_VERTEX_PROGRAM_ARB && i == VARYING_SLOT_FOGC))
+          (c->prog->Target == GL_VERTEX_PROGRAM_ARB &&
+           (i == VARYING_SLOT_FOGC || i == VARYING_SLOT_PSIZ)))
          var->type = glsl_float_type();
       else
          var->type = glsl_vec4_type();
-- 
2.20.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 03/34] nv50/ir: move common converter code in base class
v2: remove TGSI related bits

Signed-off-by: Karol Herbst
Reviewed-by: Pierre Moreau
---
 src/gallium/drivers/nouveau/Makefile.sources  |   2 +
 .../nouveau/codegen/nv50_ir_from_common.cpp   | 107 ++
 .../nouveau/codegen/nv50_ir_from_common.h     |  58 ++
 .../nouveau/codegen/nv50_ir_from_tgsi.cpp     | 106 +
 src/gallium/drivers/nouveau/meson.build       |   2 +
 5 files changed, 172 insertions(+), 103 deletions(-)
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.cpp
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.h

diff --git a/src/gallium/drivers/nouveau/Makefile.sources b/src/gallium/drivers/nouveau/Makefile.sources
index 65f08c7d8d8..fee5e59522e 100644
--- a/src/gallium/drivers/nouveau/Makefile.sources
+++ b/src/gallium/drivers/nouveau/Makefile.sources
@@ -115,6 +115,8 @@ NV50_CODEGEN_SOURCES := \
 	codegen/nv50_ir_build_util.h \
 	codegen/nv50_ir_driver.h \
 	codegen/nv50_ir_emit_nv50.cpp \
+	codegen/nv50_ir_from_common.cpp \
+	codegen/nv50_ir_from_common.h \
 	codegen/nv50_ir_from_tgsi.cpp \
 	codegen/nv50_ir_graph.cpp \
 	codegen/nv50_ir_graph.h \
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.cpp
new file mode 100644
index 000..0ad6087e588
--- /dev/null
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.cpp
@@ -0,0 +1,107 @@
+/*
+ * Copyright 2011 Christoph Bumiller
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include "codegen/nv50_ir_from_common.h"
+
+namespace nv50_ir {
+
+ConverterCommon::ConverterCommon(Program *prog, nv50_ir_prog_info *info)
+   : BuildUtil(prog),
+     info(info) {}
+
+ConverterCommon::Subroutine *
+ConverterCommon::getSubroutine(unsigned ip)
+{
+   std::map<unsigned, Subroutine>::iterator it = sub.map.find(ip);
+
+   if (it == sub.map.end())
+      it = sub.map.insert(std::make_pair(
+              ip, Subroutine(new Function(prog, "SUB", ip)))).first;
+
+   return &it->second;
+}
+
+ConverterCommon::Subroutine *
+ConverterCommon::getSubroutine(Function *f)
+{
+   unsigned ip = f->getLabel();
+   std::map<unsigned, Subroutine>::iterator it = sub.map.find(ip);
+
+   if (it == sub.map.end())
+      it = sub.map.insert(std::make_pair(ip, Subroutine(f))).first;
+
+   return &it->second;
+}
+
+uint8_t
+ConverterCommon::translateInterpMode(const struct nv50_ir_varying *var, operation& op)
+{
+   uint8_t mode = NV50_IR_INTERP_PERSPECTIVE;
+
+   if (var->flat)
+      mode = NV50_IR_INTERP_FLAT;
+   else
+   if (var->linear)
+      mode = NV50_IR_INTERP_LINEAR;
+   else
+   if (var->sc)
+      mode = NV50_IR_INTERP_SC;
+
+   op = (mode == NV50_IR_INTERP_PERSPECTIVE || mode == NV50_IR_INTERP_SC)
+      ? OP_PINTERP : OP_LINTERP;
+
+   if (var->centroid)
+      mode |= NV50_IR_INTERP_CENTROID;
+
+   return mode;
+}
+
+void
+ConverterCommon::handleUserClipPlanes()
+{
+   Value *res[8];
+   int n, i, c;
+
+   for (c = 0; c < 4; ++c) {
+      for (i = 0; i < info->io.genUserClip; ++i) {
+         Symbol *sym = mkSymbol(FILE_MEMORY_CONST, info->io.auxCBSlot,
+                                TYPE_F32, info->io.ucpBase + i * 16 + c * 4);
+         Value *ucp = mkLoadv(TYPE_F32, sym, NULL);
+         if (c == 0)
+            res[i] = mkOp2v(OP_MUL, TYPE_F32, getScratch(), clipVtx[c], ucp);
+         else
+            mkOp3(OP_MAD, TYPE_F32, res[i], clipVtx[c], ucp, res[i]);
+      }
+   }
+
+   const int first = info->numOutputs - (info->io.genUserClip + 3) / 4;
+
+   for (i = 0; i < info->io.genUserClip; ++i) {
+      n = i / 4 + first;
+      c = i % 4;
+      Symbol *sym =
+         mkSymbol(FILE_SHADER_OUTPUT, 0, TYPE_F32, info->out[n].slot[c] * 4);
+      mkStore(OP_EXPORT, TYPE_F32, sym, NULL, res[i]);
+   }
+}
+
+} // namespace nv50_ir
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.h
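As a reading aid for handleUserClipPlanes, the arithmetic it emits can be sketched in plain C as below. This is only an illustration under my own assumptions: the function names are hypothetical, the planes are passed as a flat array of 4 floats each (mirroring the i * 16 + c * 4 byte offsets), and the clip distances are assumed to occupy the last ceil(num_planes / 4) vec4 outputs.

```c
#include <assert.h>

#define MAX_UCPS 8

/* Clip distance i is the dot product of the clip-space vertex with plane i.
 * Like the IR, it is accumulated one component at a time: a multiply for
 * c == 0 and a multiply-add for the remaining components. */
static void
compute_clip_distances(const float clip_vtx[4], const float *planes,
                       int num_planes, float res[MAX_UCPS])
{
   assert(num_planes >= 0 && num_planes <= MAX_UCPS);

   for (int c = 0; c < 4; ++c) {
      for (int i = 0; i < num_planes; ++i) {
         const float ucp = planes[i * 4 + c];
         if (c == 0)
            res[i] = clip_vtx[c] * ucp;
         else
            res[i] = clip_vtx[c] * ucp + res[i];
      }
   }
}

/* Map clip distance i to an (output index, component) pair, assuming the
 * distances are packed into the trailing vec4 outputs. */
static void
clip_distance_slot(int i, int num_outputs, int num_planes, int *n, int *c)
{
   const int first = num_outputs - (num_planes + 3) / 4;

   *n = i / 4 + first;
   *c = i % 4;
}
```

With 6 planes and 6 outputs, for example, the distances land in outputs 4 and 5, four components in the first and two in the second.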
[Mesa-dev] [PATCH 00/34] Nouveau NIR patches
I think I will just go ahead and merge this over the next few days.

There are more patches to fix some crashes related to 64-bit types, but
they touch the current RA code and I would like to have better testing
there first. So expect some piglit regressions, most of them related to
64-bit types and covered by the patches mentioned above.

The NIR path can be enabled by setting NV50_PROG_USE_NIR=1.

Changelogs are attached to the patches directly. There is nothing really
new since the last time I posted this series; it is mostly just fixing
compatibility with master.

Have fun.

Karol Herbst (34):
  prog_to_nir: fix write from vps to PSIZ
  nvc0: print the shader type when dumping headers
  nv50/ir: move common converter code in base class
  nv50/ir: add lowering helper
  nouveau: add support for nir
  nouveau: fix nir and TGSI shader cache collision
  nv50/ir/nir: run some passes to make the conversion easier
  nv50/ir/nir: track defs and provide easy access functions
  nv50/ir/nir: add nir type helper functions
  nv50/ir/nir: run assignSlots
  nv50/ir/nir: add loadFrom and storeTo helpler
  nv50/ir/nir: parse NIR shader info
  nv50/ir/nir: implement nir_load_const_instr
  nv50/ir/nir: add skeleton for nir_intrinsic_instr
  nv50/ir/nir: implement nir_alu_instr handling
  nv50/ir/nir: implement nir_intrinsic_load_uniform
  nv50/ir/nir: implement nir_intrinsic_store_(per_vertex_)output
  nv50/ir/nir: implement load_(interpolated_)input/output
  nv50/ir/nir: implement intrinsic_discard(_if)
  nv50/ir/nir: implement loading system values
  nv50/ir/nir: implement nir_ssa_undef_instr
  nv50/ir/nir: implement nir_instr_type_tex
  nv50/ir/nir: add skeleton getOperation for intrinsics
  nv50/ir/nir: implement vote and ballot
  nv50/ir/nir: implement variable indexing
  nv50/ir/nir: implement geometry shader nir_intrinsics
  nv50/ir/nir: implement nir_intrinsic_load_ubo
  nv50/ir/nir: implement ssbo intrinsics
  nv50/ir/nir: implement images
  nv50/ir/nir: add memory barriers
  nv50/ir/nir: implement load_per_vertex_output
  nv50/ir/nir: implement intrinsic shader_clock
  nv50/ir/nir: handle user clip planes for each emitted vertex
  nv50ir/nir: move immediates before use

 src/gallium/drivers/nouveau/Automake.inc      |    3 +
 src/gallium/drivers/nouveau/Makefile.am       |    5 +
 src/gallium/drivers/nouveau/Makefile.sources  |    5 +
 .../drivers/nouveau/codegen/nv50_ir.cpp       |    3 +
 src/gallium/drivers/nouveau/codegen/nv50_ir.h |    1 +
 .../nouveau/codegen/nv50_ir_from_common.cpp   |  107 +
 .../nouveau/codegen/nv50_ir_from_common.h     |   58 +
 .../nouveau/codegen/nv50_ir_from_nir.cpp      | 3322 +
 .../nouveau/codegen/nv50_ir_from_tgsi.cpp     |  106 +-
 .../codegen/nv50_ir_lowering_helper.cpp       |  275 ++
 .../nouveau/codegen/nv50_ir_lowering_helper.h |   53 +
 src/gallium/drivers/nouveau/meson.build       |   13 +-
 src/gallium/drivers/nouveau/nouveau_screen.c  |   10 +-
 src/gallium/drivers/nouveau/nouveau_screen.h  |    5 +
 .../drivers/nouveau/nv50/nv50_program.c       |   19 +-
 .../drivers/nouveau/nv50/nv50_screen.c        |   46 +-
 src/gallium/drivers/nouveau/nv50/nv50_state.c |   35 +-
 .../drivers/nouveau/nvc0/nvc0_program.c       |   19 +-
 .../drivers/nouveau/nvc0/nvc0_screen.c        |   94 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c |   31 +-
 src/mesa/program/prog_to_nir.c                |    6 +-
 21 files changed, 4088 insertions(+), 128 deletions(-)
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.cpp
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.h
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_helper.cpp
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_helper.h

-- 
2.20.1
Re: [Mesa-dev] [PATCH] st/mesa: init hash keys with memset(), not designated initializers
Brian Paul writes:

> Since the compiler may not zero-out padding in the object.
> Add a couple comments about this to prevent misunderstandings in
> the future.
>
> Fixes: 67d96816ff5 ("st/mesa: move, clean-up shader variant key decls/inits")
> ---
>  src/mesa/state_tracker/st_atom_shader.c |  9 +++--
>  src/mesa/state_tracker/st_program.c     | 13 ++---
>  2 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_atom_shader.c b/src/mesa/state_tracker/st_atom_shader.c
> index ac7a1a5..a4475e2 100644
> --- a/src/mesa/state_tracker/st_atom_shader.c
> +++ b/src/mesa/state_tracker/st_atom_shader.c
> @@ -112,7 +112,10 @@ st_update_fp( struct st_context *st )
>         !stfp->variants->key.bitmap) {
>        shader = stfp->variants->driver_shader;
>     } else {
> -      struct st_fp_variant_key key = {0};
> +      struct st_fp_variant_key key;
> +
> +      /* use memset, not an initializer to be sure all memory is zeroed */
> +      memset(&key, 0, sizeof(key));

Wait, what? We rely on this form of initialization all over, what's changed?
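To make the concern in the quoted patch concrete, here is a small sketch of the memset-based pattern for a key that is later compared byte by byte. It is not the state tracker code: struct variant_key and both helpers are hypothetical. As far as I understand it, an initializer such as "= {0}" is only clearly guaranteed to zero the named members, while padding bytes are less clearly specified, and that is what can make two logically equal keys differ under memcmp or a byte hash. The sketch also assumes, as real compilers do in practice, that plain member stores leave padding untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key, not the real st_fp_variant_key: on common ABIs the
 * uint8_t member is followed by three padding bytes. */
struct variant_key {
   uint8_t  bitmap;
   uint32_t context_id;
};

/* Zero every byte of the object, padding included, then set the members. */
static void
variant_key_init(struct variant_key *key, uint8_t bitmap, uint32_t context_id)
{
   assert(key != NULL);
   memset(key, 0, sizeof(*key));
   key->bitmap = bitmap;
   key->context_id = context_id;
}

/* Byte-wise comparison, the way a hash table keyed on raw bytes sees it. */
static bool
variant_key_equal(const struct variant_key *a, const struct variant_key *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}
```

Whether a given compiler actually leaves garbage in the padding after "= {0}" is a separate question; the memset form simply does not depend on the answer.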
Re: [Mesa-dev] [PATCH] radv: fix pointSizeRange limits
r-b

On Mon, Mar 11, 2019 at 10:23 AM Samuel Pitoiset wrote:
>
> The values should match the ones that are emitted.
>
> This fixes new CTS dEQP-VK.rasterization.primitive_size.points.*.
>
> Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver")
> Signed-off-by: Samuel Pitoiset
> ---
>  src/amd/vulkan/radv_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index fc04de21025..83d218fb6bf 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -1016,7 +1016,7 @@ void radv_GetPhysicalDeviceProperties(
>                 .maxCullDistances = 8,
>                 .maxCombinedClipAndCullDistances = 8,
>                 .discreteQueuePriorities = 2,
> -               .pointSizeRange = { 0.125, 255.875 },
> +               .pointSizeRange = { 0.0, 8192.0 },
>                 .lineWidthRange = { 0.0, 7.9921875 },
>                 .pointSizeGranularity = (1.0 / 8.0),
>                 .lineWidthGranularity = (1.0 / 128.0),
> --
> 2.21.0
Re: [Mesa-dev] [PATCH 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT
On Mon, 2019-03-11 at 17:04 +0200, Eleni Maria Stea wrote:
> The VkPhysicalDeviceSampleLocationsPropertiesEXT struct is filled with
> implementation dependent values and according to the table from the
> Vulkan Specification section [36.1. Limit Requirements]:
>
> pname | max | min
> pname:sampleLocationSampleCounts |-|ename:VK_SAMPLE_COUNT_4_BIT
> pname:maxSampleLocationGridSize|-|(1, 1)
> pname:sampleLocationCoordinateRange|(0.0, 0.9375)|(0.0, 0.9375)
> pname:sampleLocationSubPixelBits |-|4
> pname:variableSampleLocations | false |implementation dependent
>
> The hardware only supports setting the same sample location for all the
> pixels, so we only support 1x1 grids.
>
> Also, variableSampleLocations is set to false because we don't support the
> feature.
> ---
>  src/intel/vulkan/anv_device.c  | 21 +
>  src/intel/vulkan/anv_private.h |  3 +++
>  2 files changed, 24 insertions(+)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 729cceb3e32..1e183b7f4ad 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -1401,6 +1401,27 @@ void anv_GetPhysicalDeviceProperties2(
>           break;
>        }
>
> +      case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
> +         VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
> +            (VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
> +         props->sampleLocationSampleCounts = ISL_SAMPLE_COUNT_2_BIT |
> +                                             ISL_SAMPLE_COUNT_4_BIT |
> +                                             ISL_SAMPLE_COUNT_8_BIT;
> +         if (pdevice->info.gen >= 9)
> +            props->sampleLocationSampleCounts |= ISL_SAMPLE_COUNT_16_BIT;

Hi Eleni,

Thanks for the series.

The isl_device_get_sample_counts() function works out the supported
values per platform, so maybe we can use it here and mask out
ISL_SAMPLE_COUNT_1_BIT. That way we don't have to handle the per-platform
values here. I am not sure about this, so it might be a good idea to check
with Jason/Lionel. :)

> +
> +         props->maxSampleLocationGridSize.width = SAMPLE_LOC_GRID_W;
> +         props->maxSampleLocationGridSize.height = SAMPLE_LOC_GRID_H;
> +
> +         props->sampleLocationCoordinateRange[0] = 0;
> +         props->sampleLocationCoordinateRange[1] = 0.9375;
> +         props->sampleLocationSubPixelBits = 4;
> +
> +         props->variableSampleLocations = false;

Just for consistency (it doesn't make any difference): can we use VK_FALSE
instead of false?

With or without that change, this patch is:
Reviewed-by: Sagar Ghuge

> +
> +         break;
> +      }
> +
>        default:
>           anv_debug_ignored_stype(ext->sType);
>           break;
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index eed282ff985..5905299e59d 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -195,6 +195,9 @@ struct gen_l3_config;
>
>  #define anv_printflike(a, b) __attribute__((__format__(__printf__, a, b)))
>
> +#define SAMPLE_LOC_GRID_W 1
> +#define SAMPLE_LOC_GRID_H 1
> +
>  static inline uint32_t
>  align_down_npot_u32(uint32_t v, uint32_t a)
>  {
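The 0.9375 upper bound and the 4 sub-pixel bits in the quoted patch are related: with 4 fractional bits, the largest position inside a pixel is 15/16. A small sketch of that relationship follows, with hypothetical helper names; the truncate-and-clamp behaviour in quantize_sample_coord is my assumption, not something the patch or the hardware documentation states.

```c
#include <assert.h>
#include <stdint.h>

/* Largest coordinate representable with `bits` fractional bits:
 * (2^bits - 1) / 2^bits, i.e. 15/16 = 0.9375 for 4 bits. */
static float
max_sample_coord(unsigned bits)
{
   assert(bits > 0 && bits < 24);
   const uint32_t steps = 1u << bits;
   return (float)(steps - 1) / (float)steps;
}

/* Convert a coordinate to fixed point with `bits` fractional bits,
 * truncating and clamping to the representable range (assumed policy). */
static uint32_t
quantize_sample_coord(float coord, unsigned bits)
{
   assert(bits > 0 && bits < 24);
   const uint32_t steps = 1u << bits;

   if (coord < 0.0f)
      coord = 0.0f;

   const uint32_t q = (uint32_t)(coord * (float)steps);
   return q >= steps ? steps - 1 : q;
}
```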
[Mesa-dev] [Bug 109939] After upgrade mesa to 19.0.0 stop working the game Rise of the Tomb Raider
https://bugs.freedesktop.org/show_bug.cgi?id=109939

--- Comment #15 from Timur Kristóf ---
Will attach the contents of the preferences folder next time I encounter
this problem.
[Mesa-dev] [AppVeyor] mesa master #10355 completed
Build mesa 10355 completed

Commit 3a9e2d6085 by Eric Anholt on 2/28/2019 8:02 PM:

vc4: Switch the post-RA scheduler over to the DAG datastructure.

Just a small code reduction from shared infrastructure.
Re: [Mesa-dev] [PATCH] anv: destroy descriptor sets when pool gets reset
On Mon, Mar 11, 2019 at 06:33:54PM +0100, Juan A. Suarez Romero wrote:
> As stated in Vulkan spec:
>    "Resetting a descriptor pool recycles all of the resources from all
>     of the descriptor sets allocated from the descriptor pool back to
>     the descriptor pool, and the descriptor sets are implicitly freed."
>
> This fixes dEQP-VK.api.descriptor_pool.*
>
> Fixes: 14f6275c92f1 ("anv/descriptor_set: add reference counting for descriptor set layouts")

I ran this through CI and these tests are no longer failing. I didn't see
any regressions either.

> CC: Tapani Pälli
> CC: Lionel Landwerlin
> CC: Jason Ekstrand
> ---
>  src/intel/vulkan/anv_descriptor_set.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/intel/vulkan/anv_descriptor_set.c b/src/intel/vulkan/anv_descriptor_set.c
> index f293cf469ee..f34a44aefd7 100644
> --- a/src/intel/vulkan/anv_descriptor_set.c
> +++ b/src/intel/vulkan/anv_descriptor_set.c
> @@ -636,6 +636,12 @@ VkResult anv_ResetDescriptorPool(
>     }
>
>     anv_state_stream_finish(&pool->surface_state_stream);
> +
> +   list_for_each_entry_safe(struct anv_descriptor_set, set,
> +                            &pool->desc_sets, pool_link) {
> +      anv_descriptor_set_destroy(device, pool, set);
> +   }
> +
>     anv_state_stream_init(&pool->surface_state_stream,
>                           &device->surface_state_pool, 4096);
>     pool->surface_state_free_list = NULL;
> --
> 2.20.1
[Mesa-dev] [Bug 109939] After upgrade mesa to 19.0.0 stop working the game Rise of the Tomb Raider
https://bugs.freedesktop.org/show_bug.cgi?id=109939

--- Comment #14 from mikhail.v.gavri...@gmail.com ---
Alex, I am attaching the "Rise of the Tomb Raider" archive here.
To reproduce the problem, continue the game and then try fast travel to
the locations "The Gulag" and "Valley Farmstead".
[Mesa-dev] [Bug 109939] After upgrade mesa to 19.0.0 stop working the game Rise of the Tomb Raider
https://bugs.freedesktop.org/show_bug.cgi?id=109939

--- Comment #13 from mikhail.v.gavri...@gmail.com ---
Created attachment 143627
  --> https://bugs.freedesktop.org/attachment.cgi?id=143627&action=edit
Rise of the Tomb Raider
[Mesa-dev] [AppVeyor] mesa master #10354 failed
Build mesa 10354 failed

Commit 5f0a922c27 by Kristian H. Kristensen on 3/1/2019 10:33 PM:

freedreno/a6xx: Remove extra parens

There's a warning about this now.

Signed-off-by: Kristian H. Kristensen
Re: [Mesa-dev] [PATCH] anv: destroy descriptor sets when pool gets reset
Reviewed-by: Jason Ekstrand

On Mon, Mar 11, 2019 at 12:34 PM Juan A. Suarez Romero wrote:
> As stated in Vulkan spec:
>    "Resetting a descriptor pool recycles all of the resources from all
>     of the descriptor sets allocated from the descriptor pool back to
>     the descriptor pool, and the descriptor sets are implicitly freed."
>
> This fixes dEQP-VK.api.descriptor_pool.*
>
> Fixes: 14f6275c92f1 ("anv/descriptor_set: add reference counting for descriptor set layouts")
> CC: Tapani Pälli
> CC: Lionel Landwerlin
> CC: Jason Ekstrand
> ---
>  src/intel/vulkan/anv_descriptor_set.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/intel/vulkan/anv_descriptor_set.c b/src/intel/vulkan/anv_descriptor_set.c
> index f293cf469ee..f34a44aefd7 100644
> --- a/src/intel/vulkan/anv_descriptor_set.c
> +++ b/src/intel/vulkan/anv_descriptor_set.c
> @@ -636,6 +636,12 @@ VkResult anv_ResetDescriptorPool(
>     }
>
>     anv_state_stream_finish(&pool->surface_state_stream);
> +
> +   list_for_each_entry_safe(struct anv_descriptor_set, set,
> +                            &pool->desc_sets, pool_link) {
> +      anv_descriptor_set_destroy(device, pool, set);
> +   }
> +
>     anv_state_stream_init(&pool->surface_state_stream,
>                           &device->surface_state_pool, 4096);
>     pool->surface_state_free_list = NULL;
> --
> 2.20.1
[Mesa-dev] [PATCH] anv: destroy descriptor sets when pool gets reset
As stated in Vulkan spec:
   "Resetting a descriptor pool recycles all of the resources from all
    of the descriptor sets allocated from the descriptor pool back to
    the descriptor pool, and the descriptor sets are implicitly freed."

This fixes dEQP-VK.api.descriptor_pool.*

Fixes: 14f6275c92f1 ("anv/descriptor_set: add reference counting for descriptor set layouts")
CC: Tapani Pälli
CC: Lionel Landwerlin
CC: Jason Ekstrand
---
 src/intel/vulkan/anv_descriptor_set.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/intel/vulkan/anv_descriptor_set.c b/src/intel/vulkan/anv_descriptor_set.c
index f293cf469ee..f34a44aefd7 100644
--- a/src/intel/vulkan/anv_descriptor_set.c
+++ b/src/intel/vulkan/anv_descriptor_set.c
@@ -636,6 +636,12 @@ VkResult anv_ResetDescriptorPool(
    }

    anv_state_stream_finish(&pool->surface_state_stream);
+
+   list_for_each_entry_safe(struct anv_descriptor_set, set,
+                            &pool->desc_sets, pool_link) {
+      anv_descriptor_set_destroy(device, pool, set);
+   }
+
    anv_state_stream_init(&pool->surface_state_stream,
                          &device->surface_state_pool, 4096);
    pool->surface_state_free_list = NULL;
-- 
2.20.1
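The loop added by this patch destroys each set while walking the pool's list, which is why it needs the _safe iterator: the next element has to be read before the current one goes away. A minimal standalone sketch of that pattern is below. The desc_pool and desc_set types and the hand-rolled list are hypothetical stand-ins, not anv's structures or Mesa's util/list.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal intrusive circular doubly-linked list. */
struct list_head {
   struct list_head *prev, *next;
};

static void
list_init(struct list_head *head)
{
   head->prev = head->next = head;
}

static void
list_add_tail(struct list_head *item, struct list_head *head)
{
   item->prev = head->prev;
   item->next = head;
   head->prev->next = item;
   head->prev = item;
}

static void
list_del(struct list_head *item)
{
   item->prev->next = item->next;
   item->next->prev = item->prev;
}

/* Hypothetical stand-ins for the pool and its descriptor sets. */
struct desc_set {
   int id;
   struct list_head pool_link;
};

struct desc_pool {
   struct list_head desc_sets;
};

static struct desc_set *
pool_alloc_set(struct desc_pool *pool, int id)
{
   struct desc_set *set = malloc(sizeof(*set));
   if (!set)
      return NULL;
   set->id = id;
   list_add_tail(&set->pool_link, &pool->desc_sets);
   return set;
}

/* Free every set still owned by the pool; returns how many were freed. */
static unsigned
pool_reset(struct desc_pool *pool)
{
   unsigned freed = 0;
   assert(pool != NULL);

   struct list_head *cur = pool->desc_sets.next;
   while (cur != &pool->desc_sets) {
      /* Save the successor first: cur is unlinked and freed below. */
      struct list_head *next = cur->next;
      struct desc_set *set =
         (struct desc_set *)((char *)cur - offsetof(struct desc_set, pool_link));

      list_del(cur);
      free(set);
      freed++;
      cur = next;
   }
   return freed;
}
```

After pool_reset() the list head points back at itself, so a second reset is a no-op.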
[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=109535

Dylan Baker changed:

 What       | Removed | Added
 Depends on | 109695  |

Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=109695
[Bug 109695] qemu using spice gl and sandbox resourcecontrol=deny crashes
with SIGSYS on radeonsi
Re: [Mesa-dev] [PATCH 9/9] anv: Enabled the VK_EXT_sample_locations extension
On 11/03/2019 15:04, Eleni Maria Stea wrote:
> Enabled the VK_EXT_sample_locations for Intel Gen >= 7.
> ---
>  src/intel/vulkan/anv_extensions.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
> index 9e4e03e46df..99007544732 100644
> --- a/src/intel/vulkan/anv_extensions.py
> +++ b/src/intel/vulkan/anv_extensions.py
> @@ -129,7 +129,7 @@ EXTENSIONS = [
>      Extension('VK_EXT_inline_uniform_block', 1, True),
>      Extension('VK_EXT_pci_bus_info', 2, True),
>      Extension('VK_EXT_post_depth_coverage', 1, 'device->info.gen >= 9'),
> -    Extension('VK_EXT_sample_locations', 1, False),
> +    Extension('VK_EXT_sample_locations', 1, 'device->info.gen >= 7'),
>      Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
>      Extension('VK_EXT_scalar_block_layout', 1, True),
>      Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),

Hi Eleni,

Anv doesn't support anything below gen7 so I could just put True there ;)

-Lionel
Re: [Mesa-dev] [PATCH 11/12] panfrost: Identify fragment_extra flags
One more brick in the wall :) Reviewed-by: Tomeu Vizoso On Sun, 10 Mar 2019 at 07:50, Alyssa Rosenzweig wrote: > > The fragment_extra structure contains additional fields extending the > MRT framebuffer descriptor, snuck in between the main framebuffer > descriptor and the render targets. Its fields include those related to > transaction elimination and depth/stencil buffers. This patch identifies > the flags field (previously just "unk" with some magic values) as well > as identifying some (but not all) flags set by the driver. > > The process of identifying flags brought a bug to light where > transaction elimination (checksumming) could not be enabled unless AFBC > was in-use. This issue is now resolved. > > Signed-off-by: Alyssa Rosenzweig > --- > .../drivers/panfrost/include/panfrost-job.h| 8 +++- > src/gallium/drivers/panfrost/pan_context.c | 16 > .../drivers/panfrost/pandecode/decode.c| 18 +- > 3 files changed, 28 insertions(+), 14 deletions(-) > > diff --git a/src/gallium/drivers/panfrost/include/panfrost-job.h > b/src/gallium/drivers/panfrost/include/panfrost-job.h > index 3c5ed2bc802..dccd8410ae9 100644 > --- a/src/gallium/drivers/panfrost/include/panfrost-job.h > +++ b/src/gallium/drivers/panfrost/include/panfrost-job.h > @@ -1419,12 +1419,18 @@ struct bifrost_render_target { > * - TODO: Anything else? > */ > > +/* Flags */ > +#define MALI_EXTRA_PRESENT (0x400) > +#define MALI_EXTRA_AFBC (0x20) > +#define MALI_EXTRA_AFBC_ZS (0x10) > +#define MALI_EXTRA_ZS (0x4) > + > struct bifrost_fb_extra { > mali_ptr checksum; > /* Each tile has an 8 byte checksum, so the stride is "width in > tiles * 8" */ > u32 checksum_stride; > > -u32 unk; > +u32 flags; > > union { > /* Note: AFBC is only allowed for 24/8 combined > depth/stencil. 
*/ > diff --git a/src/gallium/drivers/panfrost/pan_context.c > b/src/gallium/drivers/panfrost/pan_context.c > index 419c1a4eb6f..cdced27f101 100644 > --- a/src/gallium/drivers/panfrost/pan_context.c > +++ b/src/gallium/drivers/panfrost/pan_context.c > @@ -263,12 +263,12 @@ panfrost_set_fragment_target_zsbuf( > ctx->fragment_extra.ds_afbc.zero1 = 0x10009; > ctx->fragment_extra.ds_afbc.padding = 0x1000; > > -/* There's a general 0x400 in all versions of this field > seen. > - * ORed with 0x5 for depth/stencil. ORed 0x10 for AFBC > encoded > - * depth stencil -- er, no. It's unclear where the remaining > 0x20 bit > - * is from, checksumming maybe? */ > - > -ctx->fragment_extra.unk = 0x400 | 0x20 | 0x10 | 0x5; > +ctx->fragment_extra.flags = > +MALI_EXTRA_PRESENT | > +MALI_EXTRA_AFBC | > +MALI_EXTRA_AFBC_ZS | > +MALI_EXTRA_ZS | > +0x1; /* unknown */ > > ctx->fragment_mfbd.unk3 |= MALI_MFBD_DEPTH_WRITE; > } else if (rsrc->bo->layout == PAN_LINEAR) { > @@ -278,7 +278,7 @@ panfrost_set_fragment_target_zsbuf( > /* Setup combined 24/8 depth/stencil */ > ctx->fragment_mfbd.unk3 |= MALI_MFBD_EXTRA; > > -ctx->fragment_extra.unk = 0x404; > +ctx->fragment_extra.flags = MALI_EXTRA_PRESENT | > MALI_EXTRA_ZS; > ctx->fragment_extra.ds_linear.depth = rsrc->bo->gpu[0]; > ctx->fragment_extra.ds_linear.depth_stride = stride; > > @@ -1027,7 +1027,7 @@ panfrost_fragment_job(struct panfrost_context *ctx) > int stride = > util_format_get_stride(rsrc->base.format, rsrc->base.width0); > > ctx->fragment_mfbd.unk3 |= MALI_MFBD_EXTRA; > -ctx->fragment_extra.unk |= 0x420; > +ctx->fragment_extra.flags |= MALI_EXTRA_PRESENT; > ctx->fragment_extra.checksum_stride = > rsrc->bo->checksum_stride; > ctx->fragment_extra.checksum = rsrc->bo->gpu[0] + > stride * rsrc->base.height0; > } > diff --git a/src/gallium/drivers/panfrost/pandecode/decode.c > b/src/gallium/drivers/panfrost/pandecode/decode.c > index ea635bbe981..e6932744939 100644 > --- a/src/gallium/drivers/panfrost/pandecode/decode.c > +++ 
b/src/gallium/drivers/panfrost/pandecode/decode.c > @@ -209,6 +209,15 @@ static const struct pandecode_flag_info > mfbd_fmt_flag_info[] = { > }; > #undef FLAG_INFO > > +#define FLAG_INFO(flag) { MALI_EXTRA_##flag, "MALI_EXTRA_" #flag } > +static const struct pandecode_flag_info mfbd_extra_flag_info[] = { > +FLAG_INFO(PRESENT), > +FLAG_INFO(AFBC), > +FLAG_INFO(ZS), > +{} > +}; > +#undef FLAG_INFO > + > extern char *replace_fragment; > extern char *replace_vertex; > > @@ -604,12 +613,11
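To illustrate how the newly named bits account for the old magic values (0x400 | 0x20 | 0x10 | 0x5 and 0x404), here is a small standalone decoder in the spirit of the pandecode flag table. The four MALI_EXTRA_* values are taken from the patch; struct flag_info, append_name and decode_extra_flags are hypothetical names for this sketch, and unlike the table in the patch it also lists AFBC_ZS.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MALI_EXTRA_PRESENT (0x400)
#define MALI_EXTRA_AFBC    (0x20)
#define MALI_EXTRA_AFBC_ZS (0x10)
#define MALI_EXTRA_ZS      (0x4)

struct flag_info {
   uint32_t flag;
   const char *name;
};

#define FLAG_INFO(flag) { MALI_EXTRA_##flag, "MALI_EXTRA_" #flag }
static const struct flag_info extra_flag_info[] = {
   FLAG_INFO(PRESENT),
   FLAG_INFO(AFBC),
   FLAG_INFO(AFBC_ZS),
   FLAG_INFO(ZS),
   { 0, NULL }
};
#undef FLAG_INFO

/* Append text to buf, separated by " | " from what is already there. */
static void
append_name(char *buf, size_t size, const char *text)
{
   const size_t used = strlen(buf);

   if (used + 1 < size)
      snprintf(buf + used, size - used, "%s%s", used ? " | " : "", text);
}

/* Write the symbolic form of flags into buf and return the bits that have
 * no name yet (also appended to buf in hex). */
static uint32_t
decode_extra_flags(uint32_t flags, char *buf, size_t size)
{
   assert(buf != NULL && size > 0);
   buf[0] = '\0';

   for (const struct flag_info *info = extra_flag_info; info->name; ++info) {
      if (flags & info->flag) {
         append_name(buf, size, info->name);
         flags &= ~info->flag;
      }
   }

   if (flags) {
      char hex[16];
      snprintf(hex, sizeof(hex), "0x%x", (unsigned)flags);
      append_name(buf, size, hex);
   }

   return flags;
}
```

Decoding the old AFBC value leaves 0x1 unexplained, which matches the "unknown" bit the patch keeps ORing in by hand.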
[Mesa-dev] [PATCH 9/9] anv: Enabled the VK_EXT_sample_locations extension
Enabled the VK_EXT_sample_locations for Intel Gen >= 7.
---
 src/intel/vulkan/anv_extensions.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 9e4e03e46df..99007544732 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,7 +129,7 @@ EXTENSIONS = [
     Extension('VK_EXT_inline_uniform_block', 1, True),
     Extension('VK_EXT_pci_bus_info', 2, True),
     Extension('VK_EXT_post_depth_coverage', 1, 'device->info.gen >= 9'),
-    Extension('VK_EXT_sample_locations', 1, False),
+    Extension('VK_EXT_sample_locations', 1, 'device->info.gen >= 7'),
     Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
     Extension('VK_EXT_scalar_block_layout', 1, True),
     Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1
[Mesa-dev] [PATCH 8/9] anv: Removed unused header file
In src/intel/vulkan/genX_blorp_exec.c we included the file
common/gen_sample_positions.h but did not use it. Removed.
---
 src/intel/vulkan/genX_blorp_exec.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/intel/vulkan/genX_blorp_exec.c b/src/intel/vulkan/genX_blorp_exec.c
index e9c85d56d5f..0eeefaaa9d6 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -31,7 +31,6 @@
 #undef __gen_combine_address

 #include "common/gen_l3_config.h"
-#include "common/gen_sample_positions.h"
 #include "blorp/blorp_genX_exec.h"

 static void *
-- 
2.20.1
[Mesa-dev] [PATCH 7/9] anv: Optimized the emission of the default locations on Gen8+
We only emit sample locations when the extension is enabled by the user. In all other cases the default locations are emitted once when the device is initialized to increase performance. --- src/intel/vulkan/anv_genX.h| 3 ++- src/intel/vulkan/genX_cmd_buffer.c | 2 +- src/intel/vulkan/genX_pipeline.c | 11 +++ src/intel/vulkan/genX_state.c | 8 +--- 4 files changed, 15 insertions(+), 9 deletions(-) diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index e82d83465ef..7f33a2b0a68 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -93,4 +93,5 @@ void genX(emit_ms_state)(struct anv_batch *batch, struct anv_sample *anv_samples, uint32_t num_samples, uint32_t log2_samples, - bool custom_sample_locations); + bool custom_sample_locations, + bool sample_locations_ext_enabled); diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index 4752c66f350..ae7c5a80a3c 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -2654,7 +2654,7 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer) anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples; genX(emit_ms_state)(_buffer->batch, anv_samples, samples, - log2_samples, true); + log2_samples, true, true); } void diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c index 8afc08f0320..12adfa65da8 100644 --- a/src/intel/vulkan/genX_pipeline.c +++ b/src/intel/vulkan/genX_pipeline.c @@ -573,10 +573,12 @@ emit_sample_mask(struct anv_pipeline *pipeline, } static void -emit_ms_state(struct anv_pipeline *pipeline, +emit_ms_state(struct anv_device *device, + struct anv_pipeline *pipeline, const VkPipelineMultisampleStateCreateInfo *info, const VkPipelineDynamicStateCreateInfo *dinfo) { + bool sample_loc_enabled = device->enabled_extensions.EXT_sample_locations; struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS]; VkSampleLocationsInfoEXT *sl; bool custom_locations = 
false; @@ -588,7 +590,7 @@ emit_ms_state(struct anv_pipeline *pipeline, if (info) { samples = info->rasterizationSamples; - if (info->pNext) { + if (sample_loc_enabled && info->pNext) { VkPipelineSampleLocationsStateCreateInfoEXT *slinfo = (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext; @@ -617,7 +619,7 @@ emit_ms_state(struct anv_pipeline *pipeline, } genX(emit_ms_state)(>batch, anv_samples, samples, log2_samples, - custom_locations); + custom_locations, sample_loc_enabled); } static const uint32_t vk_to_gen_logic_op[] = { @@ -1947,7 +1949,8 @@ genX(graphics_pipeline_create)( assert(pCreateInfo->pRasterizationState); emit_rs_state(pipeline, pCreateInfo->pRasterizationState, pCreateInfo->pMultisampleState, pass, subpass); - emit_ms_state(pipeline, pCreateInfo->pMultisampleState, pCreateInfo->pDynamicState); + emit_ms_state(device, pipeline, pCreateInfo->pMultisampleState, + pCreateInfo->pDynamicState); emit_ds_state(pipeline, pCreateInfo->pDepthStencilState, pass, subpass); emit_cb_state(pipeline, pCreateInfo->pColorBlendState, pCreateInfo->pMultisampleState); diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c index 804cfab3a56..bc6b5870d8d 100644 --- a/src/intel/vulkan/genX_state.c +++ b/src/intel/vulkan/genX_state.c @@ -552,12 +552,14 @@ genX(emit_ms_state)(struct anv_batch *batch, struct anv_sample *anv_samples, uint32_t num_samples, uint32_t log2_samples, - bool custom_sample_locations) + bool custom_sample_locations, + bool sample_locations_ext_enabled) { emit_multisample(batch, anv_samples, num_samples, log2_samples, custom_sample_locations); #if GEN_GEN >= 8 - emit_sample_locations(batch, anv_samples, num_samples, - custom_sample_locations); + if (sample_locations_ext_enabled) + emit_sample_locations(batch, anv_samples, num_samples, +custom_sample_locations); #endif } -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
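To make the intent of this patch easier to see, here is a small self-contained sketch of the control flow on Gen8+. It is not the driver code: struct batch and its counters are hypothetical stand-ins for the real batch and for the 3DSTATE_MULTISAMPLE / 3DSTATE_SAMPLE_PATTERN packets, and init_device_state() only mimics the one-time default emission described in the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a batch; it only counts packets. */
struct batch {
        unsigned multisample_packets;    /* think 3DSTATE_MULTISAMPLE */
        unsigned sample_pattern_packets; /* think 3DSTATE_SAMPLE_PATTERN */
};

static void
emit_multisample(struct batch *b)
{
        b->multisample_packets++;
}

static void
emit_sample_locations(struct batch *b)
{
        b->sample_pattern_packets++;
}

/* Default locations are emitted once, when the device is set up. */
static void
init_device_state(struct batch *b)
{
        emit_sample_locations(b);
}

/* Per-pipeline (or per dynamic-state flush) emission: the sample
 * pattern is only re-emitted when the extension is enabled. */
static void
emit_ms_state(struct batch *b, bool sample_locations_ext_enabled)
{
        emit_multisample(b);
        if (sample_locations_ext_enabled)
                emit_sample_locations(b);
}
```

In this sketch, creating three pipelines without the extension leaves the pattern count at one (the device-init emission), which is the saving the commit message describes.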
[Mesa-dev] [PATCH 3/9] anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT according to
the Vulkan Specification section [36.2. Additional Multisampling
Capabilities].
---
 src/intel/Makefile.sources              |  1 +
 src/intel/vulkan/anv_sample_locations.c | 60 +
 src/intel/vulkan/meson.build            |  1 +
 3 files changed, 62 insertions(+)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index a5c8828a6b6..a0873c7ccc2 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -251,6 +251,7 @@ VULKAN_FILES := \
 	vulkan/anv_pipeline_cache.c \
 	vulkan/anv_private.h \
 	vulkan/anv_queue.c \
+	vulkan/anv_sample_locations.c \
 	vulkan/anv_util.c \
 	vulkan/anv_wsi.c \
 	vulkan/vk_format_info.h
diff --git a/src/intel/vulkan/anv_sample_locations.c b/src/intel/vulkan/anv_sample_locations.c
new file mode 100644
index 000..1ebf280e05b
--- /dev/null
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -0,0 +1,60 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "anv_private.h"
+
+void
+anv_GetPhysicalDeviceMultisamplePropertiesEXT(VkPhysicalDevice physicalDevice,
+                                              VkSampleCountFlagBits samples,
+                                              VkMultisamplePropertiesEXT
+                                              *pMultisampleProperties)
+{
+   ANV_FROM_HANDLE(anv_physical_device, physical_device, physicalDevice);
+   const struct gen_device_info *devinfo = &physical_device->info;
+
+   VkExtent2D grid_size;
+   switch (samples) {
+   case VK_SAMPLE_COUNT_2_BIT:
+   case VK_SAMPLE_COUNT_4_BIT:
+   case VK_SAMPLE_COUNT_8_BIT:
+      grid_size.width = SAMPLE_LOC_GRID_W;
+      grid_size.height = SAMPLE_LOC_GRID_H;
+      break;
+
+   case VK_SAMPLE_COUNT_16_BIT:
+      if (devinfo->gen >= 9) {
+         grid_size.width = SAMPLE_LOC_GRID_W;
+         grid_size.height = SAMPLE_LOC_GRID_H;
+         break;
+      }
+   default:
+      grid_size.width = grid_size.height = 0;
+      break;
+   };
+
+   *pMultisampleProperties = (VkMultisamplePropertiesEXT) {
+      .sType = VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT,
+      .pNext = NULL,
+      .maxSampleLocationGridSize = grid_size
+   };
+}
diff --git a/src/intel/vulkan/meson.build b/src/intel/vulkan/meson.build
index 7fa43a6ad79..3f78757c774 100644
--- a/src/intel/vulkan/meson.build
+++ b/src/intel/vulkan/meson.build
@@ -135,6 +135,7 @@ libanv_files = files(
   'anv_pipeline_cache.c',
   'anv_private.h',
   'anv_queue.c',
+  'anv_sample_locations.c',
   'anv_util.c',
   'anv_wsi.c',
   'vk_format_info.h',
-- 
2.20.1
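Note that in the switch above, the 16x case relies on falling through to the default when gen < 9. A standalone sketch of the same decision, with the fall-through replaced by an explicit early break, might look like the following. The struct and function names here are hypothetical and use plain integers instead of the Vulkan types, so it compiles without the Vulkan headers.

```c
#include <stdint.h>

/* Mirrors the 1x1 grid defined elsewhere in the series. */
#define SAMPLE_LOC_GRID_W 1
#define SAMPLE_LOC_GRID_H 1

/* Hypothetical stand-in for VkExtent2D. */
struct extent2d {
        uint32_t width;
        uint32_t height;
};

/* Return the maximum sample location grid for a given sample count
 * and hardware generation; {0, 0} means "not supported". */
static struct extent2d
max_sample_location_grid(unsigned samples, int gen)
{
        struct extent2d grid = { 0, 0 };

        switch (samples) {
        case 2:
        case 4:
        case 8:
                grid.width = SAMPLE_LOC_GRID_W;
                grid.height = SAMPLE_LOC_GRID_H;
                break;
        case 16:
                /* 16x is only reported on gen >= 9. */
                if (gen >= 9) {
                        grid.width = SAMPLE_LOC_GRID_W;
                        grid.height = SAMPLE_LOC_GRID_H;
                }
                break;
        default:
                break;
        }

        return grid;
}
```

Initialising the result to {0, 0} up front gives the same answers as the patch for the unsupported cases, without depending on case ordering.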
[Mesa-dev] [PATCH 6/9] anv: Added support for dynamic and non-dynamic sample locations on Gen7
Allowing setting dynamic and non-dynamic sample locations on Gen7. --- src/intel/vulkan/anv_genX.h| 13 ++--- src/intel/vulkan/genX_cmd_buffer.c | 9 ++-- src/intel/vulkan/genX_pipeline.c | 13 + src/intel/vulkan/genX_state.c | 86 +- 4 files changed, 70 insertions(+), 51 deletions(-) diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index f84fe457152..e82d83465ef 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -89,11 +89,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer, void genX(blorp_exec)(struct blorp_batch *batch, const struct blorp_params *params); -void genX(emit_multisample)(struct anv_batch *batch, -uint32_t samples, -uint32_t log2_samples); - -void genX(emit_sample_locations)(struct anv_batch *batch, - const struct anv_sample *anv_samples, - uint32_t num_samples, - bool custom_locations); +void genX(emit_ms_state)(struct anv_batch *batch, + struct anv_sample *anv_samples, + uint32_t num_samples, + uint32_t log2_samples, + bool custom_sample_locations); diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index 9229df84caa..4752c66f350 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -2643,8 +2643,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer, static void cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer) { -#if GEN_GEN >= 8 - const struct anv_sample *anv_samples; + struct anv_sample *anv_samples; uint32_t log2_samples; uint32_t samples; @@ -2654,10 +2653,8 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer) log2_samples = __builtin_ffs(samples) - 1; anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples; - genX(emit_multisample)(_buffer->batch, samples, log2_samples); - genX(emit_sample_locations)(_buffer->batch, anv_samples, samples, - true); -#endif + genX(emit_ms_state)(_buffer->batch, anv_samples, samples, + log2_samples, true); } 
void diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c index fa42e622077..8afc08f0320 100644 --- a/src/intel/vulkan/genX_pipeline.c +++ b/src/intel/vulkan/genX_pipeline.c @@ -577,12 +577,9 @@ emit_ms_state(struct anv_pipeline *pipeline, const VkPipelineMultisampleStateCreateInfo *info, const VkPipelineDynamicStateCreateInfo *dinfo) { -#if GEN_GEN >= 8 struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS]; VkSampleLocationsInfoEXT *sl; bool custom_locations = false; -#endif - uint32_t samples = 1; uint32_t log2_samples = 0; @@ -591,7 +588,6 @@ emit_ms_state(struct anv_pipeline *pipeline, if (info) { samples = info->rasterizationSamples; -#if GEN_GEN >= 8 if (info->pNext) { VkPipelineSampleLocationsStateCreateInfoEXT *slinfo = (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext; @@ -616,17 +612,12 @@ emit_ms_state(struct anv_pipeline *pipeline, } } } -#endif log2_samples = __builtin_ffs(samples) - 1; } - genX(emit_multisample(>batch, samples, log2_samples)); - -#if GEN_GEN >= 8 - genX(emit_sample_locations)(>batch, anv_samples, samples, - custom_locations); -#endif + genX(emit_ms_state)(>batch, anv_samples, samples, log2_samples, + custom_locations); } static const uint32_t vk_to_gen_logic_op[] = { diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c index 44cfc925ed5..804cfab3a56 100644 --- a/src/intel/vulkan/genX_state.c +++ b/src/intel/vulkan/genX_state.c @@ -437,10 +437,12 @@ VkResult genX(CreateSampler)( return VK_SUCCESS; } -void -genX(emit_multisample)(struct anv_batch *batch, - uint32_t samples, - uint32_t log2_samples) +static void +emit_multisample(struct anv_batch *batch, + const struct anv_sample *anv_samples, + uint32_t samples, + uint32_t log2_samples, + bool custom_locations) { anv_batch_emit(batch, GENX(3DSTATE_MULTISAMPLE), ms) { ms.NumberofMultisamples = log2_samples; @@ -453,34 +455,52 @@ genX(emit_multisample)(struct anv_batch *batch, */ ms.PixelPositionOffsetEnable = false; #else - switch 
(samples) { - case 1: - GEN_SAMPLE_POS_1X(ms.Sample); - break; - case 2: - GEN_SAMPLE_POS_2X(ms.Sample); - break; - case 4: - GEN_SAMPLE_POS_4X(ms.Sample); - break; -
[Mesa-dev] [PATCH 1/9] anv: Added the VK_EXT_sample_locations extension to the anv_extensions list
Added the VK_EXT_sample_locations to the anv_extensions.py list to
generate the related entrypoints.
---
 src/intel/vulkan/anv_extensions.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 6fff293dee4..9e4e03e46df 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,6 +129,7 @@ EXTENSIONS = [
     Extension('VK_EXT_inline_uniform_block', 1, True),
     Extension('VK_EXT_pci_bus_info', 2, True),
     Extension('VK_EXT_post_depth_coverage', 1, 'device->info.gen >= 9'),
+    Extension('VK_EXT_sample_locations', 1, False),
     Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
     Extension('VK_EXT_scalar_block_layout', 1, True),
     Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1
[Mesa-dev] [PATCH 5/9] anv: Added support for dynamic sample locations on Gen8+
Added support for setting the locations when the pipeline has been created with the dynamic state bit enabled according to the Vulkan Specification section [26.5. Custom Sample Locations] for the function: 'vkCmdSetSampleLocationsEXT' The reason that we preferred to store the boolean valid inside the dynamic state struct for locations instead of using a dirty bit (ANV_CMD_DIRTY_SAMPLE_LOCATIONS for example) is that other functions can modify the value of the dirty bits causing unexpected behavior. --- src/intel/vulkan/anv_cmd_buffer.c | 19 src/intel/vulkan/anv_genX.h| 6 +++- src/intel/vulkan/anv_private.h | 6 src/intel/vulkan/genX_cmd_buffer.c | 27 ++ src/intel/vulkan/genX_pipeline.c | 46 -- src/intel/vulkan/genX_state.c | 41 +++--- 6 files changed, 99 insertions(+), 46 deletions(-) diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c index 1b34644a434..101c1375430 100644 --- a/src/intel/vulkan/anv_cmd_buffer.c +++ b/src/intel/vulkan/anv_cmd_buffer.c @@ -28,6 +28,7 @@ #include #include "anv_private.h" +#include "anv_sample_locations.h" #include "vk_format_info.h" #include "vk_util.h" @@ -558,6 +559,24 @@ void anv_CmdSetStencilReference( cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE; } +void +anv_CmdSetSampleLocationsEXT(VkCommandBuffer commandBuffer, + const VkSampleLocationsInfoEXT *pSampleLocationsInfo) +{ + ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer); + assert(pSampleLocationsInfo); + + struct anv_dynamic_state *dyn_state = _buffer->state.gfx.dynamic; + dyn_state->sample_locations.num_samples = + pSampleLocationsInfo->sampleLocationsPerPixel; + + anv_calc_sample_locations(dyn_state->sample_locations.anv_samples, + dyn_state->sample_locations.num_samples, + pSampleLocationsInfo); + + cmd_buffer->state.gfx.dynamic.sample_locations.valid = true; +} + static void anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer *cmd_buffer, VkPipelineBindPoint bind_point, diff --git 
a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index 52415c04a45..f84fe457152 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -89,7 +89,11 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer, void genX(blorp_exec)(struct blorp_batch *batch, const struct blorp_params *params); +void genX(emit_multisample)(struct anv_batch *batch, +uint32_t samples, +uint32_t log2_samples); + void genX(emit_sample_locations)(struct anv_batch *batch, + const struct anv_sample *anv_samples, uint32_t num_samples, - const VkSampleLocationsInfoEXT *sl, bool custom_locations); diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index 981956e5706..a2e1756cd99 100644 --- a/src/intel/vulkan/anv_private.h +++ b/src/intel/vulkan/anv_private.h @@ -2135,6 +2135,12 @@ struct anv_dynamic_state { uint32_t front; uint32_t back; } stencil_reference; + + struct { + struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS]; + uint32_t num_samples; + bool valid; + } sample_locations; }; extern const struct anv_dynamic_state default_dynamic_state; diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index 7687507e6b7..9229df84caa 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -25,11 +25,13 @@ #include #include "anv_private.h" +#include "anv_sample_locations.h" #include "vk_format_info.h" #include "vk_util.h" #include "util/fast_idiv_by_const.h" #include "common/gen_l3_config.h" +#include "common/gen_sample_positions.h" #include "genxml/gen_macros.h" #include "genxml/genX_pack.h" @@ -2638,6 +2640,26 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer, cmd_buffer->state.push_constants_dirty &= ~flushed; } +static void +cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer) +{ +#if GEN_GEN >= 8 + const struct anv_sample *anv_samples; + uint32_t log2_samples; + uint32_t samples; + + samples = 
cmd_buffer->state.gfx.dynamic.sample_locations.num_samples; + assert(samples > 0); + + log2_samples = __builtin_ffs(samples) - 1; + anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples; + + genX(emit_multisample)(_buffer->batch, samples, log2_samples); + genX(emit_sample_locations)(_buffer->batch,
[Mesa-dev] [PATCH 4/9] anv: Added support for non-dynamic sample locations on Gen8+
Allowing the user to set custom sample locations non-dynamically, by filling the extension structs and chaining them to the pipeline structs according to the Vulkan specification section [26.5. Custom Sample Locations] for the following structures: 'VkPipelineSampleLocationsStateCreateInfoEXT' 'VkSampleLocationsInfoEXT' 'VkSampleLocationEXT' Once custom locations are used, the default locations are lost and need to be re-emitted again in the next pipeline creation. For that, we emit the 3DSTATE_SAMPLE_PATTERN at every pipeline creation. --- src/intel/common/gen_sample_positions.h | 53 src/intel/vulkan/anv_genX.h | 5 ++ src/intel/vulkan/anv_private.h | 9 +++ src/intel/vulkan/anv_sample_locations.c | 38 +++- src/intel/vulkan/anv_sample_locations.h | 29 + src/intel/vulkan/genX_pipeline.c| 80 + src/intel/vulkan/genX_state.c | 59 ++ 7 files changed, 259 insertions(+), 14 deletions(-) create mode 100644 src/intel/vulkan/anv_sample_locations.h diff --git a/src/intel/common/gen_sample_positions.h b/src/intel/common/gen_sample_positions.h index da48dcb5ed0..e8af2a552dc 100644 --- a/src/intel/common/gen_sample_positions.h +++ b/src/intel/common/gen_sample_positions.h @@ -160,4 +160,57 @@ prefix##14YOffset = 0.9375; \ prefix##15XOffset = 0.0625; \ prefix##15YOffset = 0.; +/* Examples: + * in case of GEN_GEN < 8: + * SET_SAMPLE_POS(ms.Sample, 0); expands to: + *ms.Sample0XOffset = anv_samples[0].offs_x; + *ms.Sample0YOffset = anv_samples[0].offs_y; + * + * in case of GEN_GEN >= 8: + * SET_SAMPLE_POS(sp._16xSample, 0); expands to: + *sp._16xSample0XOffset = anv_samples[0].offs_x; + *sp._16xSample0YOffset = anv_samples[0].offs_y; + */ +#define SET_SAMPLE_POS(prefix, sample_idx) \ +prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \ +prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y; + +#define SET_SAMPLE_POS_2X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ +SET_SAMPLE_POS(prefix, 1); + +#define SET_SAMPLE_POS_4X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ 
+SET_SAMPLE_POS(prefix, 1); \ +SET_SAMPLE_POS(prefix, 2); \ +SET_SAMPLE_POS(prefix, 3); + +#define SET_SAMPLE_POS_8X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ +SET_SAMPLE_POS(prefix, 1); \ +SET_SAMPLE_POS(prefix, 2); \ +SET_SAMPLE_POS(prefix, 3); \ +SET_SAMPLE_POS(prefix, 4); \ +SET_SAMPLE_POS(prefix, 5); \ +SET_SAMPLE_POS(prefix, 6); \ +SET_SAMPLE_POS(prefix, 7); + +#define SET_SAMPLE_POS_16X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ +SET_SAMPLE_POS(prefix, 1); \ +SET_SAMPLE_POS(prefix, 2); \ +SET_SAMPLE_POS(prefix, 3); \ +SET_SAMPLE_POS(prefix, 4); \ +SET_SAMPLE_POS(prefix, 5); \ +SET_SAMPLE_POS(prefix, 6); \ +SET_SAMPLE_POS(prefix, 7); \ +SET_SAMPLE_POS(prefix, 8); \ +SET_SAMPLE_POS(prefix, 9); \ +SET_SAMPLE_POS(prefix, 10); \ +SET_SAMPLE_POS(prefix, 11); \ +SET_SAMPLE_POS(prefix, 12); \ +SET_SAMPLE_POS(prefix, 13); \ +SET_SAMPLE_POS(prefix, 14); \ +SET_SAMPLE_POS(prefix, 15); + #endif /* GEN_SAMPLE_POSITIONS_H */ diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index 8fd32cabf1e..52415c04a45 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -88,3 +88,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer, void genX(blorp_exec)(struct blorp_batch *batch, const struct blorp_params *params); + +void genX(emit_sample_locations)(struct anv_batch *batch, + uint32_t num_samples, + const VkSampleLocationsInfoEXT *sl, + bool custom_locations); diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index 5905299e59d..981956e5706 100644 --- a/src/intel/vulkan/anv_private.h +++ b/src/intel/vulkan/anv_private.h @@ -71,6 +71,7 @@ struct anv_buffer; struct anv_buffer_view; struct anv_image_view; struct anv_instance; +struct anv_sample; struct gen_l3_config; @@ -165,6 +166,7 @@ struct gen_l3_config; #define MAX_PUSH_DESCRIPTORS 32 /* Minimum requirement */ #define MAX_INLINE_UNIFORM_BLOCK_SIZE 4096 #define MAX_INLINE_UNIFORM_BLOCK_DESCRIPTORS 32 +#define MAX_SAMPLE_LOCATIONS 16 /* The kernel 
relocation API has a limitation of a 32-bit delta value * applied to the address before it is written which, in spite of it being @@ -2086,6 +2088,13 @@ struct anv_push_constants { struct brw_image_param images[MAX_GEN8_IMAGES]; }; +struct +anv_sample { + float offs_x; + float offs_y; + float radius; +}; + struct anv_dynamic_state { struct { uint32_t count; diff --git a/src/intel/vulkan/anv_sample_locations.c b/src/intel/vulkan/anv_sample_locations.c index 1ebf280e05b..c660cb5ae84 100644 --- a/src/intel/vulkan/anv_sample_locations.c +++ b/src/intel/vulkan/anv_sample_locations.c @@ -21,7 +21,7 @@ * IN THE SOFTWARE. */ -#include "anv_private.h"
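Since the SET_SAMPLE_POS macros lean on token pasting, a tiny standalone example may help show what they expand to. The macro body is copied from the patch; struct ms_packet and fill_sample_pos_2x() are hypothetical stand-ins for the generated genxml packet struct and its caller, cut down to two samples.

```c
/* Same layout as the struct anv_sample added in this patch. */
struct anv_sample {
        float offs_x;
        float offs_y;
        float radius;
};

/* Hypothetical stand-in for a generated packet with per-sample fields. */
struct ms_packet {
        float Sample0XOffset;
        float Sample0YOffset;
        float Sample1XOffset;
        float Sample1YOffset;
};

/* Macro from the patch: pastes the sample index into the field name and
 * reads from an array that must be called anv_samples at the use site. */
#define SET_SAMPLE_POS(prefix, sample_idx) \
prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \
prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y;

static void
fill_sample_pos_2x(struct ms_packet *out, const struct anv_sample *anv_samples)
{
        struct ms_packet ms = { 0 };

        /* Expands to ms.Sample0XOffset = anv_samples[0].offs_x; and so on. */
        SET_SAMPLE_POS(ms.Sample, 0);
        SET_SAMPLE_POS(ms.Sample, 1);

        *out = ms;
}
```

One consequence worth keeping in mind: the macro silently depends on a variable named anv_samples being in scope, so renaming the parameter in a caller breaks the build.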
[Mesa-dev] [PATCH 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT
The VkPhysicalDeviceSampleLocationsPropertiesEXT struct is filled with
implementation dependent values and according to the table from the
Vulkan Specification section [36.1. Limit Requirements]:

pname                               | max           | min
pname:sampleLocationSampleCounts    | -             | ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize     | -             | (1, 1)
pname:sampleLocationCoordinateRange | (0.0, 0.9375) | (0.0, 0.9375)
pname:sampleLocationSubPixelBits    | -             | 4
pname:variableSampleLocations       | false         | implementation dependent

The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.

Also, variableSampleLocations is set to false because we don't support
the feature.
---
 src/intel/vulkan/anv_device.c  | 21 +
 src/intel/vulkan/anv_private.h |  3 +++
 2 files changed, 24 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 729cceb3e32..1e183b7f4ad 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1401,6 +1401,27 @@ void anv_GetPhysicalDeviceProperties2(
          break;
       }

+      case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
+         VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
+            (VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
+         props->sampleLocationSampleCounts = ISL_SAMPLE_COUNT_2_BIT |
+                                             ISL_SAMPLE_COUNT_4_BIT |
+                                             ISL_SAMPLE_COUNT_8_BIT;
+         if (pdevice->info.gen >= 9)
+            props->sampleLocationSampleCounts |= ISL_SAMPLE_COUNT_16_BIT;
+
+         props->maxSampleLocationGridSize.width = SAMPLE_LOC_GRID_W;
+         props->maxSampleLocationGridSize.height = SAMPLE_LOC_GRID_H;
+
+         props->sampleLocationCoordinateRange[0] = 0;
+         props->sampleLocationCoordinateRange[1] = 0.9375;
+         props->sampleLocationSubPixelBits = 4;
+
+         props->variableSampleLocations = false;
+
+         break;
+      }
+
       default:
          anv_debug_ignored_stype(ext->sType);
          break;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index eed282ff985..5905299e59d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -195,6 +195,9 @@ struct gen_l3_config;

 #define anv_printflike(a, b) __attribute__((__format__(__printf__, a, b)))

+#define SAMPLE_LOC_GRID_W 1
+#define SAMPLE_LOC_GRID_H 1
+
 static inline uint32_t
 align_down_npot_u32(uint32_t v, uint32_t a)
 {
-- 
2.20.1
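The 0.9375 upper bound and the 4 sub-pixel bits are related: with 4 bits there are 16 steps per pixel, so the largest representable offset is 15/16 = 0.9375. A rough sketch of that arithmetic follows. snap_sample_coord() is a hypothetical helper, not something from the patch, and truncating toward the lower grid point is just one possible choice (an implementation could equally round to nearest).

```c
#define SAMPLE_SUBPIXEL_BITS  4
#define SAMPLE_SUBPIXEL_STEPS (1 << SAMPLE_SUBPIXEL_BITS) /* 16 */

/* Largest representable coordinate: 15/16 = 0.9375. */
#define SAMPLE_COORD_MIN 0.0f
#define SAMPLE_COORD_MAX \
        ((float) (SAMPLE_SUBPIXEL_STEPS - 1) / SAMPLE_SUBPIXEL_STEPS)

/* Clamp a requested coordinate into the advertised range and snap it
 * down to the nearest 1/16th of a pixel. */
static float
snap_sample_coord(float v)
{
        if (v < SAMPLE_COORD_MIN)
                v = SAMPLE_COORD_MIN;
        if (v > SAMPLE_COORD_MAX)
                v = SAMPLE_COORD_MAX;

        int step = (int) (v * SAMPLE_SUBPIXEL_STEPS);
        return (float) step / SAMPLE_SUBPIXEL_STEPS;
}
```

So a requested offset of 1.0 ends up at 0.9375, and anything between two grid points lands on the lower one under this (assumed) truncation rule.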
Re: [Mesa-dev] [PATCH 10/12] panfrost: Document "depth buffer writeback" bit
> If I comment the line immediately above, things work as expected.

Interesting, okay, this is definitely a regression on kbase too. I'll
investigate this evening -- I think I need to be checking the depth mask
before setting this bit..?
Re: [Mesa-dev] [PATCH 05/12] panfrost: Determine framebuffer format bits late
> Can we use a constant instead? The right solution is to actually RE the format bits for SFBD (which will be necessary if we're serious about supporting https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 8/8] panfrost: Add backend targeting the DRM driver
> I'm pretty sure it will be obsolete in only 240 years too. :)

Ecclesiastes 1:9, alas :P
Re: [Mesa-dev] [PATCH 08/12] panfrost: Support linear depth textures
Reviewed-by: Tomeu Vizoso

On Sun, 10 Mar 2019 at 07:50, Alyssa Rosenzweig wrote:
>
> This combination has not yet been seen "in the wild" in traces, but to
> support linear depth FBOs, ~bruteforce reveals this bit pattern is
> necessary. It's not yet clear why the meanings of 0x1 and 0x2 are
> essentially flipped (tiled vs linear for colour, linear vs some sort of
> tiled for depth).
>
> Signed-off-by: Alyssa Rosenzweig
> ---
>  src/gallium/drivers/panfrost/pan_context.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
> index 9db667d8287..099d6d0389b 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -2289,17 +2289,19 @@ panfrost_create_sampler_view(
>
>          enum mali_format format = panfrost_find_format(desc);
>
> +        bool is_depth = desc->format == PIPE_FORMAT_Z32_UNORM;
> +
>          unsigned usage2_layout = 0x10;
>
>          switch (prsrc->bo->layout) {
>                  case PAN_AFBC:
> -                        usage2_layout |= 0xc;
> +                        usage2_layout |= 0x8 | 0x4;
>                          break;
>                  case PAN_TILED:
>                          usage2_layout |= 0x1;
>                          break;
>                  case PAN_LINEAR:
> -                        usage2_layout |= 0x2;
> +                        usage2_layout |= is_depth ? 0x1 : 0x2;
>                          break;
>                  default:
>                          assert(0);
> --
> 2.20.1
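For reference, the bit selection in the quoted hunk can be pulled out into a standalone function, which makes the colour/depth flip easy to see. This is only a sketch: the enum ordering and the function name usage2_layout_bits() are assumptions for illustration, and the meanings of the individual bits are as unclear here as they are in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical copy of the layout enum; the ordering is an assumption. */
enum pan_layout {
        PAN_LINEAR,
        PAN_TILED,
        PAN_AFBC
};

/* Same selection as the patched switch statement. */
static unsigned
usage2_layout_bits(enum pan_layout layout, bool is_depth)
{
        unsigned usage2_layout = 0x10;

        switch (layout) {
        case PAN_AFBC:
                usage2_layout |= 0x8 | 0x4;
                break;
        case PAN_TILED:
                usage2_layout |= 0x1;
                break;
        case PAN_LINEAR:
                /* Depth uses the bit that means "tiled" for colour. */
                usage2_layout |= is_depth ? 0x1 : 0x2;
                break;
        default:
                assert(0);
        }

        return usage2_layout;
}
```

Written this way, a linear depth texture and a tiled colour texture both come out as 0x11, which is exactly the oddity the commit message points at.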
Re: [Mesa-dev] [PATCH 07/12] panfrost: Allocate dedicated slab for linear BOs
Reviewed-by: Tomeu Vizoso

On Sun, 10 Mar 2019 at 07:50, Alyssa Rosenzweig wrote:
>
> Previously, linear BOs shared memory with each other to minimize kernel
> round-trips / latency, as well as to work around a bug in the free_slab
> function. These concerns are invalid now, but continuing to use the slab
> allocator for BOs resulted in memory allocation errors. This issue was
> aggravated, though not introduced (so not a real regression) in the
> previous commit.
>
> Signed-off-by: Alyssa Rosenzweig
> ---
>  src/gallium/drivers/panfrost/pan_resource.c | 32 -
>  1 file changed, 19 insertions(+), 13 deletions(-)
>
> diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c
> index 39783f5a63a..0f11b8e5e38 100644
> --- a/src/gallium/drivers/panfrost/pan_resource.c
> +++ b/src/gallium/drivers/panfrost/pan_resource.c
> @@ -248,14 +248,16 @@ panfrost_create_bo(struct panfrost_screen *screen, const struct pipe_resource *t
>                          sz >>= 2;
>                  }
>          } else {
> -                /* But for linear, we can! */
> +                /* For a linear resource, allocate a block of memory from
> +                 * kernel space */
>
> -                struct pb_slab_entry *entry = pb_slab_alloc(&screen->slabs, sz, HEAP_TEXTURE);
> -                struct panfrost_memory_entry *p_entry = (struct panfrost_memory_entry *) entry;
> -                struct panfrost_memory *backing = (struct panfrost_memory *) entry->slab;
> -                bo->entry[0] = p_entry;
> -                bo->cpu[0] = backing->cpu + p_entry->offset;
> -                bo->gpu[0] = backing->gpu + p_entry->offset;
> +                struct panfrost_memory mem;
> +
> +                unsigned pages = ((sz + 4095) / 4096) * 2;
> +                screen->driver->allocate_slab(screen, &mem, pages, true, 0, 0, 0);
> +
> +                bo->cpu[0] = mem.cpu;
> +                bo->gpu[0] = mem.gpu;
>
>                  /* TODO: Mipmap */
>          }
> @@ -325,12 +327,16 @@ panfrost_destroy_bo(struct panfrost_screen *screen, struct panfrost_bo *pbo)
>  {
>          struct panfrost_bo *bo = (struct panfrost_bo *)pbo;
>
> -        for (int l = 0; l < MAX_MIP_LEVELS; ++l) {
> -                if (bo->entry[l] != NULL) {
> -                        /* Most allocations have an entry to free */
> -                        bo->entry[l]->freed = true;
> -                        pb_slab_free(&screen->slabs, &bo->entry[l]->base);
> -                }
> +        if (bo->layout == PAN_LINEAR) {
> +                /* Construct a memory object for all mip levels */
> +
> +                struct panfrost_memory mem = {
> +                        .cpu = bo->cpu[0],
> +                        .gpu = bo->gpu[0],
> +                        .size = bo->imported_size
> +                };
> +
> +                screen->driver->free_slab(screen, &mem);
>          }
>
>          if (bo->layout == PAN_TILED) {
> --
> 2.20.1
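The page count in the quoted hunk is a round-up to 4 KiB pages followed by a doubling. A standalone version of just that arithmetic is sketched below. PAN_PAGE_SIZE and pages_for_linear_bo() are hypothetical names, and the patch does not say why the count is doubled, so the sketch simply reproduces the expression.

```c
#include <stddef.h>

/* Hypothetical name for the 4096-byte page size used in the patch. */
#define PAN_PAGE_SIZE 4096

/* Round the byte size up to whole pages, then double the result,
 * matching ((sz + 4095) / 4096) * 2 from the quoted hunk. */
static unsigned
pages_for_linear_bo(size_t sz)
{
        return (unsigned) ((sz + PAN_PAGE_SIZE - 1) / PAN_PAGE_SIZE) * 2;
}
```

For example, a 4096-byte resource asks for 2 pages and a 4097-byte one asks for 4, so small linear BOs always get at least 8 KiB under this formula.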
[Mesa-dev] [Question] nir_intrinsic_load_uniform use float const_offset now
Hi guys,

When rebasing from 18.3 to the master branch, I found that
nir_intrinsic_load_uniform now uses a float const_offset, but most
gallium drivers except freedreno still treat it as an int. I don't know
which commit changed this. Is this expected?

Regards,
Qiang
Re: [Mesa-dev] [PATCH v2 8/8] panfrost: Add backend targeting the DRM driver
On Fri, Mar 8, 2019 at 3:59 PM Alyssa Rosenzweig wrote:
>
> > +/**
> > + * struct drm_panfrost_wait_bo - ioctl argument for waiting for
> > + * completion of the last DRM_PANFROST_SUBMIT_CL on a BO.
>
> Nit: Should be plain DRM_PANFROST_SUBMIT, there is no CL for us.
>
> > + __s64 timeout_ns; /* absolute */
>
> Erm, why is this signed? Semantically, what does a negative timestamp
> mean? Seems suspect. The comment /* absolute */ seems to underscore that
> we really do want an unsigned value, perhaps ascribing a special meaning
> to 0/~0 for "nonblocking" and "block indefinitely" if needed. Of course,
> "(2^64)-1 ns" is essentially indefinite, so the latter need not be a
> special case.

Signed is convention and used internally in the kernel (ktime_t). I
checked that signed is correct with Arnd Bergmann who is leading the
2038 work.

> * It's 585 years, according to a back of the envelope calculation.
> Panfrost will be obsolete many times over by the time that timeout
> elapses ;)

I'm pretty sure it will be obsolete in only 240 years too. :)

Rob
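The back-of-the-envelope numbers are easy to check. The sketch below (helper names are made up for illustration) converts the largest unsigned and signed 64-bit nanosecond values into whole Julian years of 365.25 days.

```c
#include <stdint.h>

/* Nanoseconds in a Julian year: 365.25 days * 86400 s * 1e9 ns. */
#define NS_PER_YEAR (31557600ULL * 1000000000ULL)

/* Whole years covered by a nanosecond count (truncating). */
static uint64_t
ns_to_whole_years(uint64_t ns)
{
        return ns / NS_PER_YEAR;
}

/* Range of an unsigned 64-bit nanosecond timeout. */
static uint64_t
max_unsigned_timeout_years(void)
{
        return ns_to_whole_years(UINT64_MAX);
}

/* Range of a signed 64-bit nanosecond timeout such as __s64. */
static uint64_t
max_signed_timeout_years(void)
{
        return ns_to_whole_years((uint64_t) INT64_MAX);
}
```

This gives 584 whole years for the unsigned case (the 585 quoted above, rounded up) and 292 for the signed one, so the signed field gives up half the range and is still far beyond any practical timeout.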
[Mesa-dev] [Bug 109939] After upgrade mesa to 19.0.0 stop working the game Rise of the Tomb Raider
https://bugs.freedesktop.org/show_bug.cgi?id=109939

--- Comment #12 from Alex Smith ---
Sounds like this is potentially a game issue.

Could both of you zip up the contents of the preferences folder
("~/.local/share/feral-interactive/Rise of the Tomb Raider") once you're
getting the issue, and send it to us at supp...@feralinteractive.com so
that we can investigate?

Also, Mikhail - since Timur said that clearing out that folder (except
for the save games) makes the problem go away temporarily, could you try
doing so as well to see if that helps? Please make a copy of the old
contents to send to us before trying.

Thanks!
Re: [Mesa-dev] [PATCH 06/12] panfrost: Refactor layout selection (BO creation)
On Sun, 10 Mar 2019 at 07:50, Alyssa Rosenzweig wrote: > > With a unified layout field, we can specify the logic for BO layout > explicitly, deciding between linear/tiled/AFBC based on the specified > usage/binding flags. > > Signed-off-by: Alyssa Rosenzweig Great stuff! Reviewed-by: Tomeu Vizoso > --- > src/gallium/drivers/panfrost/pan_resource.c | 64 ++--- > 1 file changed, 44 insertions(+), 20 deletions(-) > > diff --git a/src/gallium/drivers/panfrost/pan_resource.c > b/src/gallium/drivers/panfrost/pan_resource.c > index 0cebbdb6e51..39783f5a63a 100644 > --- a/src/gallium/drivers/panfrost/pan_resource.c > +++ b/src/gallium/drivers/panfrost/pan_resource.c > @@ -181,6 +181,48 @@ panfrost_surface_destroy(struct pipe_context *pipe, > free(surf); > } > > +/* Based on the usage, figure out what storing will be used. There are > + * various tradeoffs: > + * > + * Linear: the basic format, bad for memory bandwidth, bad for cache > + * use. Zero-copy, though. Renderable. > + * > + * Tiled: Not compressed, but cache-optimized. Expensive to write into > + * (due to software tiling), but cheap to sample from. Ideal for most > + * textures. > + * > + * AFBC: Compressed and renderable (so always desirable for non-scanout > + * rendertargets). Cheap to sample from. The format is black box, so we > + * can't read/write from software. 
> + */ > + > +static enum panfrost_memory_layout > +panfrost_best_layout(const struct pipe_resource *rsrc) > +{ > +/* For streaming, optimize for CPU write since it's one-use */ > + > +if (rsrc->usage == PIPE_USAGE_STREAM) > +return PAN_LINEAR; > + > +/* Legal formats depends if we're renderable */ > + > +unsigned renderable_bind = > +PIPE_BIND_DEPTH_STENCIL | > +PIPE_BIND_RENDER_TARGET | > +PIPE_BIND_BLENDABLE; > + > +if (rsrc->bind & renderable_bind) { > +/* TODO: AFBC */ > +return PAN_LINEAR; > +} else if (rsrc->bind & PIPE_BIND_SAMPLER_VIEW) { > +return PAN_TILED; > +} > + > +/* If all else fails, we default to linear */ > + > +return PAN_LINEAR; > +} > + > static struct panfrost_bo * > panfrost_create_bo(struct panfrost_screen *screen, const struct > pipe_resource *template) > { > @@ -195,26 +237,8 @@ panfrost_create_bo(struct panfrost_screen *screen, const > struct pipe_resource *t > if (template->height0) sz *= template->height0; > if (template->depth0) sz *= template->depth0; > > -/* Based on the usage, figure out what storing will be used. There > are > - * various tradeoffs: > - * > - * Linear: the basic format, bad for memory bandwidth, bad for cache > - * use. Zero-copy, though. Renderable. > - * > - * Tiled: Not compressed, but cache-optimized. Expensive to write > into > - * (due to software tiling), but cheap to sample from. Ideal for most > - * textures. > - * > - * AFBC: Compressed and renderable (so always desirable for > non-scanout > - * rendertargets). Cheap to sample from. The format is black box, so > we > - * can't read/write from software. > - */ > - > -/* Tiling textures is almost always faster, unless we only use it > once */ > -bool should_tile = (template->usage != PIPE_USAGE_STREAM) && > (template->bind & PIPE_BIND_SAMPLER_VIEW); > - > -/* Set the layout appropriately */ > -bo->layout = should_tile ? 
PAN_TILED : PAN_LINEAR; > +bo->imported_size = sz; > +bo->layout = panfrost_best_layout(template); > > if (bo->layout == PAN_TILED) { > /* For tiled, we don't map directly, so just malloc any old > buffer */ > -- > 2.20.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
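The decision order in panfrost_best_layout() from the patch above can be exercised in isolation. The following is a sketch, not the driver code: the enum and flag definitions are stand-ins invented for illustration (the real PIPE_USAGE_* and PIPE_BIND_* values come from Gallium's headers), and only the branching order is taken from the patch.

```c
#include <stdint.h>

/* Stand-in definitions for illustration only; not the Gallium values. */
enum sketch_usage  { SKETCH_USAGE_DEFAULT, SKETCH_USAGE_STREAM };
enum sketch_layout { SKETCH_LINEAR, SKETCH_TILED, SKETCH_AFBC };

#define SKETCH_BIND_DEPTH_STENCIL (1u << 0)
#define SKETCH_BIND_RENDER_TARGET (1u << 1)
#define SKETCH_BIND_BLENDABLE     (1u << 2)
#define SKETCH_BIND_SAMPLER_VIEW  (1u << 3)

/* Mirrors the branch order of the quoted patch:
 * streaming -> linear, renderable -> linear (AFBC is still a TODO),
 * sampled -> tiled, anything else -> linear. */
static inline enum sketch_layout
sketch_best_layout(enum sketch_usage usage, uint32_t bind)
{
    const uint32_t renderable_bind = SKETCH_BIND_DEPTH_STENCIL |
                                     SKETCH_BIND_RENDER_TARGET |
                                     SKETCH_BIND_BLENDABLE;

    if (usage == SKETCH_USAGE_STREAM)
        return SKETCH_LINEAR;

    if (bind & renderable_bind)
        return SKETCH_LINEAR;

    if (bind & SKETCH_BIND_SAMPLER_VIEW)
        return SKETCH_TILED;

    return SKETCH_LINEAR;
}
```

One consequence visible in the sketch: a resource that is both renderable and sampled now resolves to linear, whereas the removed should_tile expression would have tiled any non-streaming resource with the sampler-view bind flag.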
Re: [Mesa-dev] [PATCH 05/12] panfrost: Determine framebuffer format bits late
On Sun, 10 Mar 2019 at 07:50, Alyssa Rosenzweig wrote: > > Again, these formats are only properly known at the time of fragment job > emit. Rather than hardcoding the format, at least for MFBD we begin to > construct the format bits on-demand. This cleans up the code, > futureproofs for ES3 framebuffer formats, and should fix bugs regarding > FBO colour swizzles. > > Signed-off-by: Alyssa Rosenzweig > --- > src/gallium/drivers/panfrost/pan_context.c | 59 +++--- > 1 file changed, 42 insertions(+), 17 deletions(-) > > diff --git a/src/gallium/drivers/panfrost/pan_context.c > b/src/gallium/drivers/panfrost/pan_context.c > index d54b3df5962..9db667d8287 100644 > --- a/src/gallium/drivers/panfrost/pan_context.c > +++ b/src/gallium/drivers/panfrost/pan_context.c > @@ -143,6 +143,35 @@ panfrost_enable_checksum(struct panfrost_context *ctx, > struct panfrost_resource > rsrc->bo->has_checksum = true; > } > > +static unsigned > +panfrost_sfbd_format_for_surface(struct pipe_surface *surf) > +{ > +/* TODO */ > +return 0xb84e0281; /* RGB32, no MSAA */ Can we use a constant instead? 
In any case, Reviewed-by: Tomeu Vizoso Thanks, Tomeu > +} > + > +static struct mali_rt_format > +panfrost_mfbd_format_for_surface(struct pipe_surface *surf) > +{ > +/* Explode details on the format */ > + > +const struct util_format_description *desc = > +util_format_description(surf->texture->format); > + > +/* Fill in accordingly */ > + > +struct mali_rt_format fmt = { > +.unk1 = 0x400, > +.unk2 = 0x1, > +.nr_channels = MALI_POSITIVE(desc->nr_channels), > +.flags = 0x444, > +.swizzle = panfrost_translate_swizzle_4(desc->swizzle), > +.unk4 = 0x8 > +}; > + > +return fmt; > +} > + > static bool panfrost_is_scanout(struct panfrost_context *ctx); > > /* These routines link a fragment job with the bound surface, accounting for > the > @@ -159,6 +188,18 @@ panfrost_set_fragment_target_cbuf( > signed stride = > util_format_get_stride(surf->format, surf->texture->width0); > > +/* First, we set the format bits */ > + > +if (require_sfbd) { > +ctx->fragment_sfbd.format = > +panfrost_sfbd_format_for_surface(surf); > +} else { > +ctx->fragment_rts[cb].format = > +panfrost_mfbd_format_for_surface(surf); > +} > + > +/* Now, we set the layout specific pieces */ > + > if (rsrc->bo->layout == PAN_LINEAR) { > mali_ptr framebuffer = rsrc->bo->gpu[0]; > > @@ -392,7 +433,6 @@ panfrost_new_frag_framebuffer(struct panfrost_context > *ctx) > { > if (require_sfbd) { > struct mali_single_framebuffer fb = panfrost_emit_sfbd(ctx); > -fb.format = 0xb84e0281; /* RGB32, no MSAA */ > memcpy(>fragment_sfbd, , sizeof(fb)); > } else { > struct bifrost_framebuffer fb = panfrost_emit_mfbd(ctx); > @@ -401,24 +441,9 @@ panfrost_new_frag_framebuffer(struct panfrost_context > *ctx) > fb.rt_count_2 = 1; > fb.unk3 = 0x100; > > -/* By default, Gallium seems to need a BGR framebuffer */ > -unsigned char bgra[4] = { > -PIPE_SWIZZLE_Z, PIPE_SWIZZLE_Y, PIPE_SWIZZLE_X, > PIPE_SWIZZLE_W > -}; > - > -struct bifrost_render_target rt = { > -.format = { > -.unk1 = 0x400, > -.unk2 = 0x1, > -.nr_channels = 
MALI_POSITIVE(4), > -.flags = 0x444, > -.swizzle = > panfrost_translate_swizzle_4(bgra), > -.unk4 = 0x8 > -}, > -}; > +struct bifrost_render_target rt = {}; > > memcpy(>fragment_rts[0], , sizeof(rt)); > - > memset(>fragment_extra, 0, sizeof(ctx->fragment_extra)); > memcpy(>fragment_mfbd, , sizeof(fb)); > } > -- > 2.20.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
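On the review question above about using a constant for the SFBD format word, one possible shape is sketched here. The macro and function names are hypothetical; the patch itself only annotates the value as "RGB32, no MSAA" and does not name it.

```c
#include <stdint.h>

/* Hypothetical name for the magic value returned by
 * panfrost_sfbd_format_for_surface() in the quoted patch. */
#define SKETCH_SFBD_FORMAT_RGB32_NO_MSAA 0xb84e0281u

/* Hypothetical stand-in for the TODO helper: always the fixed format. */
static inline uint32_t sketch_sfbd_format(void)
{
    return SKETCH_SFBD_FORMAT_RGB32_NO_MSAA;
}
```

Naming the value would keep the TODO in one place once real format translation is added for SFBD.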
Re: [Mesa-dev] [PATCH 04/12] panfrost: Cleanup zsbuf emit in fragment job
On Sun, 10 Mar 2019 at 07:50, Alyssa Rosenzweig wrote: > > This emit is only implemented for AFBC depth/stencil buffers, so > conceptually there is no change here, but we follow the style of the > previous patch to improve robustness and clarity. > > Signed-off-by: Alyssa Rosenzweig > --- > src/gallium/drivers/panfrost/pan_context.c | 71 +- > 1 file changed, 42 insertions(+), 29 deletions(-) > > diff --git a/src/gallium/drivers/panfrost/pan_context.c > b/src/gallium/drivers/panfrost/pan_context.c > index 71634c781a3..d54b3df5962 100644 > --- a/src/gallium/drivers/panfrost/pan_context.c > +++ b/src/gallium/drivers/panfrost/pan_context.c > @@ -199,46 +199,61 @@ panfrost_set_fragment_target_cbuf( > } > > static void > -panfrost_set_fragment_target(struct panfrost_context *ctx) > +panfrost_set_fragment_target_zsbuf( > +struct panfrost_context *ctx, > +struct pipe_surface *surf) > { > -for (int cb = 0; cb < ctx->pipe_framebuffer.nr_cbufs; ++cb) { > -struct pipe_surface *surf = ctx->pipe_framebuffer.cbufs[cb]; > -panfrost_set_fragment_target_cbuf(ctx, surf, cb); > -} > +struct panfrost_resource *rsrc = pan_resource(surf->texture); > > -/* Enable depth/stencil AFBC for the framebuffer (not the render > target) */ > -if (ctx->pipe_framebuffer.zsbuf) { > -struct panfrost_resource *rsrc = (struct panfrost_resource > *) ctx->pipe_framebuffer.zsbuf->texture; > +if (rsrc->bo->layout == PAN_AFBC) { > +/* TODO: AFBC on SFBD */ > +assert(!require_sfbd); > > -if (rsrc->bo->layout == PAN_AFBC) { > -if (require_sfbd) { > -fprintf(stderr, "Depth AFBC not supported on > SFBD\n"); > -assert(0); > -} > +ctx->fragment_mfbd.unk3 |= MALI_MFBD_EXTRA; > > -ctx->fragment_mfbd.unk3 |= MALI_MFBD_EXTRA; > +ctx->fragment_extra.ds_afbc.depth_stencil_afbc_metadata = > rsrc->bo->afbc_slab.gpu; > +ctx->fragment_extra.ds_afbc.depth_stencil_afbc_stride = 0; > > - > ctx->fragment_extra.ds_afbc.depth_stencil_afbc_metadata = > rsrc->bo->afbc_slab.gpu; > - > 
ctx->fragment_extra.ds_afbc.depth_stencil_afbc_stride = 0; > +ctx->fragment_extra.ds_afbc.depth_stencil = > rsrc->bo->afbc_slab.gpu + rsrc->bo->afbc_metadata_size; > > -ctx->fragment_extra.ds_afbc.depth_stencil = > rsrc->bo->afbc_slab.gpu + rsrc->bo->afbc_metadata_size; > +ctx->fragment_extra.ds_afbc.zero1 = 0x10009; > +ctx->fragment_extra.ds_afbc.padding = 0x1000; > > -ctx->fragment_extra.ds_afbc.zero1 = 0x10009; > -ctx->fragment_extra.ds_afbc.padding = 0x1000; > +/* There's a general 0x400 in all versions of this field > scene. > + * ORed with 0x5 for depth/stencil. ORed 0x10 for AFBC > encoded > + * depth stencil. It's unclear where the remaining 0x20 bit > is > + * from */ > > -ctx->fragment_extra.unk = 0x435; /* General 0x400 in > all unks. 0x5 for depth/stencil. 0x10 for AFBC encoded depth stencil. Unclear > where the 0x20 is from */ > +ctx->fragment_extra.unk = 0x400 | 0x20 | 0x10 | 0x5; > > -ctx->fragment_mfbd.unk3 |= 0x400; The patch looks good to me, but this line is causing failures here with the DRM driver (haven't tested with Arm's). 
Regards, Tomeu > -} > +ctx->fragment_mfbd.unk3 |= 0x400; > +} else if (rsrc->bo->layout == PAN_LINEAR) { > +/* TODO */ > +} else { > +fprintf(stderr, "Invalid render layout (zsbuf)"); > +assert(0); > } > +} > > -/* For the special case of a depth-only FBO, we need to attach a > dummy render target */ > +static void > +panfrost_set_fragment_target(struct panfrost_context *ctx) > +{ > +for (int cb = 0; cb < ctx->pipe_framebuffer.nr_cbufs; ++cb) { > +struct pipe_surface *surf = ctx->pipe_framebuffer.cbufs[cb]; > +panfrost_set_fragment_target_cbuf(ctx, surf, cb); > +} > + > +if (ctx->pipe_framebuffer.zsbuf) { > +struct pipe_surface *surf = ctx->pipe_framebuffer.zsbuf; > +panfrost_set_fragment_target_zsbuf(ctx, surf); > + > + } > + > +/* There must always be at least one render-target, so attach a dummy > + * if necessary */ > > if (ctx->pipe_framebuffer.nr_cbufs == 0) { > -if (require_sfbd) { > -fprintf(stderr, "Depth-only FBO not supported on > SFBD\n"); > -assert(0); > -
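The patch above replaces the literal 0x435 with an OR of its parts. As a small check that the decomposition matches the old value, the sketch below names each bit group; all names are hypothetical, since the patch comment describes the bits but does not define identifiers for them.

```c
#include <stdint.h>

/* Hypothetical names for the bit groups described in the patch comment. */
#define SKETCH_EXTRA_GENERAL       0x400u /* present in every observed value */
#define SKETCH_EXTRA_UNKNOWN       0x020u /* origin unclear per the comment */
#define SKETCH_EXTRA_AFBC_DS       0x010u /* AFBC-encoded depth/stencil */
#define SKETCH_EXTRA_DEPTH_STENCIL 0x005u /* depth/stencil present */

/* Recomposes the value written to fragment_extra.unk in the patch. */
static inline uint32_t sketch_ds_afbc_extra_flags(void)
{
    return SKETCH_EXTRA_GENERAL | SKETCH_EXTRA_UNKNOWN |
           SKETCH_EXTRA_AFBC_DS | SKETCH_EXTRA_DEPTH_STENCIL;
}
```

The four groups do not overlap, so the OR reproduces the original 0x435 exactly.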
Re: [Mesa-dev] [PATCH 03/12] panfrost: Delay color buffer setup
Reviewed-by: Tomeu Vizoso On Sun, 10 Mar 2019 at 07:50, Alyssa Rosenzweig wrote: > > In an effort to cleanup framebuffer management code, we delay > colour buffer setup until the FRAGMENT job is actually emitted, allowing > the AFBC and linear codepaths to be unified. > > Signed-off-by: Alyssa Rosenzweig > --- > src/gallium/drivers/panfrost/pan_context.c | 93 -- > 1 file changed, 50 insertions(+), 43 deletions(-) > > diff --git a/src/gallium/drivers/panfrost/pan_context.c > b/src/gallium/drivers/panfrost/pan_context.c > index 630c41fbf1e..71634c781a3 100644 > --- a/src/gallium/drivers/panfrost/pan_context.c > +++ b/src/gallium/drivers/panfrost/pan_context.c > @@ -143,36 +143,67 @@ panfrost_enable_checksum(struct panfrost_context *ctx, > struct panfrost_resource > rsrc->bo->has_checksum = true; > } > > -/* ..by contrast, this routine runs for every FRAGMENT job, but does no > - * allocation. AFBC is enabled on a per-surface basis */ > +static bool panfrost_is_scanout(struct panfrost_context *ctx); > + > +/* These routines link a fragment job with the bound surface, accounting for > the > + * BO layout. This routine runs per-frame */ > > static void > -panfrost_set_fragment_afbc(struct panfrost_context *ctx) > +panfrost_set_fragment_target_cbuf( > +struct panfrost_context *ctx, > +struct pipe_surface *surf, > +unsigned cb) > { > -for (int cb = 0; cb < ctx->pipe_framebuffer.nr_cbufs; ++cb) { > -struct panfrost_resource *rsrc = (struct panfrost_resource > *) ctx->pipe_framebuffer.cbufs[cb]->texture; > +struct panfrost_resource *rsrc = pan_resource(surf->texture); > > -/* Non-AFBC is the default */ > -if (rsrc->bo->layout != PAN_AFBC) > -continue; > +signed stride = > +util_format_get_stride(surf->format, surf->texture->width0); > + > +if (rsrc->bo->layout == PAN_LINEAR) { > +mali_ptr framebuffer = rsrc->bo->gpu[0]; > + > +/* The default is upside down from OpenGL's perspective. 
*/ > +if (panfrost_is_scanout(ctx)) { > +framebuffer += stride * (surf->texture->height0 - 1); > +stride = -stride; > +} > > if (require_sfbd) { > -fprintf(stderr, "Color AFBC not supported on > SFBD\n"); > -assert(0); > +ctx->fragment_sfbd.framebuffer = framebuffer; > +ctx->fragment_sfbd.stride = stride; > +} else { > +/* MFBD specifies stride in tiles */ > +ctx->fragment_rts[cb].framebuffer = framebuffer; > +ctx->fragment_rts[cb].framebuffer_stride = stride / > 16; > } > +} else if (rsrc->bo->layout == PAN_AFBC) { > +/* TODO: AFBC on SFBD */ > +assert(!require_sfbd); > > /* Enable AFBC for the render target */ > -ctx->fragment_rts[0].afbc.metadata = rsrc->bo->afbc_slab.gpu; > -ctx->fragment_rts[0].afbc.stride = 0; > -ctx->fragment_rts[0].afbc.unk = 0x30009; > +ctx->fragment_rts[cb].afbc.metadata = > rsrc->bo->afbc_slab.gpu; > +ctx->fragment_rts[cb].afbc.stride = 0; > +ctx->fragment_rts[cb].afbc.unk = 0x30009; > + > +ctx->fragment_rts[cb].format.flags |= MALI_MFBD_FORMAT_AFBC; > > -ctx->fragment_rts[0].format.flags |= MALI_MFBD_FORMAT_AFBC; > +mali_ptr afbc_main = rsrc->bo->afbc_slab.gpu + > rsrc->bo->afbc_metadata_size; > +ctx->fragment_rts[cb].framebuffer = afbc_main; > > -/* Point rendering to our special framebuffer */ > -ctx->fragment_rts[0].framebuffer = rsrc->bo->afbc_slab.gpu + > rsrc->bo->afbc_metadata_size; > +/* TODO: Investigate shift */ > +ctx->fragment_rts[cb].framebuffer_stride = stride << 1; > +} else { > +fprintf(stderr, "Invalid render layout (cbuf %d)", cb); > +assert(0); > +} > +} > > -/* WAT? 
Stride is diff from the scanout case */ > -ctx->fragment_rts[0].framebuffer_stride = > ctx->pipe_framebuffer.width * 2 * 4; > +static void > +panfrost_set_fragment_target(struct panfrost_context *ctx) > +{ > +for (int cb = 0; cb < ctx->pipe_framebuffer.nr_cbufs; ++cb) { > +struct pipe_surface *surf = ctx->pipe_framebuffer.cbufs[cb]; > +panfrost_set_fragment_target_cbuf(ctx, surf, cb); > } > > /* Enable depth/stencil AFBC for the framebuffer (not the render > target) */ > @@ -346,30 +377,8 @@ panfrost_is_scanout(struct panfrost_context *ctx) > static void >
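The scanout branch in the patch above flips the image vertically by pointing the framebuffer at the last row and negating the stride. A minimal sketch of that address arithmetic follows; the struct and function names are hypothetical, and it assumes a positive incoming stride and a height of at least one row.

```c
#include <stdint.h>

/* Hypothetical pair of values the patch writes into the job descriptor. */
struct sketch_fb {
    uint64_t base;   /* GPU address of the first row the hardware walks */
    int32_t  stride; /* bytes between consecutive rows; may be negative */
};

/* Flip as in the quoted hunk: start at the last row, walk backwards.
 * Assumes stride > 0 and height >= 1. */
static inline struct sketch_fb
sketch_flip_for_scanout(uint64_t base, int32_t stride, uint32_t height)
{
    struct sketch_fb fb;
    fb.base = base + (uint64_t)stride * (height - 1);
    fb.stride = -stride;
    return fb;
}

/* Address of row y when walking from fb.base with fb.stride. */
static inline uint64_t sketch_row_address(struct sketch_fb fb, uint32_t y)
{
    return fb.base + (uint64_t)((int64_t)fb.stride * (int64_t)y);
}
```

With the flipped descriptor, row 0 resolves to the last row in memory and row height-1 resolves to the original base address.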
Re: [Mesa-dev] [PATCH 02/12] panfrost: Combine has_afbc/tiled in layout enum
On Sun, 10 Mar 2019 at 07:50, Alyssa Rosenzweig wrote: > > AFBC, tiled, and linear BO layouts are mutually exclusive; they should > be coupled via a single enum rather than ad hoc checks of booleans. > > Signed-off-by: Alyssa Rosenzweig Reviewed-by: Tomeu Vizoso > --- > src/gallium/drivers/panfrost/pan_context.c | 33 ++-- > src/gallium/drivers/panfrost/pan_resource.c | 34 - > src/gallium/drivers/panfrost/pan_resource.h | 20 +--- > 3 files changed, 64 insertions(+), 23 deletions(-) > > diff --git a/src/gallium/drivers/panfrost/pan_context.c > b/src/gallium/drivers/panfrost/pan_context.c > index 4c41969fd05..630c41fbf1e 100644 > --- a/src/gallium/drivers/panfrost/pan_context.c > +++ b/src/gallium/drivers/panfrost/pan_context.c > @@ -119,7 +119,7 @@ panfrost_enable_afbc(struct panfrost_context *ctx, struct > panfrost_resource *rsr > (rsrc->bo->afbc_metadata_size + main_size + > 4095) / 4096, > true, 0, 0, 0); > > -rsrc->bo->has_afbc = true; > +rsrc->bo->layout = PAN_AFBC; > > /* Compressed textured reads use a tagged pointer to the metadata */ > > @@ -153,7 +153,7 @@ panfrost_set_fragment_afbc(struct panfrost_context *ctx) > struct panfrost_resource *rsrc = (struct panfrost_resource > *) ctx->pipe_framebuffer.cbufs[cb]->texture; > > /* Non-AFBC is the default */ > -if (!rsrc->bo->has_afbc) > +if (rsrc->bo->layout != PAN_AFBC) > continue; > > if (require_sfbd) { > @@ -179,7 +179,7 @@ panfrost_set_fragment_afbc(struct panfrost_context *ctx) > if (ctx->pipe_framebuffer.zsbuf) { > struct panfrost_resource *rsrc = (struct panfrost_resource > *) ctx->pipe_framebuffer.zsbuf->texture; > > -if (rsrc->bo->has_afbc) { > +if (rsrc->bo->layout == PAN_AFBC) { > if (require_sfbd) { > fprintf(stderr, "Depth AFBC not supported on > SFBD\n"); > assert(0); > @@ -2244,6 +2244,23 @@ panfrost_create_sampler_view( > > enum mali_format format = panfrost_find_format(desc); > > +unsigned usage2_layout = 0x10; > + > +switch (prsrc->bo->layout) { > +case PAN_AFBC: > +usage2_layout |= 0xc; > 
+break; > +case PAN_TILED: > +usage2_layout |= 0x1; > +break; > +case PAN_LINEAR: > +usage2_layout |= 0x2; > +break; > +default: > +assert(0); > +break; > +} > + > struct mali_texture_descriptor texture_descriptor = { > .width = MALI_POSITIVE(texture->width0), > .height = MALI_POSITIVE(texture->height0), > @@ -2257,11 +2274,7 @@ panfrost_create_sampler_view( > .usage1 = 0x0, > .is_not_cubemap = 1, > > -/* 0x11 - regular texture 2d, uncompressed tiled */ > -/* 0x12 - regular texture 2d, uncompressed linear */ > -/* 0x1c - AFBC compressed (internally tiled, > probably) texture 2D */ > - > -.usage2 = prsrc->bo->has_afbc ? 0x1c : > (prsrc->bo->tiled ? 0x11 : 0x12), > +.usage2 = usage2_layout > }, > > .swizzle = panfrost_translate_swizzle_4(user_swizzle) > @@ -2353,7 +2366,7 @@ panfrost_set_framebuffer_state(struct pipe_context > *pctx, > struct panfrost_resource *tex = ((struct panfrost_resource > *) ctx->pipe_framebuffer.cbufs[i]->texture); > bool is_scanout = panfrost_is_scanout(ctx); > > -if (!is_scanout && !tex->bo->has_afbc) { > +if (!is_scanout && tex->bo->layout != PAN_AFBC) { > /* The blob is aggressive about enabling AFBC. As > such, > * it's pretty much necessary to use it here, since > we > * have no traces of non-compressed FBO. */ > @@ -2387,7 +2400,7 @@ panfrost_set_framebuffer_state(struct pipe_context > *pctx, > > struct panfrost_resource *tex = ((struct > panfrost_resource *) ctx->pipe_framebuffer.zsbuf->texture); > > -if (!tex->bo->has_afbc && > !panfrost_is_scanout(ctx)) > +if (tex->bo->layout != PAN_AFBC && > !panfrost_is_scanout(ctx)) > panfrost_enable_afbc(ctx, tex, true); > } > } > diff --git a/src/gallium/drivers/panfrost/pan_resource.c >
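The switch added to panfrost_create_sampler_view() above builds the usage2 field from a common 0x10 base plus a per-layout value, which reproduces the three literals listed in the removed comment (0x11, 0x12, 0x1c). A sketch of that mapping, using a stand-in enum rather than the driver's own type:

```c
/* Stand-in for the driver's layout enum; names are hypothetical. */
enum sketch_bo_layout {
    SKETCH_BO_LINEAR,
    SKETCH_BO_TILED,
    SKETCH_BO_AFBC
};

/* Maps a layout to the usage2 value, following the quoted switch.
 * Returns 0 for an unknown layout, where the patch asserts instead. */
static inline unsigned sketch_usage2_for_layout(enum sketch_bo_layout layout)
{
    unsigned usage2 = 0x10;

    switch (layout) {
    case SKETCH_BO_AFBC:
        return usage2 | 0xc;
    case SKETCH_BO_TILED:
        return usage2 | 0x1;
    case SKETCH_BO_LINEAR:
        return usage2 | 0x2;
    default:
        return 0;
    }
}
```

Because the three layouts are mutually exclusive, a single enum-driven switch like this replaces the nested ternary over has_afbc and tiled without changing the resulting values.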
Re: [Mesa-dev] [PATCH 01/12] panfrost: Cleanup needless if in create_bo
On Sun, 10 Mar 2019 at 07:50, Alyssa Rosenzweig wrote: > > I'm not sure why we were checking for these additional criteria (likely > inherited from some other driver); remove the needless checks to cleanup > the code and perhaps fix some bugs down the line. > > Signed-off-by: Alyssa Rosenzweig Reviewed-by: Tomeu Vizoso > --- > src/gallium/drivers/panfrost/pan_resource.c | 56 ++--- > 1 file changed, 26 insertions(+), 30 deletions(-) > > diff --git a/src/gallium/drivers/panfrost/pan_resource.c > b/src/gallium/drivers/panfrost/pan_resource.c > index f26f33db96b..7dfeb773d8b 100644 > --- a/src/gallium/drivers/panfrost/pan_resource.c > +++ b/src/gallium/drivers/panfrost/pan_resource.c > @@ -246,37 +246,33 @@ panfrost_resource_create(struct pipe_screen *screen, > assert(0); > } > > -if ((template->bind & PIPE_BIND_RENDER_TARGET) || (template->bind & > PIPE_BIND_DEPTH_STENCIL)) { > -if (template->bind & PIPE_BIND_DISPLAY_TARGET || > -template->bind & PIPE_BIND_SCANOUT || > -template->bind & PIPE_BIND_SHARED) { > -struct pipe_resource scanout_templat = *template; > -struct renderonly_scanout *scanout; > -struct winsys_handle handle; > - > -/* TODO: align width0 and height0? */ > - > -scanout = > renderonly_scanout_for_resource(_templat, > - > pscreen->ro, ); > -if (!scanout) > -return NULL; > - > -assert(handle.type == WINSYS_HANDLE_TYPE_FD); > -/* TODO: handle modifiers? */ > -so = > pan_resource(screen->resource_from_handle(screen, template, > - > , > - > PIPE_HANDLE_USAGE_FRAMEBUFFER_WRITE)); > -close(handle.handle); > -if (!so) > -return NULL; > - > -so->scanout = scanout; > -pscreen->display_target = so; > -} else { > - so->bo = panfrost_create_bo(pscreen, template); > -} > +if (template->bind & PIPE_BIND_DISPLAY_TARGET || > +template->bind & PIPE_BIND_SCANOUT || > +template->bind & PIPE_BIND_SHARED) { > +struct pipe_resource scanout_templat = *template; > +struct renderonly_scanout *scanout; > +struct winsys_handle handle; > + > +/* TODO: align width0 and height0? 
*/ > + > +scanout = renderonly_scanout_for_resource(_templat, > + pscreen->ro, > ); > +if (!scanout) > +return NULL; > + > +assert(handle.type == WINSYS_HANDLE_TYPE_FD); > +/* TODO: handle modifiers? */ > +so = pan_resource(screen->resource_from_handle(screen, > template, > + , > + > PIPE_HANDLE_USAGE_FRAMEBUFFER_WRITE)); > +close(handle.handle); > +if (!so) > +return NULL; > + > +so->scanout = scanout; > +pscreen->display_target = so; > } else { > - so->bo = panfrost_create_bo(pscreen, template); > +so->bo = panfrost_create_bo(pscreen, template); > } > > return (struct pipe_resource *)so; > -- > 2.20.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109958] Xorg memory leak when using llvmpipe/softpipe
https://bugs.freedesktop.org/show_bug.cgi?id=109958

--- Comment #1 from joney ---
After searching a little bit more I just found
https://bugs.freedesktop.org/show_bug.cgi?id=109641 which looks like it
might be the same issue.
[Mesa-dev] [PATCH 1/2] panfrost: Set bo->imported_size in the DRM backend
Signed-off-by: Tomeu Vizoso --- src/gallium/drivers/panfrost/pan_drm.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c index 6d1129ff5f2b..62a7b0ce5a30 100644 --- a/src/gallium/drivers/panfrost/pan_drm.c +++ b/src/gallium/drivers/panfrost/pan_drm.c @@ -125,7 +125,7 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, struct winsys_handle *wha struct panfrost_drm *drm = (struct panfrost_drm *)screen->driver; struct drm_panfrost_get_bo_offset get_bo_offset = {0,}; struct drm_panfrost_mmap_bo mmap_bo = {0,}; -int ret, size; +int ret; unsigned gem_handle; ret = drmPrimeFDToHandle(drm->fd, whandle->handle, _handle); @@ -146,9 +146,9 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, struct winsys_handle *wha assert(0); } - size = lseek(whandle->handle, 0, SEEK_END); - assert(size > 0); -bo->cpu[0] = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, +bo->imported_size = lseek(whandle->handle, 0, SEEK_END); +assert(bo->imported_size > 0); +bo->cpu[0] = mmap(NULL, bo->imported_size, PROT_READ | PROT_WRITE, MAP_SHARED, drm->fd, mmap_bo.offset); if (bo->cpu[0] == MAP_FAILED) { fprintf(stderr, "mmap failed: %p\n", bo->cpu[0]); @@ -156,7 +156,7 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, struct winsys_handle *wha } /* Record the mmap if we're tracing */ -pantrace_mmap(bo->gpu[0], bo->cpu[0], size, NULL); +pantrace_mmap(bo->gpu[0], bo->cpu[0], bo->imported_size, NULL); return bo; } -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
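The patch above obtains the buffer size by seeking to the end of the imported file descriptor and asserts that the result is positive. The sketch below shows the same lseek() idiom as a standalone helper that reports failure to the caller instead of asserting; the function name is hypothetical, and it works on any seekable descriptor rather than being specific to the DRM import path.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: size in bytes of a seekable file descriptor,
 * or (off_t)-1 if lseek() fails. Note that this leaves the file
 * offset at the end of the file. */
static inline off_t sketch_fd_size(int fd)
{
    /* lseek() returns the resulting offset, which for SEEK_END with a
     * zero displacement is the size of the underlying object. */
    return lseek(fd, 0, SEEK_END);
}
```

A caller would check for a negative return before passing the size on to mmap(), which is the check the assert performs in the patch.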
[Mesa-dev] panfrost: couple of fixes
Hi,

I needed the below two patches to avoid regressions when testing this
series.

Thanks,

Tomeu
[Mesa-dev] [PATCH 2/2] panfrost: Set bo->gem_handle when creating a linear BO
Signed-off-by: Tomeu Vizoso
---
 src/gallium/drivers/panfrost/pan_resource.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c
index 1a6769b5508e..e398b6f6b7ce 100644
--- a/src/gallium/drivers/panfrost/pan_resource.c
+++ b/src/gallium/drivers/panfrost/pan_resource.c
@@ -260,6 +260,7 @@ panfrost_create_bo(struct panfrost_screen *screen, const struct pipe_resource *t
                 bo->cpu[0] = mem.cpu;
                 bo->gpu[0] = mem.gpu;
+                bo->gem_handle = mem.gem_handle;

                 /* TODO: Mipmap */
         }
@@ -335,6 +336,7 @@ panfrost_destroy_bo(struct panfrost_screen *screen, struct panfrost_bo *pbo)
                 struct panfrost_memory mem = {
                         .cpu = bo->cpu[0],
                         .gpu = bo->gpu[0],
+                        .gem_handle = bo->gem_handle,
                         .size = bo->imported_size
                 };

--
2.20.1