Re: [Mesa-dev] [RFC PATCH 0/6] r600: speed up tesselation shaders

2018-01-07 Thread Dave Airlie
On 6 January 2018 at 03:41, Gert Wollny  wrote:
> On Friday, 05.01.2018 at 18:18 +0100, Gert Wollny wrote:
>>
>> Well, I have tested some piglits now and the behaviour is quite
>> weird:
>>
>> When I run nop as the very first piglit after booting the machine it
>> works. After running other piglits (specifically  tcs-input-read-
>> array-interface and tcs-input-read-mat), nop starts to fail, also
>> without sb.
>>
>> Restarting X is not enough to get nop to pass again.
>>
>> If I run piglit normally on the shader subset, I also get lockups and
>> I even got kicked out of X, the last syslog message related to this
>> was:
>>
>> [ 1403.211887] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait
>> timed out.
>> [ 1403.211932] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon:
>> failed testing IB on GFX ring (-110).
>>
>
> When I run Unigine_Heaven with your WIP code and all sb passes for
> tessellation enabled, I get a crash because of a stack overflow, i.e.
> the hash evaluation ends up in an infinite recursion doing a ping-pong
> between two nodes:
>
> ...
> #747 in r600_sb::node::hash (this=0x1e01228) at sb/sb_ir.cpp:277
> #748 in r600_sb::value::hash (this=0x1e39cd0) at sb/sb_valtable.cpp:189
> #749 in r600_sb::value::hash (this=< >) at sb/sb_valtable.cpp:184
> #750 in r600_sb::node::hash_src (this=this@entry= ) at sb/sb_ir.cpp:265
> #751 in r600_sb::node::hash (this=0x1e00bf0) at sb/sb_ir.cpp:277
> #752 in r600_sb::value::hash (this=0x1e39e70) at sb/sb_valtable.cpp:189
> #753 in r600_sb::value::hash (this=< >) at sb/sb_valtable.cpp:184
> #754 in r600_sb::node::hash_src (this=this@entry= ) at sb/sb_ir.cpp:265
> #755 in r600_sb::node::hash (this=0x1e01228) at sb/sb_ir.cpp:277

Yeah I see the same. Not 100% sure why yet.

For nop.shader_test I've noticed that if you move the position line above
the tess factor emission, things start to work, which is confusing me no end.
It sounds like we're still doing something bad with LDS.

Dave.
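
The ping-pong in the backtrace above is what a recursive hash does when two
nodes end up referring to each other through their source values. A minimal
sketch of guarding against that follows. It is not the r600 sb code and not a
proposed fix (the cause is still unknown at this point in the thread); the
struct layouts, the "hashing" flag and the constants are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct node;

/* Hypothetical stand-in for an IR value: defined by at most one node. */
struct value {
    struct node *def;   /* defining node, or NULL for a leaf value */
    uint32_t id;
};

/* Hypothetical stand-in for an IR node reading up to two source values. */
struct node {
    struct value *src[2];
    uint32_t opcode;
    int hashing;        /* set while this node's hash is being computed */
};

static uint32_t node_hash(struct node *n);

static uint32_t value_hash(struct value *v)
{
    if (!v)
        return 0;
    if (!v->def)
        return v->id * 2654435761u;
    /* A defined value hashes like its defining node. */
    return node_hash(v->def);
}

static uint32_t node_hash(struct node *n)
{
    uint32_t h = n->opcode;

    /* Re-entering a node that is already on the call stack means the
     * def-use graph has a cycle; return a fixed value instead of recursing. */
    if (n->hashing)
        return 0x9e3779b9u;

    n->hashing = 1;
    for (size_t i = 0; i < 2; i++)
        h = h * 31u + value_hash(n->src[i]);
    n->hashing = 0;
    return h;
}
```

With such a guard the recursion terminates, but the hash of a node inside a
cycle depends on where the walk entered the cycle, so this only illustrates
the failure mode rather than showing how value numbering should treat it.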
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] ac: rework emit_barrier() to not segfault on radeonsi

2018-01-07 Thread Dave Airlie
On 8 January 2018 at 16:45, Timothy Arceri  wrote:
> nir_to_llvm_context will always be NULL for radeonsi so we need to
> work around this.

Reviewed-by: Dave Airlie 

> ---
>  src/amd/common/ac_nir_to_llvm.c | 17 -
>  1 file changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 5203b78537..a729fe5f6d 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -3826,19 +3826,18 @@ static void emit_membar(struct nir_to_llvm_context *ctx,
> ac_build_waitcnt(&ctx->ac, waitcnt);
>  }
>
> -static void emit_barrier(struct nir_to_llvm_context *ctx)
> +static void emit_barrier(struct ac_llvm_context *ac, gl_shader_stage stage)
>  {
> /* SI only (thanks to a hw bug workaround):
>  * The real barrier instruction isn’t needed, because an entire patch
>  * always fits into a single wave.
>  */
> -   if (ctx->options->chip_class == SI &&
> -   ctx->stage == MESA_SHADER_TESS_CTRL) {
> -   ac_build_waitcnt(&ctx->ac, LGKM_CNT & VM_CNT);
> +   if (ac->chip_class == SI && stage == MESA_SHADER_TESS_CTRL) {
> +   ac_build_waitcnt(ac, LGKM_CNT & VM_CNT);
> return;
> }
> -   ac_build_intrinsic(&ctx->ac, "llvm.amdgcn.s.barrier",
> -  ctx->ac.voidt, NULL, 0, AC_FUNC_ATTR_CONVERGENT);
> +   ac_build_intrinsic(ac, "llvm.amdgcn.s.barrier",
> +  ac->voidt, NULL, 0, AC_FUNC_ATTR_CONVERGENT);
>  }
>
>  static void emit_discard_if(struct ac_nir_context *ctx,
> @@ -4331,7 +4330,7 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
> emit_membar(ctx->nctx, instr);
> break;
> case nir_intrinsic_barrier:
> -   emit_barrier(ctx->nctx);
> +   emit_barrier(&ctx->ac, ctx->stage);
> break;
> case nir_intrinsic_var_atomic_add:
> case nir_intrinsic_var_atomic_imin:
> @@ -6169,7 +6168,7 @@ write_tess_factors(struct nir_to_llvm_context *ctx)
> LLVMValueRef lds_base, lds_inner, lds_outer, byteoffset, buffer;
> LLVMValueRef out[6], vec0, vec1, tf_base, inner[4], outer[4];
> int i;
> -   emit_barrier(ctx);
> +   emit_barrier(&ctx->ac, ctx->stage);
>
> switch (ctx->options->key.tcs.primitive_mode) {
> case GL_ISOLINES:
> @@ -6712,7 +6711,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
> }
>
> if (i)
> -   emit_barrier(&ctx);
> +   emit_barrier(&ctx.ac, ctx.stage);
>
> ac_setup_rings(&ctx);
>
> --
> 2.14.3
>


[Mesa-dev] [PATCH] ac: rework emit_barrier() to not segfault on radeonsi

2018-01-07 Thread Timothy Arceri
nir_to_llvm_context will always be NULL for radeonsi so we need to
work around this.
---
 src/amd/common/ac_nir_to_llvm.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 5203b78537..a729fe5f6d 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3826,19 +3826,18 @@ static void emit_membar(struct nir_to_llvm_context *ctx,
ac_build_waitcnt(&ctx->ac, waitcnt);
 }
 
-static void emit_barrier(struct nir_to_llvm_context *ctx)
+static void emit_barrier(struct ac_llvm_context *ac, gl_shader_stage stage)
 {
/* SI only (thanks to a hw bug workaround):
 * The real barrier instruction isn’t needed, because an entire patch
 * always fits into a single wave.
 */
-   if (ctx->options->chip_class == SI &&
-   ctx->stage == MESA_SHADER_TESS_CTRL) {
-   ac_build_waitcnt(&ctx->ac, LGKM_CNT & VM_CNT);
+   if (ac->chip_class == SI && stage == MESA_SHADER_TESS_CTRL) {
+   ac_build_waitcnt(ac, LGKM_CNT & VM_CNT);
return;
}
-   ac_build_intrinsic(&ctx->ac, "llvm.amdgcn.s.barrier",
-  ctx->ac.voidt, NULL, 0, AC_FUNC_ATTR_CONVERGENT);
+   ac_build_intrinsic(ac, "llvm.amdgcn.s.barrier",
+  ac->voidt, NULL, 0, AC_FUNC_ATTR_CONVERGENT);
 }
 
 static void emit_discard_if(struct ac_nir_context *ctx,
@@ -4331,7 +4330,7 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
emit_membar(ctx->nctx, instr);
break;
case nir_intrinsic_barrier:
-   emit_barrier(ctx->nctx);
+   emit_barrier(&ctx->ac, ctx->stage);
break;
case nir_intrinsic_var_atomic_add:
case nir_intrinsic_var_atomic_imin:
@@ -6169,7 +6168,7 @@ write_tess_factors(struct nir_to_llvm_context *ctx)
LLVMValueRef lds_base, lds_inner, lds_outer, byteoffset, buffer;
LLVMValueRef out[6], vec0, vec1, tf_base, inner[4], outer[4];
int i;
-   emit_barrier(ctx);
+   emit_barrier(&ctx->ac, ctx->stage);
 
switch (ctx->options->key.tcs.primitive_mode) {
case GL_ISOLINES:
@@ -6712,7 +6711,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
}
 
if (i)
-   emit_barrier(&ctx);
+   emit_barrier(&ctx.ac, ctx.stage);
 
ac_setup_rings(&ctx);
 
-- 
2.14.3
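
The idea behind the rework, reduced to a standalone sketch: the helper takes
only the two things it reads (the LLVM-side context and the shader stage)
instead of a larger backend context that one caller may not have. Every name
below is a hypothetical stand-in rather than the Mesa definition, and the
counters merely replace the real instruction emission so the behaviour can be
checked.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the chip generation and shader stage enums. */
enum chip_class { CHIP_SI, CHIP_CIK };
enum shader_stage { STAGE_VERTEX, STAGE_TESS_CTRL };

/* Hypothetical narrow context: always available to every caller. */
struct llvm_ctx {
    enum chip_class chip_class;
    int waitcnts_emitted;   /* counts the "wait only" path */
    int barriers_emitted;   /* counts the real barrier path */
};

/* The helper depends only on what it actually uses, so a caller without
 * the larger backend context can still call it. */
static void emit_barrier(struct llvm_ctx *ac, enum shader_stage stage)
{
    /* Mirrors the special case in the patch: on SI a tessellation control
     * shader only needs the wait, not the barrier instruction. */
    if (ac->chip_class == CHIP_SI && stage == STAGE_TESS_CTRL) {
        ac->waitcnts_emitted++;
        return;
    }
    ac->barriers_emitted++;
}
```

A caller that does own a bigger context simply passes the address of its
embedded narrow context together with its stage, which is the shape of the
updated call sites in the patch.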



[Mesa-dev] [PATCH] genxml: Add missing INSTDONE_1 bits on Gen7.5+.

2018-01-07 Thread Kenneth Graunke
This will make aubinator_error_decode decode them properly.
---
 src/intel/genxml/gen10.xml | 2 ++
 src/intel/genxml/gen75.xml | 2 ++
 src/intel/genxml/gen8.xml  | 2 ++
 src/intel/genxml/gen9.xml  | 2 ++
 4 files changed, 8 insertions(+)

diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
index a6b8f48fda5..47c679a3fa9 100644
--- a/src/intel/genxml/gen10.xml
+++ b/src/intel/genxml/gen10.xml
@@ -3637,6 +3637,8 @@
 
 
 
+
+
 
 
   
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index e2fd856197d..be537aff0ae 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -3046,6 +3046,8 @@
 
 
 
+
+
 
 
   
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index a89283ded6b..c075eecc34a 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3300,6 +3300,8 @@
 
 
 
+
+
 
 
   
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 1422463693d..2533ae8629f 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3583,6 +3583,8 @@
 
 
 
+
+
 
 
   
-- 
2.15.1



[Mesa-dev] [PATCH 2/2] ac: add load_tess_level() to the abi

2018-01-07 Thread Timothy Arceri
Fixes the following piglit tests in radeonsi:

vs-tcs-tes-tessinner-tessouter-inputs-quads.shader_test
vs-tcs-tes-tessinner-tessouter-inputs-tris.shader_test
vs-tes-tessinner-tessouter-inputs-quads.shader_test
vs-tes-tessinner-tessouter-inputs-tris.shader_test

v2: make use of si_shader_io_get_unique_index_patch()
via the helper in the previous patch rather than
shader_io_get_unique_index()

Reviewed-by: Nicolai Hähnle  (v1)
---
 src/amd/common/ac_nir_to_llvm.c  |  6 ++
 src/amd/common/ac_shader_abi.h   |  4 
 src/gallium/drivers/radeonsi/si_shader.c | 22 ++
 3 files changed, 32 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 48e2920a15..5203b78537 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -4364,6 +4364,12 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
result = ctx->abi->load_tess_coord(ctx->abi, type, 
instr->num_components);
break;
}
+   case nir_intrinsic_load_tess_level_outer:
+   result = ctx->abi->load_tess_level(ctx->abi, 
VARYING_SLOT_TESS_LEVEL_OUTER);
+   break;
+   case nir_intrinsic_load_tess_level_inner:
+   result = ctx->abi->load_tess_level(ctx->abi, 
VARYING_SLOT_TESS_LEVEL_INNER);
+   break;
case nir_intrinsic_load_patch_vertices_in:
result = LLVMConstInt(ctx->ac.i32, 
ctx->nctx->options->key.tcs.input_vertices, false);
break;
diff --git a/src/amd/common/ac_shader_abi.h b/src/amd/common/ac_shader_abi.h
index 277e4efe47..e3a47089a5 100644
--- a/src/amd/common/ac_shader_abi.h
+++ b/src/amd/common/ac_shader_abi.h
@@ -103,6 +103,10 @@ struct ac_shader_abi {
LLVMTypeRef type,
unsigned num_components);
 
+   LLVMValueRef (*load_tess_level)(struct ac_shader_abi *abi,
+   unsigned varying_id);
+
+
LLVMValueRef (*load_ubo)(struct ac_shader_abi *abi, LLVMValueRef index);
 
/**
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index e579916359..86f3f7a8ba 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1934,6 +1934,27 @@ static LLVMValueRef load_tess_level(struct 
si_shader_context *ctx,
 
 }
 
+static LLVMValueRef si_load_tess_level(struct ac_shader_abi *abi,
+  unsigned varying_id)
+{
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
+   unsigned semantic_name;
+
+   switch (varying_id) {
+   case VARYING_SLOT_TESS_LEVEL_INNER:
+   semantic_name = TGSI_SEMANTIC_TESSINNER;
+   break;
+   case VARYING_SLOT_TESS_LEVEL_OUTER:
+   semantic_name = TGSI_SEMANTIC_TESSOUTER;
+   break;
+   default:
+   unreachable("unknown tess level");
+   }
+
+   return load_tess_level(ctx, semantic_name);
+
+}
+
 void si_load_system_value(struct si_shader_context *ctx,
  unsigned index,
  const struct tgsi_full_declaration *decl)
@@ -5971,6 +5992,7 @@ static bool si_compile_tgsi_main(struct si_shader_context 
*ctx,
bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_tes;
ctx->abi.load_tess_inputs = si_nir_load_input_tes;
ctx->abi.load_tess_coord = si_load_tess_coord;
+   ctx->abi.load_tess_level = si_load_tess_level;
if (shader->key.as_es)
ctx->abi.emit_outputs = si_llvm_emit_es_epilogue;
else
-- 
2.14.3
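
For readers unfamiliar with the abi callback pattern used here, a minimal
sketch follows. It assumes the driver context embeds the abi struct, so the
callback can recover its own context from the abi pointer; the types, slot
constants and float return values are hypothetical simplifications, not the
Mesa definitions (the real callback returns an LLVMValueRef).

```c
#include <stddef.h>

/* Hypothetical slot identifiers standing in for the varying slots. */
enum { SLOT_TESS_LEVEL_OUTER, SLOT_TESS_LEVEL_INNER };

/* Shared interface: common code only ever sees this struct. */
struct shader_abi {
    float (*load_tess_level)(struct shader_abi *abi, unsigned varying_id);
};

/* Hypothetical driver context that embeds the shared interface. */
struct driver_ctx {
    float tess_outer;
    float tess_inner;
    struct shader_abi abi;
};

/* Recover the enclosing driver context from a pointer to its abi member. */
static struct driver_ctx *driver_ctx_from_abi(struct shader_abi *abi)
{
    return (struct driver_ctx *)((char *)abi - offsetof(struct driver_ctx, abi));
}

/* Driver-side implementation: translate the generic slot into whatever the
 * driver stores internally, then load it. */
static float driver_load_tess_level(struct shader_abi *abi, unsigned varying_id)
{
    struct driver_ctx *ctx = driver_ctx_from_abi(abi);

    switch (varying_id) {
    case SLOT_TESS_LEVEL_INNER:
        return ctx->tess_inner;
    case SLOT_TESS_LEVEL_OUTER:
        return ctx->tess_outer;
    default:
        return 0.0f;
    }
}
```

Common code calls abi->load_tess_level(abi, slot) without knowing which
driver is behind it, which is what lets the NIR path in ac_nir_to_llvm.c stay
driver-agnostic.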



[Mesa-dev] [PATCH 1/2] radeonsi: add load_tess_level() helper

2018-01-07 Thread Timothy Arceri
This will be shared by the tgsi and nir backends.

v2: move si_shader_io_get_unique_index_patch() call inside
the helper.

Reviewed-by: Nicolai Hähnle  (v1)
---
 src/gallium/drivers/radeonsi/si_shader.c | 33 ++--
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index f6e3083e4c..e579916359 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1916,6 +1916,24 @@ static LLVMValueRef si_load_tess_coord(struct 
ac_shader_abi *abi,
return lp_build_gather_values(&ctx->gallivm, coord, 4);
 }
 
+static LLVMValueRef load_tess_level(struct si_shader_context *ctx,
+   unsigned semantic_name)
+{
+   LLVMValueRef buffer, base, addr;
+
+   int param = si_shader_io_get_unique_index_patch(semantic_name, 0);
+
+   buffer = desc_from_addr_base64k(ctx, 
ctx->param_tcs_offchip_addr_base64k);
+
+   base = LLVMGetParam(ctx->main_fn, ctx->param_tcs_offchip_offset);
+   addr = get_tcs_tes_buffer_address(ctx, get_rel_patch_id(ctx), NULL,
+ LLVMConstInt(ctx->i32, param, 0));
+
+   return buffer_load(&ctx->bld_base, ctx->f32,
+  ~0, buffer, base, addr, true);
+
+}
+
 void si_load_system_value(struct si_shader_context *ctx,
  unsigned index,
  const struct tgsi_full_declaration *decl)
@@ -2034,21 +2052,8 @@ void si_load_system_value(struct si_shader_context *ctx,
 
case TGSI_SEMANTIC_TESSINNER:
case TGSI_SEMANTIC_TESSOUTER:
-   {
-   LLVMValueRef buffer, base, addr;
-   int param = 
si_shader_io_get_unique_index_patch(decl->Semantic.Name, 0);
-
-   buffer = desc_from_addr_base64k(ctx, 
ctx->param_tcs_offchip_addr_base64k);
-
-   base = LLVMGetParam(ctx->main_fn, 
ctx->param_tcs_offchip_offset);
-   addr = get_tcs_tes_buffer_address(ctx, get_rel_patch_id(ctx), 
NULL,
- LLVMConstInt(ctx->i32, param, 0));
-
-   value = buffer_load(&ctx->bld_base, ctx->f32,
-   ~0, buffer, base, addr, true);
-
+   value = load_tess_level(ctx, decl->Semantic.Name);
break;
-   }
 
case TGSI_SEMANTIC_DEFAULT_TESSOUTER_SI:
case TGSI_SEMANTIC_DEFAULT_TESSINNER_SI:
-- 
2.14.3



Re: [Mesa-dev] [PATCH 3/3] radv: Implement VK_ANDROID_native_buffer.

2018-01-07 Thread Dave Airlie
On 5 January 2018 at 03:38, Bas Nieuwenhuizen  wrote:
> Passes
>   dEQP-VK.api.smoke.*
>   dEQP-VK.wsi.android.*
>
> with android-cts-7.1_r12 .
>
> Unlike the initial anv implementation this does
> use syncobjs instead of waiting on the CPU.
>
> This is missing meson build coverage for now.
>
> One possible todo is that linux 4.15 now has a
> syscall that allows us to export amdgpu fence to
> a sync_file, which allows us not to force all
> fences and semaphores to use syncobjs. However,
> I had trouble with my kernel crashing regularly
> with NULL pointers, and I'm not sure how beneficial
> it is in the first place given that intel uses
> syncobjs for all fences if available.

Happy to have these merged as-is, but I do wonder if we could refactor out most
of radv_android.c into a wsi_android or some such to share with anv.

The current WSI already has some magic flags to pass things like the
no space for cmask flag.

For all 3:
Reviewed-by: Dave Airlie 

Dave.

> ---
>  src/amd/vulkan/Makefile.am  |   7 +
>  src/amd/vulkan/Makefile.sources |   3 +
>  src/amd/vulkan/meson.build  |   4 +-
>  src/amd/vulkan/radv_android.c   | 366 
> 
>  src/amd/vulkan/radv_device.c|   7 +-
>  src/amd/vulkan/radv_image.c |  12 ++
>  src/amd/vulkan/radv_private.h   |  12 ++
>  7 files changed, 407 insertions(+), 4 deletions(-)
>  create mode 100644 src/amd/vulkan/radv_android.c
>
> diff --git a/src/amd/vulkan/Makefile.am b/src/amd/vulkan/Makefile.am
> index e1a04e8c7f..a4e23cd28e 100644
> --- a/src/amd/vulkan/Makefile.am
> +++ b/src/amd/vulkan/Makefile.am
> @@ -99,6 +99,13 @@ VULKAN_LIB_DEPS += \
> $(WAYLAND_CLIENT_LIBS)
>  endif
>
> +if HAVE_PLATFORM_ANDROID
> +AM_CPPFLAGS += $(ANDROID_CPPFLAGS)
> +AM_CFLAGS += $(ANDROID_CFLAGS)
> +VULKAN_LIB_DEPS += $(ANDROID_LIBS)
> +VULKAN_SOURCES += $(VULKAN_ANDROID_FILES)
> +endif
> +
>  noinst_LTLIBRARIES = libvulkan_common.la
>  libvulkan_common_la_SOURCES = $(VULKAN_SOURCES)
>
> diff --git a/src/amd/vulkan/Makefile.sources b/src/amd/vulkan/Makefile.sources
> index c9d172c3b1..a510d88d96 100644
> --- a/src/amd/vulkan/Makefile.sources
> +++ b/src/amd/vulkan/Makefile.sources
> @@ -69,6 +69,9 @@ VULKAN_FILES := \
> vk_format.h \
> $(RADV_WS_AMDGPU_FILES)
>
> +VULKAN_ANDROID_FILES := \
> +   radv_android.c
> +
>  VULKAN_WSI_WAYLAND_FILES := \
> radv_wsi_wayland.c
>
> diff --git a/src/amd/vulkan/meson.build b/src/amd/vulkan/meson.build
> index 93997350a2..fa23d28fbb 100644
> --- a/src/amd/vulkan/meson.build
> +++ b/src/amd/vulkan/meson.build
> @@ -29,10 +29,10 @@ radv_entrypoints = custom_target(
>
>  radv_extensions_c = custom_target(
>'radv_extensions.c',
> -  input : ['radv_extensions.py', vk_api_xml],
> +  input : ['radv_extensions.py', vk_api_xml, vk_android_native_buffer_xml],
>output : ['radv_extensions.c'],
>command : [prog_python2, '@INPUT0@', '--xml', '@INPUT1@',
> - '--out', '@OUTPUT@'],
> + '--xml', '@INPUT2@', '--out', '@OUTPUT@'],
>  )
>
>  vk_format_table_c = custom_target(
> diff --git a/src/amd/vulkan/radv_android.c b/src/amd/vulkan/radv_android.c
> new file mode 100644
> index 00..09da601dac
> --- /dev/null
> +++ b/src/amd/vulkan/radv_android.c
> @@ -0,0 +1,366 @@
> +/*
> + * Copyright © 2017, Google Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "radv_private.h"
> +
> +static int radv_hal_open(const struct hw_module_t* mod, const char* id, 
> struct hw_device_t** dev);
> +static int radv_hal_close(struct hw_device_t *dev);
> +
> +static void UNUSED
> +static_asserts(void)
> +{
> +   STATIC_ASSERT(HWVULKAN_DISPATCH_MAGIC == ICD_LOADER_MAGIC);
> +}
> +
> +PUBLIC struct hwvulkan_module_t 

Re: [Mesa-dev] [PATCH] glsl: Respect std430 layout in lower_buffer_access

2018-01-07 Thread Timothy Arceri

Ccing stable.

I've pushed this patch and the piglit test. Thanks for the patches :)

On 06/01/18 01:33, Florian Will wrote:

Respect the std430 rules for determining offset and size of struct
members when using a std430 buffer. std140 rules lead to wrong buffer
offsets in that case.

Fixes my test case attached in Bugzilla. No piglit changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104492
---
  src/compiler/glsl/lower_buffer_access.cpp | 14 ++
  1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/lower_buffer_access.cpp 
b/src/compiler/glsl/lower_buffer_access.cpp
index db6e8e367b..ff6f9c1fcf 100644
--- a/src/compiler/glsl/lower_buffer_access.cpp
+++ b/src/compiler/glsl/lower_buffer_access.cpp
@@ -73,16 +73,22 @@ lower_buffer_access::emit_access(void *mem_ctx,
  new(mem_ctx) ir_dereference_record(deref->clone(mem_ctx, NULL),
 field->name);
  
- field_offset =
-glsl_align(field_offset,
-   field->type->std140_base_alignment(row_major));
+ unsigned field_align;
+ if (packing == GLSL_INTERFACE_PACKING_STD430)
+field_align = field->type->std430_base_alignment(row_major);
+ else
+field_align = field->type->std140_base_alignment(row_major);
+ field_offset = glsl_align(field_offset, field_align);
  
   emit_access(mem_ctx, is_write, field_deref, base_offset,
   deref_offset + field_offset,
   row_major, NULL, packing,
   writemask_for_size(field_deref->type->vector_elements));
  
- field_offset += field->type->std140_size(row_major);
+ if (packing == GLSL_INTERFACE_PACKING_STD430)
+field_offset += field->type->std430_size(row_major);
+ else
+field_offset += field->type->std140_size(row_major);
}
return;
 }
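
To see why applying std140 rules to a std430 buffer gives wrong offsets,
here is a small standalone sketch that lays out the members of
"struct { float a; float b[2]; float c; }" under both rule sets. It only
models scalar floats and float arrays (std140 rounds the array element
alignment and stride up to 16 bytes, std430 does not); the helper names are
made up for this example and are not the glsl_type methods.

```c
#include <stddef.h>

struct field_layout {
    unsigned align; /* base alignment in bytes */
    unsigned size;  /* size in bytes */
};

/* Round value up to the next multiple of alignment. */
static unsigned align_up(unsigned value, unsigned alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

/* A scalar float is 4 bytes with 4-byte alignment under both layouts. */
static struct field_layout float_scalar(void)
{
    struct field_layout l = { 4, 4 };
    return l;
}

/* A float array: std140 rounds alignment and stride up to a vec4
 * (16 bytes), std430 keeps the natural 4-byte stride. */
static struct field_layout float_array(unsigned len, int is_std430)
{
    unsigned stride = is_std430 ? 4 : 16;
    struct field_layout l = { stride, stride * len };
    return l;
}

/* Compute member offsets for: struct { float a; float b[2]; float c; } */
static void layout_example(int is_std430, unsigned offsets[3])
{
    struct field_layout fields[3];
    unsigned offset = 0;

    fields[0] = float_scalar();
    fields[1] = float_array(2, is_std430);
    fields[2] = float_scalar();

    for (size_t i = 0; i < 3; i++) {
        /* Same shape as the loop in the patch: align, record, advance. */
        offset = align_up(offset, fields[i].align);
        offsets[i] = offset;
        offset += fields[i].size;
    }
}
```

Under these assumptions the members land at offsets 0, 16 and 48 with std140
rules but at 0, 4 and 12 with std430 rules, so a lowering pass that always
uses the std140 numbers reads the wrong bytes from a std430 buffer.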




Re: [Mesa-dev] [PATCH v3 00/30] Nir support for Nouveau

2018-01-07 Thread Ilia Mirkin
On Sun, Jan 7, 2018 at 7:24 PM, Connor Abbott  wrote:
> On Sun, Jan 7, 2018 at 6:50 PM, Ilia Mirkin  wrote:
>> On Sun, Jan 7, 2018 at 6:39 PM, Connor Abbott  wrote:
>>> On Sun, Jan 7, 2018 at 3:42 PM, Karol Herbst  wrote:
>>>> significant changes to last series:
>>>> * disable support for 64 bit types
>>>> * fix tessellation shader bugs
>>>> * assume vec4 elements for variable index arrays (MemoryOpts workaround)
>>>>
>>>> piglit run -x glx -x egl -x streaming-texture-leak -x max-texture-size tests/gpu.py:
>>>> [26010/26010] skip: 10410, pass: 15386, warn: 9, fail: 191, crash: 14
>>>>
>>>> remaining issues:
>>>> * transform feedback with geometry shaders
>>>> * indirects in image_load/store
>>>> * interpolateAt
>>>> * getting 64 bit types to work. This is mainly limited by codegen RA being
>>>>   not able to handle those correctly, because from_TGSI just generates merge
>>>>   and splits and doesn't hit the faulty paths.
>>>
>>> Just curious... what's the issue with register allocation?
>>
>> There are probably a few, but at least one is that it wants both sides
>> of a phi to be compounds (or neither). So if you have e.g.
>>
>> x int64;
>> if (foo) {
>>   x = a + b;
>> } else {
>>   x = (c, d) [i.e. high word is one, low word is the other]
>> }
>> use(x)
>>
>> Then you may end up with a 64-bit phi, with one of the arguments that
>> is a compound (the merge), and one of which isn't. The RA doesn't
>> handle that situation.
>>
>> I believe currently we have enough splits/merges that everything
>> that's 64-bit ends up being marked as a compound.
>>
>> I wonder if we can "just do that" for 64-bit values - i.e. auto-mark
>> them as compound. It all requires carefully figuring out how RA works
>> with all of this.
>
> How are 64 bit values handled on nvidia? Can you join any 2 32-bit
> registers to get a 64-bit one? Or do they have to be contiguous to use
> the 64-bit ops? Or are there separate 64-bit registers entirely? Or
> something else entirely?

Adjacent register pairs, starting with an even register (in all but
the oddest of cases... like the 64-bit SHL inputs). Also 128-bit
groups can end up being used (for "wide" loads/stores as well as
texture arguments). And on nv50, 16-bit values are a thing (in fact
every reg is "half" addressable, al/ah-style, but only a tiny handful
of operations can make use of that, notably integer mul, for which
there is no 32-bit variant).
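
As an illustration of the "adjacent pair starting at an even register"
constraint, here is a toy sketch that searches a bitmask of occupied 32-bit
registers for a free 64-bit slot. The 64-entry register file and the function
name are assumptions made for this example; this is not how nv50_ir_ra.cpp is
structured.

```c
#include <stdint.h>

/* Returns the lowest even register index r such that both r and r + 1 are
 * free, or -1 if no such pair exists. Bit i of `used` is set when 32-bit
 * register i is occupied (hypothetical 64-register file). */
static int find_free_reg_pair(uint64_t used)
{
    for (int r = 0; r + 1 < 64; r += 2) {
        uint64_t pair = (uint64_t)3 << r;   /* bits r and r + 1 */

        if ((used & pair) == 0)
            return r;
    }
    return -1;
}
```

Note that two free registers are not enough on their own: with registers 1
and 2 occupied, registers 0 and 3 are both free but cannot form a pair, so
the first usable slot is register 4.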

>
> It sounds like the current approach is doing something wrong. The

Oh, yeah, it's totally a bug in the RA. My observation was that it's
easy to work around, and difficult to fix RA.

> example you mentioned shouldn't be too hard to handle. Typically, you
> have something like packUint2x32() and unpackUint2x32() in GLSL as
> built-in pseudoinstructions that get expanded after RA, and then the
> output of packUint2x32 is just a regular ol' 64-bit value, and the
> register allocator shouldn't care one bit about where that value came
> from - it should handle both sides of your if statement pretty much
> identically. Basically, it's a funky kind of move. If you want to
> coalesce the packUint2x32 before RA in your example, then things get a
> little trickier since 32-bit registers might interfere with a
> subregister of x. But that can come later.

In case you're curious,

https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp#n1367

That's the assert being hit. Basically the idea is that you have
LValue's, and when they're coalesced (as they are when you have a
phi), one copies the defs from one list to another. This assert is
complaining about some defs being "compound" LValue's, and others not.
I haven't the faintest clue what "compound" means, unfortunately, or
why this check is there, or what the precise handling of compounds is.
Fixing this particular issue will require someone to dig into it.

  -ilia


[Mesa-dev] [PATCH 6/7] radeonsi: pass input_idx to declare_nir_input_vs()

2018-01-07 Thread Timothy Arceri
This makes it consistent with declare_nir_input_fs() and will allow
us to support doubles.
---
 src/gallium/drivers/radeonsi/si_shader_nir.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c 
b/src/gallium/drivers/radeonsi/si_shader_nir.c
index fa61f42830..d05eb36d80 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -520,9 +520,10 @@ si_lower_nir(struct si_shader_selector* sel)
 
 static void declare_nir_input_vs(struct si_shader_context *ctx,
 struct nir_variable *variable,
+unsigned input_index,
 LLVMValueRef out[4])
 {
-   si_llvm_load_input_vs(ctx, variable->data.driver_location / 4, out);
+   si_llvm_load_input_vs(ctx, input_index, out);
 }
 
 static void declare_nir_input_fs(struct si_shader_context *ctx,
@@ -644,7 +645,7 @@ bool si_nir_build_llvm(struct si_shader_context *ctx, 
struct nir_shader *nir)
continue;
 
if (nir->info.stage == MESA_SHADER_VERTEX) {
-   declare_nir_input_vs(ctx, variable, data);
+   declare_nir_input_vs(ctx, variable, input_idx / 
4, data);
bitcast_inputs(ctx, data, input_idx);
} else if (nir->info.stage == MESA_SHADER_FRAGMENT) {
declare_nir_input_fs(ctx, variable, input_idx / 
4, data);
-- 
2.14.3



[Mesa-dev] [PATCH 7/7] radeonsi/nir: add support vs double inputs

2018-01-07 Thread Timothy Arceri
---
 src/gallium/drivers/radeonsi/si_shader_nir.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c 
b/src/gallium/drivers/radeonsi/si_shader_nir.c
index d05eb36d80..2218b7ad81 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -647,6 +647,11 @@ bool si_nir_build_llvm(struct si_shader_context *ctx, 
struct nir_shader *nir)
if (nir->info.stage == MESA_SHADER_VERTEX) {
declare_nir_input_vs(ctx, variable, input_idx / 
4, data);
bitcast_inputs(ctx, data, input_idx);
+   if (glsl_type_is_dual_slot(variable->type)) {
+   input_idx += 4;
+   declare_nir_input_vs(ctx, variable, 
input_idx / 4, data);
+   bitcast_inputs(ctx, data, input_idx);
+   }
} else if (nir->info.stage == MESA_SHADER_FRAGMENT) {
declare_nir_input_fs(ctx, variable, input_idx / 
4, data);
bitcast_inputs(ctx, data, input_idx);
-- 
2.14.3



[Mesa-dev] [PATCH 5/7] radeonsi: add bitcast_inputs() helper

2018-01-07 Thread Timothy Arceri
Will be used in a following patch to help support doubles.
---
 src/gallium/drivers/radeonsi/si_shader_nir.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c 
b/src/gallium/drivers/radeonsi/si_shader_nir.c
index 2d0e3e207e..fa61f42830 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -610,6 +610,16 @@ si_nir_load_sampler_desc(struct ac_shader_abi *abi,
return si_load_sampler_desc(ctx, list, index, desc_type);
 }
 
+static void bitcast_inputs(struct si_shader_context *ctx,
+  LLVMValueRef data[4],
+  unsigned input_idx)
+{
+   for (unsigned chan = 0; chan < 4; chan++) {
+   ctx->inputs[input_idx + chan] =
+   LLVMBuildBitCast(ctx->ac.builder, data[chan], 
ctx->ac.i32, "");
+   }
+}
+
 bool si_nir_build_llvm(struct si_shader_context *ctx, struct nir_shader *nir)
 {
struct tgsi_shader_info *info = &ctx->shader->selector->info;
@@ -633,15 +643,14 @@ bool si_nir_build_llvm(struct si_shader_context *ctx, 
struct nir_shader *nir)
if (processed_inputs & ((uint64_t)1 << loc))
continue;
 
-   if (nir->info.stage == MESA_SHADER_VERTEX)
+   if (nir->info.stage == MESA_SHADER_VERTEX) {
declare_nir_input_vs(ctx, variable, data);
-   else if (nir->info.stage == MESA_SHADER_FRAGMENT)
+   bitcast_inputs(ctx, data, input_idx);
+   } else if (nir->info.stage == MESA_SHADER_FRAGMENT) {
declare_nir_input_fs(ctx, variable, input_idx / 
4, data);
-
-   for (unsigned chan = 0; chan < 4; chan++) {
-   ctx->inputs[input_idx + chan] =
-   LLVMBuildBitCast(ctx->ac.builder, 
data[chan], ctx->ac.i32, "");
+   bitcast_inputs(ctx, data, input_idx);
}
+
processed_inputs |= ((uint64_t)1 << loc);
}
}
-- 
2.14.3



[Mesa-dev] [PATCH 2/7] nir: add vs_inputs_dual_locations compiler option

2018-01-07 Thread Timothy Arceri
Allows nir drivers to use either a single location or dual locations for
vs double inputs.

i965 uses dual locations for both OpenGL and Vulkan drivers; for
now gallium OpenGL drivers only use a single location.

The following patch will also make use of this option when
calling nir_shader_gather_info().
---
 src/amd/vulkan/radv_shader.c  |  1 +
 src/compiler/glsl/glsl_to_nir.cpp | 14 +-
 src/compiler/nir/nir.h|  6 ++
 src/intel/compiler/brw_compiler.c |  3 +++
 4 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 31879805ae..18e3a3f144 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -66,6 +66,7 @@ static const struct nir_shader_compiler_options nir_options = 
{
.lower_extract_byte = true,
.lower_extract_word = true,
.lower_ffma = true,
+   .vs_inputs_dual_locations = true,
.max_unroll_iterations = 32
 };
 
diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index 505c99bbe3..4e3e9c4610 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -130,11 +130,15 @@ private:
 } /* end of anonymous namespace */
 
 static void
-nir_remap_attributes(nir_shader *shader)
+nir_remap_attributes(nir_shader *shader,
+ const nir_shader_compiler_options *options)
 {
-   nir_foreach_variable(var, &shader->inputs) {
-  var->data.location += _mesa_bitcount_64(shader->info.vs.double_inputs &
-  
BITFIELD64_MASK(var->data.location));
+   if (options->vs_inputs_dual_locations) {
+  nir_foreach_variable(var, &shader->inputs) {
+ var->data.location +=
+_mesa_bitcount_64(shader->info.vs.double_inputs &
+  BITFIELD64_MASK(var->data.location));
+  }
}
 
/* Once the remap is done, reset double_inputs_read, so later it will have
@@ -164,7 +168,7 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
 * location 0 and vec4 attr1 in location 1, in NIR attr0 will use
 * locations/slots 0 and 1, and attr1 will use location/slot 2 */
if (shader->info.stage == MESA_SHADER_VERTEX)
-  nir_remap_attributes(shader);
+  nir_remap_attributes(shader, options);
 
shader->info.name = ralloc_asprintf(shader, "GLSL%d", shader_prog->Name);
if (shader_prog->Label)
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 440c3fe997..4eb9c80ecd 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1892,6 +1892,12 @@ typedef struct nir_shader_compiler_options {
 */
bool use_interpolated_input_intrinsics;
 
+   /**
+* Do vertex shader double inputs use two locations? The Vulkan spec
+* requires two locations to be used, OpenGL allows a single location.
+*/
+   bool vs_inputs_dual_locations;
+
unsigned max_unroll_iterations;
 } nir_shader_compiler_options;
 
diff --git a/src/intel/compiler/brw_compiler.c 
b/src/intel/compiler/brw_compiler.c
index e89aeacc7d..e515559acb 100644
--- a/src/intel/compiler/brw_compiler.c
+++ b/src/intel/compiler/brw_compiler.c
@@ -57,6 +57,7 @@ static const struct nir_shader_compiler_options 
scalar_nir_options = {
.lower_unpack_snorm_4x8 = true,
.lower_unpack_unorm_2x16 = true,
.lower_unpack_unorm_4x8 = true,
+   .vs_inputs_dual_locations = true,
.max_unroll_iterations = 32,
 };
 
@@ -78,6 +79,7 @@ static const struct nir_shader_compiler_options 
vector_nir_options = {
.lower_unpack_unorm_2x16 = true,
.lower_extract_byte = true,
.lower_extract_word = true,
+   .vs_inputs_dual_locations = true,
.max_unroll_iterations = 32,
 };
 
@@ -96,6 +98,7 @@ static const struct nir_shader_compiler_options 
vector_nir_options_gen6 = {
.lower_unpack_unorm_2x16 = true,
.lower_extract_byte = true,
.lower_extract_word = true,
+   .vs_inputs_dual_locations = true,
.max_unroll_iterations = 32,
 };
 
-- 
2.14.3
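
For readers following the thread, the remapping that nir_remap_attributes() performs when vs_inputs_dual_locations is set can be sketched in isolation. This is only an illustration in plain C: mask_below(), popcount64() and remap_location() are hypothetical stand-ins for BITFIELD64_MASK, _mesa_bitcount_64 and the loop body, not Mesa code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for BITFIELD64_MASK: the low `bits` bits set. */
static uint64_t mask_below(unsigned bits)
{
   return bits >= 64 ? ~0ull : ((1ull << bits) - 1);
}

/* Portable population count (stand-in for _mesa_bitcount_64). */
static unsigned popcount64(uint64_t v)
{
   unsigned n = 0;
   while (v) {
      v &= v - 1;   /* clear the lowest set bit */
      n++;
   }
   return n;
}

/* Shift a GL-style location up by one for every double input below it. */
static unsigned remap_location(unsigned location, uint64_t double_inputs)
{
   assert(location < 64);
   return location + popcount64(double_inputs & mask_below(location));
}
```

With a dvec4 at location 0 and a vec4 at location 1 (double_inputs = 0x1), remap_location() leaves attr0 at 0 and moves attr1 to 2, which matches the comment in glsl_to_nir().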

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] radeonsi nir arb_vertex_attrib_64bit fixes

2018-01-07 Thread Timothy Arceri
This series fixes all of the failing arb_vertex_attrib_64bit
piglit tests (~1000).



[Mesa-dev] [PATCH 3/7] nir: partially revert c2acf97fcc9b32e

2018-01-07 Thread Timothy Arceri
c2acf97fcc9b32e changed the use of double_inputs_read to be
inconsistent with its previous meaning. Here we re-enable the
gather info code that was removed as the modified code from
c2acf97fcc9b32e now uses the double_inputs member rather than
double_inputs_read.

This change allows us to use double_inputs_read with gallium
drivers without impacting double_inputs which is used by i965.

We also make use of the compiler option vs_inputs_dual_locations
to allow for the difference in behaviour between drivers that handle
vs inputs as taking up two locations for doubles, versus those that
treat them as taking a single location.
---
 src/compiler/nir/nir_gather_info.c | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/src/compiler/nir/nir_gather_info.c 
b/src/compiler/nir/nir_gather_info.c
index e98129b22c..743f968035 100644
--- a/src/compiler/nir/nir_gather_info.c
+++ b/src/compiler/nir/nir_gather_info.c
@@ -54,6 +54,11 @@ set_io_mask(nir_shader *shader, nir_variable *var, int 
offset, int len,
  else
 shader->info.inputs_read |= bitfield;
 
+ /* double inputs read is only for vertex inputs */
+ if (shader->info.stage == MESA_SHADER_VERTEX &&
+ glsl_type_is_dual_slot(glsl_without_array(var->type)))
+shader->info.vs.double_inputs_read |= bitfield;
+
  if (shader->info.stage == MESA_SHADER_FRAGMENT) {
 shader->info.fs.uses_sample_qualifier |= var->data.sample;
  }
@@ -88,21 +93,27 @@ static void
 mark_whole_variable(nir_shader *shader, nir_variable *var, bool is_output_read)
 {
const struct glsl_type *type = var->type;
+   bool is_vertex_input = false;
 
if (nir_is_per_vertex_io(var, shader->info.stage)) {
   assert(glsl_type_is_array(type));
   type = glsl_get_array_element(type);
}
 
+   if (!shader->options->vs_inputs_dual_locations &&
+   shader->info.stage == MESA_SHADER_VERTEX &&
+   var->data.mode == nir_var_shader_in)
+  is_vertex_input = true;
+
const unsigned slots =
   var->data.compact ? DIV_ROUND_UP(glsl_get_length(type), 4)
-: glsl_count_attribute_slots(type, false);
+: glsl_count_attribute_slots(type, is_vertex_input);
 
set_io_mask(shader, var, 0, slots, is_output_read);
 }
 
 static unsigned
-get_io_offset(nir_deref_var *deref)
+get_io_offset(nir_deref_var *deref, bool is_vertex_input)
 {
unsigned offset = 0;
 
@@ -117,7 +128,7 @@ get_io_offset(nir_deref_var *deref)
 return -1;
  }
 
- offset += glsl_count_attribute_slots(tail->type, false) *
+ offset += glsl_count_attribute_slots(tail->type, is_vertex_input) *
 deref_array->base_offset;
   }
   /* TODO: we can get the offset for structs here see nir_lower_io() */
@@ -163,7 +174,13 @@ try_mask_partial_io(nir_shader *shader, nir_deref_var 
*deref, bool is_output_rea
   return false;
}
 
-   unsigned offset = get_io_offset(deref);
+   bool is_vertex_input = false;
+   if (!shader->options->vs_inputs_dual_locations &&
+   shader->info.stage == MESA_SHADER_VERTEX &&
+   var->data.mode == nir_var_shader_in)
+  is_vertex_input = true;
+
+   unsigned offset = get_io_offset(deref, is_vertex_input);
if (offset == -1)
   return false;
 
@@ -179,7 +196,8 @@ try_mask_partial_io(nir_shader *shader, nir_deref_var 
*deref, bool is_output_rea
}
 
/* double element width for double types that takes two slots */
-   if (glsl_type_is_dual_slot(glsl_without_array(type))) {
+   if (!is_vertex_input &&
+   glsl_type_is_dual_slot(glsl_without_array(type))) {
   elem_width *= 2;
}
 
@@ -235,7 +253,6 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
nir_shader *shader)
 for (uint i = 0; i < glsl_count_attribute_slots(var->type, false); 
i++) {
int idx = var->data.location + i;
shader->info.vs.double_inputs |= BITFIELD64_BIT(idx);
-   shader->info.vs.double_inputs_read |= BITFIELD64_BIT(idx);
 }
  }
   }
-- 
2.14.3
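
The effect of passing is_vertex_input through to the slot count can be sketched as follows. This is a simplified model, assuming only vector types; struct vec_type and both helpers are hypothetical and merely mirror the behaviour described in the commit message, not the real glsl_type implementation.

```c
#include <stdbool.h>

/* Hypothetical, simplified type: a vector of 32- or 64-bit components. */
struct vec_type {
   unsigned components;   /* 1..4 */
   bool is_64bit;
};

/* A type needs two vec4 slots when it is a 64-bit vector wider than 2. */
static bool type_is_dual_slot(struct vec_type t)
{
   return t.is_64bit && t.components > 2;
}

/* Drivers that treat a double vertex input as a single location pass
 * is_vertex_input = true and always get one slot per vector. */
static unsigned count_attribute_slots(struct vec_type t, bool is_vertex_input)
{
   if (is_vertex_input)
      return 1;
   return type_is_dual_slot(t) ? 2 : 1;
}
```

Under these assumptions a dvec4 occupies two slots when counted with is_vertex_input = false and one slot when counted with is_vertex_input = true, while a dvec2 or vec4 occupies one slot either way.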



[Mesa-dev] [PATCH 1/7] compiler: tidy up double_inputs_read uses

2018-01-07 Thread Timothy Arceri
First we move double_inputs_read into a vs struct in the union.
double_inputs_read is only used for vs inputs, so this will
save space and also allows us to add a new double_inputs field.

We add the new field because c2acf97fcc9b changed the behaviour
of double_inputs_read, and while it's no longer used to track
actual reads in i965 we do still want to track this for gallium
drivers.
---
 src/compiler/glsl/glsl_to_nir.cpp   |  9 +
 src/compiler/glsl/ir_set_program_inouts.cpp |  2 +-
 src/compiler/nir/nir_gather_info.c  |  8 ++--
 src/compiler/shader_info.h  | 10 --
 src/intel/compiler/brw_vec4.cpp |  2 +-
 src/mesa/state_tracker/st_glsl_to_nir.cpp   |  2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp  |  2 +-
 src/mesa/state_tracker/st_program.c |  2 +-
 8 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index 0493410aeb..505c99bbe3 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -133,13 +133,13 @@ static void
 nir_remap_attributes(nir_shader *shader)
 {
    nir_foreach_variable(var, &shader->inputs) {
-      var->data.location += _mesa_bitcount_64(shader->info.double_inputs_read &
+      var->data.location += _mesa_bitcount_64(shader->info.vs.double_inputs &
                                               BITFIELD64_MASK(var->data.location));
}
 
/* Once the remap is done, reset double_inputs_read, so later it will have
 * which location/slots are doubles */
-   shader->info.double_inputs_read = 0;
+   shader->info.vs.double_inputs = 0;
 }
 
 nir_shader *
@@ -363,10 +363,11 @@ nir_visitor::visit(ir_variable *ir)
   }
 
   /* Mark all the locations that require two slots */
-  if (glsl_type_is_dual_slot(glsl_without_array(var->type))) {
+  if (shader->info.stage == MESA_SHADER_VERTEX &&
+  glsl_type_is_dual_slot(glsl_without_array(var->type))) {
  for (uint i = 0; i < glsl_count_attribute_slots(var->type, true); 
i++) {
 uint64_t bitfield = BITFIELD64_BIT(var->data.location + i);
-shader->info.double_inputs_read |= bitfield;
+shader->info.vs.double_inputs |= bitfield;
  }
   }
   break;
diff --git a/src/compiler/glsl/ir_set_program_inouts.cpp 
b/src/compiler/glsl/ir_set_program_inouts.cpp
index 90b06b9f41..1b6c8d750b 100644
--- a/src/compiler/glsl/ir_set_program_inouts.cpp
+++ b/src/compiler/glsl/ir_set_program_inouts.cpp
@@ -118,7 +118,7 @@ mark(struct gl_program *prog, ir_variable *var, int offset, 
int len,
  /* double inputs read is only for vertex inputs */
  if (stage == MESA_SHADER_VERTEX &&
  var->type->without_array()->is_dual_slot())
-prog->info.double_inputs_read |= bitfield;
+prog->info.vs.double_inputs_read |= bitfield;
 
  if (stage == MESA_SHADER_FRAGMENT) {
 prog->info.fs.uses_sample_qualifier |= var->data.sample;
diff --git a/src/compiler/nir/nir_gather_info.c 
b/src/compiler/nir/nir_gather_info.c
index 946939657e..e98129b22c 100644
--- a/src/compiler/nir/nir_gather_info.c
+++ b/src/compiler/nir/nir_gather_info.c
@@ -234,7 +234,8 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
nir_shader *shader)
  glsl_type_is_dual_slot(glsl_without_array(var->type))) {
 for (uint i = 0; i < glsl_count_attribute_slots(var->type, false); 
i++) {
int idx = var->data.location + i;
-   shader->info.double_inputs_read |= BITFIELD64_BIT(idx);
+   shader->info.vs.double_inputs |= BITFIELD64_BIT(idx);
+   shader->info.vs.double_inputs_read |= BITFIELD64_BIT(idx);
 }
  }
   }
@@ -356,10 +357,13 @@ nir_shader_gather_info(nir_shader *shader, 
nir_function_impl *entrypoint)
shader->info.outputs_written = 0;
shader->info.outputs_read = 0;
shader->info.patch_outputs_read = 0;
-   shader->info.double_inputs_read = 0;
shader->info.patch_inputs_read = 0;
shader->info.patch_outputs_written = 0;
shader->info.system_values_read = 0;
+   if (shader->info.stage == MESA_SHADER_VERTEX) {
+  shader->info.vs.double_inputs = 0;
+  shader->info.vs.double_inputs_read = 0;
+   }
if (shader->info.stage == MESA_SHADER_FRAGMENT) {
   shader->info.fs.uses_sample_qualifier = false;
}
diff --git a/src/compiler/shader_info.h b/src/compiler/shader_info.h
index 4492cad0e8..f6dedb8d62 100644
--- a/src/compiler/shader_info.h
+++ b/src/compiler/shader_info.h
@@ -67,8 +67,6 @@ typedef struct shader_info {
 
/* Which inputs are actually read */
uint64_t inputs_read;
-   /* Which inputs are actually read and are double */
-   uint64_t double_inputs_read;
/* Which outputs are actually written */
uint64_t outputs_written;
/* Which outputs are actually read */
@@ -109,6 +107,14 @@ typedef struct shader_info {
bool 

[Mesa-dev] [PATCH 4/7] radeonsi/nir: fix num_inputs for doubles in vs

2018-01-07 Thread Timothy Arceri
---
 src/gallium/drivers/radeonsi/si_shader_nir.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c 
b/src/gallium/drivers/radeonsi/si_shader_nir.c
index d5b8f835b9..2d0e3e207e 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -206,8 +206,13 @@ void si_nir_scan_shader(const struct nir_shader *nir,
 * tracker has already mapped them to attributes via
 * variable->data.driver_location.
 */
-   if (nir->info.stage == MESA_SHADER_VERTEX)
+   if (nir->info.stage == MESA_SHADER_VERTEX) {
+   if (glsl_type_is_dual_slot(variable->type))
+   num_inputs += 2;
+   else
+   num_inputs++;
continue;
+   }
 
assert(nir->info.stage != MESA_SHADER_FRAGMENT ||
   (attrib_count == 1 && "not implemented"));
@@ -297,10 +302,8 @@ void si_nir_scan_shader(const struct nir_shader *nir,
info->colors_read |= 0xf0;
}
 
-   if (nir->info.stage != MESA_SHADER_VERTEX)
-   info->num_inputs = num_inputs;
-   else
-   info->num_inputs = nir->num_inputs;
+   info->num_inputs = num_inputs;
+
 
i = 0;
uint64_t processed_outputs = 0;
-- 
2.14.3



Re: [Mesa-dev] [PATCH v3 00/30] Nir support for Nouveau

2018-01-07 Thread Connor Abbott
On Sun, Jan 7, 2018 at 6:50 PM, Ilia Mirkin  wrote:
> On Sun, Jan 7, 2018 at 6:39 PM, Connor Abbott  wrote:
>> On Sun, Jan 7, 2018 at 3:42 PM, Karol Herbst  wrote:
>>> significant changes to last series:
>>> * disable support for 64 bit types
>>> * fix tessellation shader bugs
>>> * assume vec4 elements for variable index arrays (MemoryOpts workaround)
>>>
>>> piglit run -x glx -x egl -x streaming-texture-leak -x max-texture-size 
>>> tests/gpu.py:
>>> [26010/26010] skip: 10410, pass: 15386, warn: 9, fail: 191, crash: 14
>>>
>>> remaining issues:
>>> * transform feedback with geometry shaders
>>> * indirects in image_load/store
>>> * interpolateAt
>>> * getting 64 bit types to work. This is mainly limited by codegen RA being
>>>   not able to handle those correctly, because from_TGSI just generates merge
>>>   and splits and doesn't hit the faulty paths.
>>
>> Just curious... what's the issue with register allocation?
>
> There are probably a few, but at least one is that it wants both sides
> of a phi to be compounds (or neither). So if you have e.g.
>
> x int64;
> if (foo) {
>   x = a + b;
> } else {
>   x = (c, d) [i.e. high word is one, low word is the other]
> }
> use(x)
>
> Then you may end up with a 64-bit phi, with one of the arguments that
> is a compound (the merge), and one of which isn't. The RA doesn't
> handle that situation.
>
> I believe currently we have enough splits/merges that everything
> that's 64-bit ends up being marked as a compound.
>
> I wonder if we can "just do that" for 64-bit values - i.e. auto-mark
> them as compound. It all requires carefully figuring out how RA works
> with all of this.

How are 64 bit values handled on nvidia? Can you join any 2 32-bit
registers to get a 64-bit one? Or do they have to be contiguous to use
the 64-bit ops? Or are there separate 64-bit registers entirely? Or
something else entirely?

It sounds like the current approach is doing something wrong. The
example you mentioned shouldn't be too hard to handle. Typically, you
have something like packUint2x32() and unpackUint2x32() in GLSL as
built-in pseudoinstructions that get expanded after RA, and then the
output of packUint2x32 is just a regular ol' 64-bit value, and the
register allocator shouldn't care one bit about where that value came
from - it should handle both sides of your if statement pretty much
identically. Basically, it's a funky kind of move. If you want to
coalesce the packUint2x32 before RA in your example, then things get a
little trickier since 32-bit registers might interfere with a
subregister of x. But that can come later.
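
To illustrate only the semantics of the pack/unpack "funky move" (a sketch in plain C, not nouveau codegen; the helper names are made up here, and the low-word-first ordering follows the GLSL built-ins):

```c
#include <stdint.h>

/* Combine two 32-bit words into one 64-bit value, low word first. */
static uint64_t pack_uint_2x32(uint32_t lo, uint32_t hi)
{
   return ((uint64_t)hi << 32) | lo;
}

/* Split a 64-bit value back into its low and high 32-bit words. */
static void unpack_uint_2x32(uint64_t v, uint32_t *lo, uint32_t *hi)
{
   *lo = (uint32_t)v;
   *hi = (uint32_t)(v >> 32);
}
```

The result of pack_uint_2x32() is an ordinary 64-bit value, so a consumer does not need to know whether it came from a 64-bit add or from two 32-bit halves.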

>
>   -ilia


Re: [Mesa-dev] [PATCH 07/22] dri/common: Add option to allow exposure of 10 bpc color configs. (v2)

2018-01-07 Thread Steven Newbury
On Fri, 2017-12-15 at 23:04 +0100, Mario Kleiner wrote:
> Some clients may not like RGB10X2 and RGB10A2 fbconfigs and
> visuals. Add a new driconf option 'allow_rgb10_configs' to
> allow per application enable/disable.
> 
> The option defaults to enabled.
> 
> v2: Rename expose_rgb10_configs to allow_rgb10_configs,
> as suggested by Emil. Add comment to option parsing,
> to make sure it stays before the ->InitScreen().
> 
> Signed-off-by: Mario Kleiner 
> Reviewed-by: Tapani Pälli 
> Reviewed-by: Marek Olšák 
> ---
>  src/mesa/drivers/dri/common/dri_util.c | 12 
>  src/util/xmlpool/t_options.h   |  5 +
>  2 files changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/common/dri_util.c
> b/src/mesa/drivers/dri/common/dri_util.c
> index d504751..d4fba0b 100644
> --- a/src/mesa/drivers/dri/common/dri_util.c
> +++ b/src/mesa/drivers/dri/common/dri_util.c
> @@ -55,6 +55,10 @@ const char __dri2ConfigOptions[] =
>DRI_CONF_SECTION_PERFORMANCE
>   DRI_CONF_VBLANK_MODE(DRI_CONF_VBLANK_DEF_INTERVAL_1)
>DRI_CONF_SECTION_END
> +
> +  DRI_CONF_SECTION_MISCELLANEOUS
> + DRI_CONF_ALLOW_RGB10_CONFIGS("true")
> +  DRI_CONF_SECTION_END
> DRI_CONF_END;
>  
This isn't exposing the driconf option for me with IVB HD4000.  Adding
the option to the same section in 
src/mesa/drivers/dri/i965/intel_screen.c did work though.

Mind you, having this default to true (which it does whether or not the
option is available) is really bad here.  My LVDS display presumably
doesn't support 10bpc even if the chipset does, which means that by
default my display colours are completely corrupted.

What's more, the driconf option doesn't address Wayland compositors.
For example, GDM with Wayland doesn't respect the option but uses, I
guess, a default config which happens to be 10bpc.




Re: [Mesa-dev] [PATCH v3 00/30] Nir support for Nouveau

2018-01-07 Thread Ilia Mirkin
On Sun, Jan 7, 2018 at 6:39 PM, Connor Abbott  wrote:
> On Sun, Jan 7, 2018 at 3:42 PM, Karol Herbst  wrote:
>> significant changes to last series:
>> * disable support for 64 bit types
>> * fix tessellation shader bugs
>> * assume vec4 elements for variable index arrays (MemoryOpts workaround)
>>
>> piglit run -x glx -x egl -x streaming-texture-leak -x max-texture-size 
>> tests/gpu.py:
>> [26010/26010] skip: 10410, pass: 15386, warn: 9, fail: 191, crash: 14
>>
>> remaining issues:
>> * transform feedback with geometry shaders
>> * indirects in image_load/store
>> * interpolateAt
>> * getting 64 bit types to work. This is mainly limited by codegen RA being
>>   not able to handle those correctly, because from_TGSI just generates merge
>>   and splits and doesn't hit the faulty paths.
>
> Just curious... what's the issue with register allocation?

There are probably a few, but at least one is that it wants both sides
of a phi to be compounds (or neither). So if you have e.g.

x int64;
if (foo) {
  x = a + b;
} else {
  x = (c, d) [i.e. high word is one, low word is the other]
}
use(x)

Then you may end up with a 64-bit phi, with one of the arguments that
is a compound (the merge), and one of which isn't. The RA doesn't
handle that situation.

I believe currently we have enough splits/merges that everything
that's 64-bit ends up being marked as a compound.

I wonder if we can "just do that" for 64-bit values - i.e. auto-mark
them as compound. It all requires carefully figuring out how RA works
with all of this.

  -ilia


Re: [Mesa-dev] [PATCH v3 00/30] Nir support for Nouveau

2018-01-07 Thread Connor Abbott
On Sun, Jan 7, 2018 at 3:42 PM, Karol Herbst  wrote:
> significant changes to last series:
> * disable support for 64 bit types
> * fix tessellation shader bugs
> * assume vec4 elements for variable index arrays (MemoryOpts workaround)
>
> piglit run -x glx -x egl -x streaming-texture-leak -x max-texture-size 
> tests/gpu.py:
> [26010/26010] skip: 10410, pass: 15386, warn: 9, fail: 191, crash: 14
>
> remaining issues:
> * transform feedback with geometry shaders
> * indirects in image_load/store
> * interpolateAt
> * getting 64 bit types to work. This is mainly limited by codegen RA being
>   not able to handle those correctly, because from_TGSI just generates merge
>   and splits and doesn't hit the faulty paths.

Just curious... what's the issue with register allocation?

>
> Karol Herbst (30):
>   nir: fix st_nir_assign_var_locations for patch variables
>   nvir: print the shader type when dumping headers
>   nvir: move common converter code in base class
>   nvc0: add support for NIR
>   nvc0/debug: add env var to make nir default
>   nvir/nir: run some passes to make the conversion easier
>   nvir/nir: track defs and provide easy access functions
>   nvir/nir: add nir type helper functions
>   nvir/nir: run assignSlots
>   nvir/nir: parse NIR shader info
>   nvir/nir: implement CFG handling
>   nvir/nir: implement nir_load_const_instr
>   nvir/nir: add skeleton for nir_intrinsic_instr
>   nvir/nir: implement nir_alu_instr handling
>   nvir/nir: implement nir_intrinsic_load_uniform
>   nvir/nir: implement nir_intrinsic_store_(per_vertex_)output
>   nvir/nir: implement nir_intrinsic_load_input
>   nvir/nir: implement intrinsic_discard(_if)
>   nvir/nir: implement loading system values
>   nvir/nir: implement nir_ssa_undef_instr
>   nvir/nir: implement nir_instr_type_tex
>   nvir/nir: add getOperation for intrinsics
>   nvir/nir: implement vote and ballot
>   nvir/nir: implement variable indexing
>   nvir/nir: implement geometry shader nir_intrinsics
>   nvir/nir: implement nir_intrinsic_load_ubo
>   nvir/nir: implement ssbo intrinsics
>   nvir/nir: implement images
>   nvir/nir: add memory barriers
>   nvir/nir: implement load_per_vertex_output
>
>  src/gallium/drivers/nouveau/Makefile.sources   |3 +
>  src/gallium/drivers/nouveau/codegen/nv50_ir.cpp|3 +
>  src/gallium/drivers/nouveau/codegen/nv50_ir.h  |1 +
>  .../nouveau/codegen/nv50_ir_from_common.cpp|  107 +
>  .../drivers/nouveau/codegen/nv50_ir_from_common.h  |   58 +
>  .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 2716 
> 
>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  106 +-
>  src/gallium/drivers/nouveau/meson.build|   12 +-
>  src/gallium/drivers/nouveau/nouveau_screen.c   |4 +
>  src/gallium/drivers/nouveau/nouveau_screen.h   |2 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_program.c|   19 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |   57 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c  |   27 +-
>  src/mesa/state_tracker/st_glsl_to_nir.cpp  |8 +-
>  14 files changed, 3003 insertions(+), 120 deletions(-)
>  create mode 100644 
> src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.cpp
>  create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.h
>  create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
>
> --
> 2.14.3
>


[Mesa-dev] [Bug 103538] vkDestroySwapchain causes deadlock on Wayland compositor with X11

2018-01-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103538

Jan Vlug  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jan.pub...@famvlug.nl

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH v3 00/30] Nir support for Nouveau

2018-01-07 Thread Karol Herbst
On Sun, Jan 7, 2018 at 9:51 PM, Ilia Mirkin  wrote:
> On Sun, Jan 7, 2018 at 3:42 PM, Karol Herbst  wrote:
>> significant changes to last series:
>> * disable support for 64 bit types
>> * fix tessellation shader bugs
>> * assume vec4 elements for variable index arrays (MemoryOpts workaround)
>>
>> piglit run -x glx -x egl -x streaming-texture-leak -x max-texture-size 
>> tests/gpu.py:
>> [26010/26010] skip: 10410, pass: 15386, warn: 9, fail: 191, crash: 14
>
> Great work!
>
>>
>> remaining issues:
>> * transform feedback with geometry shaders
>
> This could be some very important issue. I'd highly recommend looking
> into it in some depth.
>

I think in the end it is a gallium issue related to NIR, TF and
geometry shaders.

Most of the transform feedback related tests are passing (around
95%), so I really don't worry too much about those. I still want to look
at it, but I don't see it as critical right now.

>> * indirects in image_load/store
>> * interpolateAt
>
> I'd ignore those. It's unlikely to be anything too structural.
>
>> * getting 64 bit types to work. This is mainly limited by codegen RA being
>>   not able to handle those correctly, because from_TGSI just generates merge
>>   and splits and doesn't hit the faulty paths.
>
> Just generate the merges and splits like I told you initially. Store
> 64-bit vars into a merge of 2 original Value's.
>

yeah, just need to come up with a nice way of doing that :)

> I haven't looked at the patches, but have you fixed up the pipe caps
> to disable the various features/etc that you don't implement when
> using nir?
>
>   -ilia

Well, I disabled PIPE_CAP_DOUBLES, PIPE_CAP_INT64 and
PIPE_CAP_TEXTURE_GATHER_OFFSETS when using NIR.


Re: [Mesa-dev] [PATCH v3 00/30] Nir support for Nouveau

2018-01-07 Thread Ilia Mirkin
On Sun, Jan 7, 2018 at 3:42 PM, Karol Herbst  wrote:
> significant changes to last series:
> * disable support for 64 bit types
> * fix tessellation shader bugs
> * assume vec4 elements for variable index arrays (MemoryOpts workaround)
>
> piglit run -x glx -x egl -x streaming-texture-leak -x max-texture-size 
> tests/gpu.py:
> [26010/26010] skip: 10410, pass: 15386, warn: 9, fail: 191, crash: 14

Great work!

>
> remaining issues:
> * transform feedback with geometry shaders

This could be some very important issue. I'd highly recommend looking
into it in some depth.

> * indirects in image_load/store
> * interpolateAt

I'd ignore those. It's unlikely to be anything too structural.

> * getting 64 bit types to work. This is mainly limited by codegen RA being
>   not able to handle those correctly, because from_TGSI just generates merge
>   and splits and doesn't hit the faulty paths.

Just generate the merges and splits like I told you initially. Store
64-bit vars into a merge of 2 original Value's.

I haven't looked at the patches, but have you fixed up the pipe caps
to disable the various features/etc that you don't implement when
using nir?

  -ilia


[Mesa-dev] [PATCH v3 30/30] nvir/nir: implement load_per_vertex_output

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 08490a39e9..c40fdae1a1 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1768,6 +1768,36 @@ Converter::visit(nir_intrinsic_instr *insn)
   }
   break;
}
+   case nir_intrinsic_load_per_vertex_output: {
+  const DataType dType = getDType(insn);
+  LValues &newDefs = convert(&insn->dest);
+  Value *indirectVertex;
+  Value *indirectOffset;
+  auto baseVertex = getIndirect(&insn->src[0], 0, &indirectVertex);
+  auto baseOffset = getIndirect(&insn->src[1], 0, &indirectOffset);
+
+  auto idx = nir_intrinsic_base(insn) + baseOffset;
+  Value *vtxBase = nullptr;
+
+  if (indirectVertex)
+ vtxBase = indirectVertex;
+  else
+ vtxBase = loadImm(nullptr, baseVertex);
+
+  if (indirectOffset)
+ indirectOffset = mkOp2v(OP_SHL, TYPE_U32, getSSA(4, FILE_ADDRESS), 
indirectOffset, mkImm(4));
+
+  vtxBase = mkOp2v(OP_ADD, TYPE_U32, getSSA(4, FILE_ADDRESS), outBase, 
vtxBase);
+
+  for (auto i = 0u; i < insn->num_components; ++i) {
+ uint32_t address = info->out[idx].slot[nir_intrinsic_component(insn) 
+ i] * 4;
+ Symbol *sym = mkSymbol(FILE_SHADER_OUTPUT, 0, dType, address);
+ Instruction *ld = mkLoad(dType, newDefs[i], sym, indirectOffset);
+ ld->setIndirect(0, 1, vtxBase);
+ ld->perPatch = info->in[idx].patch;
+  }
+  break;
+   }
case nir_intrinsic_emit_vertex:
case nir_intrinsic_end_primitive: {
   auto idx = nir_intrinsic_stream_id(insn);
-- 
2.14.3



[Mesa-dev] [PATCH v3 29/30] nvir/nir: add memory barriers

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 0c19cb953d..08490a39e9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -448,6 +448,10 @@ Converter::getSubOp(nir_intrinsic_op op)
CASE_OP_INTR_ATOM(and, AND);
CASE_OP_INTR_ATOM(comp_swap, CAS);
CASE_OP_INTR_ATOM(exchange, EXCH);
+   case nir_intrinsic_memory_barrier:
+  return NV50_IR_SUBOP_MEMBAR(M, GL);
+   case nir_intrinsic_memory_barrier_shared:
+  return NV50_IR_SUBOP_MEMBAR(M, CTA);
CASE_OP_INTR_ATOM(or, OR);
case nir_intrinsic_image_atomic_max:
CASE_OP_INTR_ATOM_S(imax, MAX);
@@ -2035,6 +2039,13 @@ Converter::visit(nir_intrinsic_instr *insn)
   bar->subOp = NV50_IR_SUBOP_BAR_SYNC;
   break;
}
+   case nir_intrinsic_memory_barrier:
+   case nir_intrinsic_memory_barrier_shared: {
+  Instruction *bar = mkOp(OP_MEMBAR, TYPE_NONE, NULL);
+  bar->fixed = 1;
+  bar->subOp = getSubOp(op);
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.14.3
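
The CASE_OP_INTR_ATOM lines in the getSubOp() context above rely on preprocessor token pasting to emit several case labels per atomic. A reduced sketch of that pattern, with hypothetical miniature enums standing in for nir_intrinsic_op and the NV50_IR_SUBOP_* values:

```c
/* Hypothetical miniature of the intrinsic enum. */
enum intrinsic {
   intr_image_atomic_add,
   intr_shared_atomic_add,
   intr_ssbo_atomic_add,
   intr_image_atomic_xor,
   intr_shared_atomic_xor,
   intr_ssbo_atomic_xor,
   intr_memory_barrier,
};

/* Hypothetical miniature of the sub-op values. */
enum sub_op {
   SUBOP_NONE = 0,
   SUBOP_ATOM_ADD,
   SUBOP_ATOM_XOR,
   SUBOP_MEMBAR_GL,
};

/* One macro use expands to three case labels sharing a single return. */
#define CASE_ATOM(nir, nvir) \
   case intr_image_atomic_ ## nir: \
   case intr_shared_atomic_ ## nir: \
   case intr_ssbo_atomic_ ## nir: \
      return SUBOP_ATOM_ ## nvir

static enum sub_op get_sub_op(enum intrinsic op)
{
   switch (op) {
   CASE_ATOM(add, ADD);
   CASE_ATOM(xor, XOR);
   case intr_memory_barrier:
      return SUBOP_MEMBAR_GL;
   default:
      return SUBOP_NONE;
   }
}
#undef CASE_ATOM
```

Ordinary case labels, such as the memory barrier one here, can sit between macro uses because each expansion ends in a return.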



[Mesa-dev] [PATCH v3 28/30] nvir/nir: implement images

2018-01-07 Thread Karol Herbst
v3: fix compiler warnings

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 287 +++--
 1 file changed, 269 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 6d5059f07e..0c19cb953d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -86,6 +86,8 @@ public:
LValues& convert(nir_register *);
LValues& convert(nir_ssa_def *);
 
+   ImgFormat convertGLImgFormat(GLuint);
+
// nir_alu_src needs special handling due to neg and abs modifiers
Value* getSrc(nir_alu_src *, uint8_t component = 0);
Value* getSrc(nir_register *, uint8_t);
@@ -135,6 +137,7 @@ public:
 
// tex stuff
Value* applyProjection(Value *src, Value *proj);
+   unsigned int getNIRArgCount(TexInstruction::Target&);
 private:
nir_shader *nir;
 
@@ -428,28 +431,31 @@ Converter::getSubOp(nir_op op)
}
 }
 
+#define CASE_OP_INTR_ATOM(nir, nvir) \
+   case nir_intrinsic_image_atomic_ ## nir : \
+   case nir_intrinsic_shared_atomic_ ## nir : \
+   case nir_intrinsic_ssbo_atomic_ ## nir : \
+  return NV50_IR_SUBOP_ATOM_ ## nvir
+#define CASE_OP_INTR_ATOM_S(nir, nvir) \
+   case nir_intrinsic_shared_atomic_ ## nir : \
+   case nir_intrinsic_ssbo_atomic_ ## nir : \
+  return NV50_IR_SUBOP_ATOM_ ## nvir
 int
 Converter::getSubOp(nir_intrinsic_op op)
 {
switch (op) {
-   case nir_intrinsic_ssbo_atomic_add:
-  return NV50_IR_SUBOP_ATOM_ADD;
-   case nir_intrinsic_ssbo_atomic_and:
-  return NV50_IR_SUBOP_ATOM_AND;
-   case nir_intrinsic_ssbo_atomic_comp_swap:
-  return NV50_IR_SUBOP_ATOM_CAS;
-   case nir_intrinsic_ssbo_atomic_exchange:
-  return NV50_IR_SUBOP_ATOM_EXCH;
-   case nir_intrinsic_ssbo_atomic_or:
-  return NV50_IR_SUBOP_ATOM_OR;
-   case nir_intrinsic_ssbo_atomic_imax:
-   case nir_intrinsic_ssbo_atomic_umax:
-  return NV50_IR_SUBOP_ATOM_MAX;
-   case nir_intrinsic_ssbo_atomic_imin:
-   case nir_intrinsic_ssbo_atomic_umin:
-  return NV50_IR_SUBOP_ATOM_MIN;
-   case nir_intrinsic_ssbo_atomic_xor:
-  return NV50_IR_SUBOP_ATOM_XOR;
+   CASE_OP_INTR_ATOM(add, ADD);
+   CASE_OP_INTR_ATOM(and, AND);
+   CASE_OP_INTR_ATOM(comp_swap, CAS);
+   CASE_OP_INTR_ATOM(exchange, EXCH);
+   CASE_OP_INTR_ATOM(or, OR);
+   case nir_intrinsic_image_atomic_max:
+   CASE_OP_INTR_ATOM_S(imax, MAX);
+   CASE_OP_INTR_ATOM_S(umax, MAX);
+   case nir_intrinsic_image_atomic_min:
+   CASE_OP_INTR_ATOM_S(imin, MIN);
+   CASE_OP_INTR_ATOM_S(umin, MIN);
+   CASE_OP_INTR_ATOM(xor, XOR);
case nir_intrinsic_vote_all:
   return NV50_IR_SUBOP_VOTE_ALL;
case nir_intrinsic_vote_any:
@@ -462,6 +468,8 @@ Converter::getSubOp(nir_intrinsic_op op)
   return 0;
}
 }
+#undef CASE_OP_INTR_ATOM
+#undef CASE_OP_INTR_ATOM_S
 
 CondCode
 Converter::getCondCode(nir_op op)
@@ -1487,6 +1495,68 @@ Converter::convert(nir_intrinsic_op intr)
}
 }
 
+ImgFormat
+Converter::convertGLImgFormat(GLuint format)
+{
+#define FMT_CASE(a, b) \
+  case GL_ ## a: return nv50_ir::FMT_ ## b
+
+   switch (format) {
+   FMT_CASE(NONE, NONE);
+
+   FMT_CASE(RGBA32F, RGBA32F);
+   FMT_CASE(RGBA16F, RGBA16F);
+   FMT_CASE(RG32F, RG32F);
+   FMT_CASE(RG16F, RG16F);
+   FMT_CASE(R11F_G11F_B10F, R11G11B10F);
+   FMT_CASE(R32F, R32F);
+   FMT_CASE(R16F, R16F);
+
+   FMT_CASE(RGBA32UI, RGBA32UI);
+   FMT_CASE(RGBA16UI, RGBA16UI);
+   FMT_CASE(RGB10_A2UI, RGB10A2UI);
+   FMT_CASE(RGBA8UI, RGBA8UI);
+   FMT_CASE(RG32UI, RG32UI);
+   FMT_CASE(RG16UI, RG16UI);
+   FMT_CASE(RG8UI, RG8UI);
+   FMT_CASE(R32UI, R32UI);
+   FMT_CASE(R16UI, R16UI);
+   FMT_CASE(R8UI, R8UI);
+
+   FMT_CASE(RGBA32I, RGBA32I);
+   FMT_CASE(RGBA16I, RGBA16I);
+   FMT_CASE(RGBA8I, RGBA8I);
+   FMT_CASE(RG32I, RG32I);
+   FMT_CASE(RG16I, RG16I);
+   FMT_CASE(RG8I, RG8I);
+   FMT_CASE(R32I, R32I);
+   FMT_CASE(R16I, R16I);
+   FMT_CASE(R8I, R8I);
+
+   FMT_CASE(RGBA16, RGBA16);
+   FMT_CASE(RGB10_A2, RGB10A2);
+   FMT_CASE(RGBA8, RGBA8);
+   FMT_CASE(RG16, RG16);
+   FMT_CASE(RG8, RG8);
+   FMT_CASE(R16, R16);
+   FMT_CASE(R8, R8);
+
+   FMT_CASE(RGBA16_SNORM, RGBA16_SNORM);
+   FMT_CASE(RGBA8_SNORM, RGBA8_SNORM);
+   FMT_CASE(RG16_SNORM, RG16_SNORM);
+   FMT_CASE(RG8_SNORM, RG8_SNORM);
+   FMT_CASE(R16_SNORM, R16_SNORM);
+   FMT_CASE(R8_SNORM, R8_SNORM);
+
+   FMT_CASE(BGRA_INTEGER, BGRA8);
+   default:
+  ERROR("unknown format %x\n", format);
+  assert(false);
+  return nv50_ir::FMT_NONE;
+   }
+#undef FMT_CASE
+}
+
 bool
 Converter::visit(nir_intrinsic_instr *insn)
 {
@@ -1766,6 +1836,28 @@ Converter::visit(nir_intrinsic_instr *insn)
   info->io.globalAccess |= 0x1;
   break;
}
+   case nir_intrinsic_shared_atomic_add:
+   case nir_intrinsic_shared_atomic_and:
+   case nir_intrinsic_shared_atomic_comp_swap:
+   case nir_intrinsic_shared_atomic_exchange:
+   case 

[Mesa-dev] [PATCH v3 27/30] nvir/nir: implement ssbo intrinsics

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 97 ++
 1 file changed, 97 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index fe06b4ee8c..6d5059f07e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -432,6 +432,24 @@ int
 Converter::getSubOp(nir_intrinsic_op op)
 {
switch (op) {
+   case nir_intrinsic_ssbo_atomic_add:
+  return NV50_IR_SUBOP_ATOM_ADD;
+   case nir_intrinsic_ssbo_atomic_and:
+  return NV50_IR_SUBOP_ATOM_AND;
+   case nir_intrinsic_ssbo_atomic_comp_swap:
+  return NV50_IR_SUBOP_ATOM_CAS;
+   case nir_intrinsic_ssbo_atomic_exchange:
+  return NV50_IR_SUBOP_ATOM_EXCH;
+   case nir_intrinsic_ssbo_atomic_or:
+  return NV50_IR_SUBOP_ATOM_OR;
+   case nir_intrinsic_ssbo_atomic_imax:
+   case nir_intrinsic_ssbo_atomic_umax:
+  return NV50_IR_SUBOP_ATOM_MAX;
+   case nir_intrinsic_ssbo_atomic_imin:
+   case nir_intrinsic_ssbo_atomic_umin:
+  return NV50_IR_SUBOP_ATOM_MIN;
+   case nir_intrinsic_ssbo_atomic_xor:
+  return NV50_IR_SUBOP_ATOM_XOR;
case nir_intrinsic_vote_all:
   return NV50_IR_SUBOP_VOTE_ALL;
case nir_intrinsic_vote_any:
@@ -1696,6 +1714,85 @@ Converter::visit(nir_intrinsic_instr *insn)
   }
   break;
}
+   case nir_intrinsic_get_buffer_size: {
+  LValues &newDefs = convert(&insn->dest);
+  const DataType dType = getDType(insn);
+  Value *indirectBuffer;
+  uint32_t buffer = getIndirect(&insn->src[0], 0, &indirectBuffer);
+
+  Symbol *sym = mkSymbol(FILE_MEMORY_BUFFER, buffer, dType, 0);
+  mkOp1(OP_BUFQ, dType, newDefs[0], sym)->setIndirect(0, 0, indirectBuffer);
+  info->io.globalAccess |= 0x2;
+  break;
+   }
+   case nir_intrinsic_store_ssbo: {
+  DataType sType = getSType(insn->src[0], false, false);
+  Value *indirectBuffer;
+  Value *indirectOffset;
+  uint32_t buffer = getIndirect(&insn->src[1], 0, &indirectBuffer);
+  uint32_t offset = getIndirect(&insn->src[2], 0, &indirectOffset);
+
+  for (auto i = 0u; i < insn->num_components; ++i) {
+ if (!((1u << i) & nir_intrinsic_write_mask(insn)))
+continue;
+ Symbol *sym = mkSymbol(FILE_MEMORY_BUFFER, buffer, sType, offset + i * typeSizeof(sType));
+ mkStore(OP_STORE, sType, sym, indirectOffset, getSrc(&insn->src[0], i))->setIndirect(0, 1, indirectBuffer);
+  }
+  info->io.globalAccess |= 0x2;
+  break;
+   }
+   case nir_intrinsic_load_ssbo: {
+  const DataType dType = getDType(insn);
+  LValues &newDefs = convert(&insn->dest);
+  Value *indirectBuffer;
+  Value *indirectOffset;
+  uint32_t buffer = getIndirect(&insn->src[0], 0, &indirectBuffer);
+  uint32_t offset = getIndirect(&insn->src[1], 0, &indirectOffset);
+
+  for (auto i = 0u; i < insn->num_components; ++i) {
+ if (typeSizeof(dType) > 4) {
+Value *temp0 = getSSA();
+Value *temp1 = getSSA();
+Symbol *sym0 = mkSymbol(FILE_MEMORY_BUFFER, buffer, TYPE_U32, offset + i * 8);
+Symbol *sym1 = mkSymbol(FILE_MEMORY_BUFFER, buffer, TYPE_U32, offset + i * 8 + 4);
+mkLoad(TYPE_U32, temp0, sym0, indirectOffset)->setIndirect(0, 1, indirectBuffer);
+mkLoad(TYPE_U32, temp1, sym1, indirectOffset)->setIndirect(0, 1, indirectBuffer);
+mkOp2(OP_MERGE, dType, newDefs[i], temp0, temp1);
+ } else {
+Symbol *sym = mkSymbol(FILE_MEMORY_BUFFER, buffer, dType, offset + i * 4);
+mkLoad(dType, newDefs[i], sym, indirectOffset)->setIndirect(0, 1, indirectBuffer);
+ }
+  }
+  info->io.globalAccess |= 0x1;
+  break;
+   }
+   case nir_intrinsic_ssbo_atomic_add:
+   case nir_intrinsic_ssbo_atomic_and:
+   case nir_intrinsic_ssbo_atomic_comp_swap:
+   case nir_intrinsic_ssbo_atomic_exchange:
+   case nir_intrinsic_ssbo_atomic_or:
+   case nir_intrinsic_ssbo_atomic_imax:
+   case nir_intrinsic_ssbo_atomic_imin:
+   case nir_intrinsic_ssbo_atomic_umax:
+   case nir_intrinsic_ssbo_atomic_umin:
+   case nir_intrinsic_ssbo_atomic_xor: {
+  const DataType dType = getDType(insn);
+  LValues &newDefs = convert(&insn->dest);
+  Value *indirectBuffer;
+  Value *indirectOffset;
+  uint32_t buffer = getIndirect(&insn->src[0], 0, &indirectBuffer);
+  uint32_t offset = getIndirect(&insn->src[1], 0, &indirectOffset);
+
+  Symbol *sym = mkSymbol(FILE_MEMORY_BUFFER, buffer, dType, offset);
+  Instruction *atom = mkOp2(OP_ATOM, dType, newDefs[0], sym, getSrc(&insn->src[2], 0));
+  if (op == nir_intrinsic_ssbo_atomic_comp_swap)
+ atom->setSrc(2, getSrc(&insn->src[3], 0));
+  atom->setIndirect(0, 0, indirectOffset);
+  atom->subOp = getSubOp(op);
+
+  info->io.globalAccess |= 0x2;
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.14.3
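
For readers following the load_ssbo hunk above: values wider than 32 bits are fetched as two 32-bit loads at `offset + i * 8` and `offset + i * 8 + 4` and then merged. The following is only a host-side sketch of that addressing. `loadU32` and `load64AsTwo32` are hypothetical names, a plain byte vector stands in for the buffer, and the assumption that the first word ends up in the low half is mine, not something the patch states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: read one 32-bit word from a byte buffer.
static uint32_t loadU32(const std::vector<uint8_t> &buf, size_t offset)
{
   assert(offset + 4 <= buf.size());
   uint32_t v;
   std::memcpy(&v, buf.data() + offset, sizeof(v));
   return v;
}

// Fetch component i of a 64-bit vector as two 32-bit halves and merge them.
// Assumption: the word at the lower address becomes the low half.
static uint64_t load64AsTwo32(const std::vector<uint8_t> &buf, size_t offset,
                              unsigned i)
{
   uint32_t lo = loadU32(buf, offset + i * 8);
   uint32_t hi = loadU32(buf, offset + i * 8 + 4);
   return (uint64_t(hi) << 32) | lo;
}
```

The per-component stride of 8 bytes is what distinguishes this path from the 32-bit case, which steps by 4.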


[Mesa-dev] [PATCH v3 26/30] nvir/nir: implement nir_intrinsic_load_ubo

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 905a18d9d9..fe06b4ee8c 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1682,6 +1682,20 @@ Converter::visit(nir_intrinsic_instr *insn)
   mkOp1(getOperation(op), TYPE_U32, NULL, mkImm(idx))->fixed = 1;
   break;
}
+   case nir_intrinsic_load_ubo: {
+  const DataType dType = getDType(insn);
+  LValues &newDefs = convert(&insn->dest);
+  Value *indirectIndex;
+  Value *indirectOffset;
+  uint32_t index = getIndirect(&insn->src[0], 0, &indirectIndex) + 1;
+  uint32_t offset = getIndirect(&insn->src[1], 0, &indirectOffset);
+
+  for (auto i = 0u; i < insn->num_components; ++i) {
+ Symbol *sym = mkSymbol(FILE_MEMORY_CONST, index, dType, offset + i * 4);
+ mkLoad(dType, newDefs[i], sym, indirectOffset)->setIndirect(0, 1, indirectIndex);
+  }
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 21/30] nvir/nir: implement nir_instr_type_tex

2018-01-07 Thread Karol Herbst
Many of those fields are not valid for many of the tex ops. I am not sure
whether it is worth the effort to check for them or to keep it as it is. It
seems to mostly work.

v2: reworked offset handling
add tex support with indirect R/S arguments
handle GLSL_SAMPLER_DIM_EXTERNAL
drop reference in convert(glsl_sampler_dim&, bool, bool)
fix tg4 component selection

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 219 +
 1 file changed, 219 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 3dbe719a77..3fe91462c7 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -77,6 +77,7 @@ public:
 
Converter(Program *, nir_shader *, nv50_ir_prog_info *);
 
+   TexTarget convert(glsl_sampler_dim, bool isArray, bool isShadow);
LValues& convert(nir_alu_dest *);
BasicBlock* convert(nir_block *);
LValues& convert(nir_dest *);
@@ -109,6 +110,7 @@ public:
bool visit(nir_load_const_instr*);
bool visit(nir_loop *);
bool visit(nir_ssa_undef_instr *);
+   bool visit(nir_tex_instr *);
 
bool run();
 
@@ -123,9 +125,13 @@ public:
DataType getSType(nir_src&, bool isFloat, bool isSigned);
 
operation getOperation(nir_op);
+   operation getOperation(nir_texop);
operation preOperationNeeded(nir_op);
int getSubOp(nir_op);
CondCode getCondCode(nir_op);
+
+   // tex stuff
+   Value* applyProjection(Value *src, Value *proj);
 private:
nir_shader *nir;
 
@@ -351,6 +357,36 @@ Converter::getOperation(nir_op op)
}
 }
 
+operation
+Converter::getOperation(nir_texop op)
+{
+   switch (op) {
+   case nir_texop_tex:
+  return OP_TEX;
+   case nir_texop_lod:
+  return OP_TXLQ;
+   case nir_texop_txb:
+  return OP_TXB;
+   case nir_texop_txd:
+  return OP_TXD;
+   case nir_texop_txf:
+   case nir_texop_txf_ms:
+  return OP_TXF;
+   case nir_texop_tg4:
+  return OP_TXG;
+   case nir_texop_txl:
+  return OP_TXL;
+   case nir_texop_query_levels:
+   case nir_texop_texture_samples:
+   case nir_texop_txs:
+  return OP_TXQ;
+   default:
+  ERROR("couldn't get operation for nir_texop %u\n", op);
+  assert(false);
+  return OP_NOP;
+   }
+}
+
 operation
 Converter::preOperationNeeded(nir_op op)
 {
@@ -1265,6 +1301,10 @@ Converter::visit(nir_instr *insn)
   if (!visit(nir_instr_as_alu(insn)))
  return false;
   break;
+   case nir_instr_type_tex:
+  if (!visit(nir_instr_as_tex(insn)))
+ return false;
+  break;
case nir_instr_type_intrinsic:
   if (!visit(nir_instr_as_intrinsic(insn)))
  return false;
@@ -1866,6 +1906,185 @@ Converter::visit(nir_ssa_undef_instr *insn)
return true;
 }
 
+#define CASE_SAMPLER(ty) \
+   case GLSL_SAMPLER_DIM_ ## ty : \
+  if (isArray && !isShadow) \
+ return TEX_TARGET_ ## ty ## _ARRAY; \
+  else if (!isArray && isShadow) \
+ return TEX_TARGET_## ty ## _SHADOW; \
+  else if (isArray && isShadow) \
+ return TEX_TARGET_## ty ## _ARRAY_SHADOW; \
+  else \
+ return TEX_TARGET_ ## ty
+
+TexTarget
+Converter::convert(glsl_sampler_dim dim, bool isArray, bool isShadow)
+{
+   switch (dim) {
+   CASE_SAMPLER(1D);
+   CASE_SAMPLER(2D);
+   CASE_SAMPLER(CUBE);
+   case GLSL_SAMPLER_DIM_3D:
+  return TEX_TARGET_3D;
+   case GLSL_SAMPLER_DIM_MS:
+  if (isArray)
+ return TEX_TARGET_2D_MS_ARRAY;
+  return TEX_TARGET_2D_MS;
+   case GLSL_SAMPLER_DIM_RECT:
+  if (isShadow)
+ return TEX_TARGET_RECT_SHADOW;
+  return TEX_TARGET_RECT;
+   case GLSL_SAMPLER_DIM_BUF:
+  return TEX_TARGET_BUFFER;
+   case GLSL_SAMPLER_DIM_EXTERNAL:
+  return TEX_TARGET_2D;
+   default:
+  ERROR("unknown glsl_sampler_dim %u\n", dim);
+  assert(false);
+  return TEX_TARGET_COUNT;
+   }
+}
+#undef CASE_SAMPLER
+
+Value*
+Converter::applyProjection(Value *src, Value *proj)
+{
+   if (!proj)
+  return src;
+   return mkOp2v(OP_MUL, TYPE_F32, getScratch(), src, proj);
+}
+
+bool
+Converter::visit(nir_tex_instr *insn)
+{
+   switch (insn->op) {
+   case nir_texop_lod:
+   case nir_texop_query_levels:
+   case nir_texop_tex:
+   case nir_texop_texture_samples:
+   case nir_texop_tg4:
+   case nir_texop_txb:
+   case nir_texop_txd:
+   case nir_texop_txf:
+   case nir_texop_txf_ms:
+   case nir_texop_txl:
+   case nir_texop_txs: {
+  LValues &newDefs = convert(&insn->dest);
+  std::vector srcs;
+  std::vector defs;
+  std::vector offsets;
+  uint8_t mask = 0;
+  bool lz = false;
+  Value *proj = nullptr;
+  TexInstruction::Target target = convert(insn->sampler_dim, insn->is_array, insn->is_shadow);
+  operation op = getOperation(insn->op);
+
+  int biasIdx = nir_tex_instr_src_index(insn, 

[Mesa-dev] [PATCH v3 24/30] nvir/nir: implement variable indexing

2018-01-07 Thread Karol Herbst
We store those arrays in local memory and reserve some space for each of
them. The arrays are stored in a packed format, because the context of each
index is easy to determine. The TGSI path does not do this so far.

This exposes various issues in the MemoryOpt pass, because ld/st with
indirects are no longer guaranteed to be aligned to 0x10.

v3: use fixed size vec4 arrays until we fix MemoryOpt

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 60 ++
 1 file changed, 60 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index ede6ee0119..82f388c2ac 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -73,6 +73,7 @@ public:
typedef decltype(nir_ssa_def().index) NirSSADefIdx;
typedef decltype(nir_ssa_def().bit_size) NirSSADefBitSize;
typedef std::unordered_map NirDefMap;
+   typedef std::unordered_map NirArrayLMemOffsets;
typedef std::unordered_map 
NirBlockMap;
 
Converter(Program *, nir_shader *, nv50_ir_prog_info *);
@@ -139,6 +140,7 @@ private:
 
NirDefMap ssaDefs;
NirDefMap regDefs;
+   NirArrayLMemOffsets regToLmemOffset;
NirBlockMap blocks;
unsigned int curLoopDepth;
 
@@ -1096,6 +1098,7 @@ bool Converter::assignSlots() {
 bool
 Converter::parseNIR()
 {
+   info->bin.tlsSpace = 0;
info->io.clipDistances = nir->info.clip_distance_array_size;
info->io.cullDistances = nir->info.cull_distance_array_size;
 
@@ -1183,6 +1186,17 @@ Converter::visit(nir_function *function)
   break;
}
 
+   nir_foreach_register(reg, &function->impl->registers) {
+  if (reg->num_array_elems) {
+ // TODO: packed variables would be nice, but MemoryOpt fails
+ // uint32_t size = reg->num_components * reg->num_array_elems * (reg->bit_size / 8);
+ uint32_t size = 4 * reg->num_array_elems * (reg->bit_size / 8);
+ // reserve some lmem
+ regToLmemOffset[reg->index] = info->bin.tlsSpace;
+ info->bin.tlsSpace += size;
+  }
+   }
+
nir_index_ssa_defs(function->impl);
foreach_list_typed(nir_cf_node, node, node, &function->impl->body) {
   if (!visit(node))
@@ -1778,6 +1792,51 @@ Converter::visit(nir_alu_instr *insn)
 *   2. they basically just merge multiple values into one data type
 */
CASE_OPFI(mov):
+  if (!insn->dest.dest.is_ssa && insn->dest.dest.reg.reg->num_array_elems) {
+ nir_reg_dest& reg = insn->dest.dest.reg;
+ auto goffset = regToLmemOffset[reg.reg->index];
+ auto comps = reg.reg->num_components;
+ auto size = reg.reg->bit_size / 8;
+ auto csize = 0x10; // TODO after fixing MemoryOpts: comps * size;
+ auto aoffset = csize * reg.base_offset;
+ Value *indirect = nullptr;
+
+ if (reg.indirect)
+indirect = mkOp2v(OP_MUL, TYPE_U32, getSSA(4, FILE_ADDRESS), getSrc(reg.indirect, 0), mkImm(csize));
+
+ for (auto i = 0u; i < comps; ++i) {
+if (!((1u << i) & insn->dest.write_mask))
+   continue;
+
+Symbol *sym = mkSymbol(FILE_MEMORY_LOCAL, 0, dType, goffset + aoffset + i * size);
+mkStore(OP_STORE, dType, sym, indirect, getSrc(&insn->src[0], i));
+ }
+ break;
+  } else if (!insn->src[0].src.is_ssa && insn->src[0].src.reg.reg->num_array_elems) {
+ LValues &newDefs = convert(&insn->dest);
+ nir_reg_src& reg = insn->src[0].src.reg;
+ auto goffset = regToLmemOffset[reg.reg->index];
+ // auto comps = reg.reg->num_components;
+ auto size = reg.reg->bit_size / 8;
+ auto csize = 0x10; // TODO after fixing MemoryOpts: comps * size;
+ auto aoffset = csize * reg.base_offset;
+ Value *indirect = nullptr;
+
+ if (reg.indirect)
+indirect = mkOp2v(OP_MUL, TYPE_U32, getSSA(4, FILE_ADDRESS), getSrc(reg.indirect, 0), mkImm(csize));
+
+ for (auto i = 0u; i < newDefs.size(); ++i) {
+Symbol *sym = mkSymbol(FILE_MEMORY_LOCAL, 0, dType, goffset + aoffset + i * size);
+mkLoad(dType, newDefs[i], sym, indirect);
+ }
+ break;
+  } else {
+ LValues &newDefs = convert(&insn->dest);
+ for (LValues::size_type c = 0u; c < newDefs.size(); ++c) {
+mkMov(newDefs[c], getSrc(&insn->src[0], c), dType);
+ }
+  }
+  break;
case nir_op_vec2:
case nir_op_vec3:
case nir_op_vec4: {
@@ -2178,6 +2237,7 @@ Converter::run()
   NIR_PASS(progress, nir, nir_opt_dead_cf);
} while (progress);
 
+   NIR_PASS_V(nir, nir_lower_locals_to_regs);
NIR_PASS_V(nir, nir_remove_dead_variables, nir_var_local);
NIR_PASS_V(nir, nir_convert_from_ssa, true);
 
-- 
2.14.3
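
As a rough illustration of the v3 layout described in the commit message (fixed-size vec4 elements, one lmem region per array register), the offset bookkeeping might look like the sketch below. `RegInfo`, `LmemLayout`, `reserveLmem` and `elementAddress` are hypothetical stand-ins for the `nir_register` fields and the `regToLmemOffset` / `tlsSpace` bookkeeping in the patch, and the address helper assumes the 0x10 stride used for 32-bit values.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the nir_register fields the patch reads.
struct RegInfo {
   unsigned index;
   unsigned numArrayElems; // 0 for non-array registers
   unsigned bitSize;
};

// Hypothetical stand-in for regToLmemOffset plus info->bin.tlsSpace.
struct LmemLayout {
   std::unordered_map<unsigned, uint32_t> offsets;
   uint32_t tlsSpace = 0;
};

// Reserve one region per array register, four components per element.
static LmemLayout reserveLmem(const std::vector<RegInfo> &regs)
{
   LmemLayout layout;
   for (const RegInfo &reg : regs) {
      if (!reg.numArrayElems)
         continue;
      uint32_t size = 4 * reg.numArrayElems * (reg.bitSize / 8);
      layout.offsets[reg.index] = layout.tlsSpace;
      layout.tlsSpace += size;
   }
   return layout;
}

// Byte address of one component, using the fixed 0x10 element stride.
static uint32_t elementAddress(const LmemLayout &layout, unsigned regIndex,
                               unsigned elem, unsigned comp, unsigned bitSize)
{
   const uint32_t csize = 0x10; // fixed stride until MemoryOpt handles packing
   return layout.offsets.at(regIndex) + csize * elem + comp * (bitSize / 8);
}
```

With the fixed stride, every element starts on a 16-byte boundary relative to its region, which is the alignment the commit message says MemoryOpt currently depends on.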


[Mesa-dev] [PATCH v3 16/30] nvir/nir: implement nir_intrinsic_store_(per_vertex_)output

2018-01-07 Thread Karol Herbst
v3: add workaround for RA issues
indirects have to be multiplied by 0x10
fix indirect access

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 54 ++
 1 file changed, 54 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 75d74a6379..74edec0c97 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1093,6 +1093,11 @@ Converter::visit(nir_function *function)
 
setPosition(entry, true);
 
+   if (info->io.genUserClip > 0) {
+  for (int c = 0; c < 4; ++c)
+ clipVtx[c] = getScratch();
+   }
+
switch (prog->getType()) {
case Program::TYPE_TESSELLATION_CONTROL:
   outBase = mkOp2v(
@@ -1119,6 +1124,8 @@ Converter::visit(nir_function *function)
bb->cfg.attach(&exit->cfg, Graph::Edge::TREE);
setPosition(exit, true);
 
+   if (info->io.genUserClip > 0)
+  handleUserClipPlanes();
// TODO: for non main function this needs to be a OP_RETURN
mkOp(OP_EXIT, TYPE_NONE, NULL)->terminator = 1;
return true;
@@ -1339,6 +1346,53 @@ Converter::visit(nir_intrinsic_instr *insn)
   }
   break;
}
+   case nir_intrinsic_store_output:
+   case nir_intrinsic_store_per_vertex_output: {
+  Value *indirect;
+  auto idx = nir_intrinsic_base(insn) + getIndirect(&insn->src[op == nir_intrinsic_store_output ? 1 : 2], 0, &indirect);
+  uint8_t offset = insn->const_index[2];
+
+  if (indirect)
+ // we have to multiply with 16
+ mkOp2(OP_MUL, TYPE_U32, indirect, indirect, loadImm(getScratch(), 16));
+
+  for (auto i = 0u; i < insn->num_components; ++i) {
+ if (!((1u << i) & nir_intrinsic_write_mask(insn)))
+continue;
+
+ Value *src = getSrc(&insn->src[0], i);
+ switch (prog->getType()) {
+ case Program::TYPE_FRAGMENT: {
+if (info->out[idx].sn == TGSI_SEMANTIC_POSITION) {
+   // TGSI uses a different interface than NIR, TGSI stores that value in the z component, NIR in X
+   offset += 2;
+   src = mkOp1v(OP_SAT, TYPE_F32, getScratch(), src);
+}
+break;
+ }
+ case Program::TYPE_VERTEX: {
+if (info->io.genUserClip > 0) {
+   mkMov(clipVtx[i], src);
+   src = clipVtx[i];
+}
+break;
+ }
+ default:
+break;
+ }
+
+ assert(i + offset < 4);
+ uint32_t address = info->out[idx].slot[i + offset];
+
+ // TODO: RA doesn't like exports without moving the sources...
+ mkStore(OP_EXPORT,
+ TYPE_F32,
+ mkSymbol(FILE_SHADER_OUTPUT, 0, TYPE_U32, address * 4),
+ indirect,
+ mkMov(getSSA(), src)->getDef(0))->perPatch = info->out[idx].patch;
+  }
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.14.3
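
To make the addressing in the store_output hunk easier to follow, here is a small standalone sketch of the per-component loop: components not set in the write mask are skipped, the slot number is scaled by 4 to get a byte address, and an indirect index is scaled by 16 (0x10). `ExportAddr`, `exportAddresses` and `scaleIndirect` are hypothetical names for illustration only, and the slot table is a plain array rather than `info->out[idx].slot`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record of one export: which component, at which byte address.
struct ExportAddr {
   unsigned component;
   uint32_t byteAddress;
};

// Walk the components, honour the write mask, and turn slots into addresses.
static std::vector<ExportAddr>
exportAddresses(const uint8_t (&slot)[4], unsigned numComponents,
                unsigned writeMask, unsigned offset)
{
   std::vector<ExportAddr> out;
   for (unsigned i = 0; i < numComponents; ++i) {
      if (!((1u << i) & writeMask))
         continue;
      assert(i + offset < 4);
      out.push_back({i, uint32_t(slot[i + offset]) * 4});
   }
   return out;
}

// An indirect index addresses whole vec4 slots, hence the factor of 16.
static uint32_t scaleIndirect(uint32_t index)
{
   return index * 16;
}
```

The factor of 16 is simply four 32-bit components per slot, which is why the load paths elsewhere in the series can use a shift by 4 instead.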



[Mesa-dev] [PATCH v3 18/30] nvir/nir: implement intrinsic_discard(_if)

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 34dbe86551..fe4a561975 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1425,6 +1425,21 @@ Converter::visit(nir_intrinsic_instr *insn)
   }
   break;
}
+   case nir_intrinsic_discard:
+  mkOp(OP_DISCARD, TYPE_NONE, NULL);
+  break;
+   case nir_intrinsic_discard_if: {
+  // we get a nir boolean value
+  Value *pred = new_LValue(func, FILE_PREDICATE);
+  if (insn->num_components > 1) {
+ ERROR("nir_intrinsic_discard_if only with 1 component supported!\n");
+ assert(false);
+ return false;
+  }
+  mkCmp(OP_SET, CC_NE, TYPE_U8, pred, TYPE_U32, getSrc(&insn->src[0], 0), zero);
+  mkOp(OP_DISCARD, TYPE_NONE, NULL)->setPredicate(CC_P, pred);
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.14.3



[Mesa-dev] [PATCH v3 22/30] nvir/nir: add getOperation for intrinsics

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 24 ++
 1 file changed, 24 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 3fe91462c7..ae9604bfc2 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -125,9 +125,11 @@ public:
DataType getSType(nir_src&, bool isFloat, bool isSigned);
 
operation getOperation(nir_op);
+   operation getOperation(nir_intrinsic_op);
operation getOperation(nir_texop);
operation preOperationNeeded(nir_op);
int getSubOp(nir_op);
+   int getSubOp(nir_intrinsic_op);
CondCode getCondCode(nir_op);
 
// tex stuff
@@ -387,6 +389,17 @@ Converter::getOperation(nir_texop op)
}
 }
 
+operation
+Converter::getOperation(nir_intrinsic_op op)
+{
+   switch (op) {
+   default:
+  ERROR("couldn't get operation for nir_intrinsic_op %u\n", op);
+  assert(false);
+  return OP_NOP;
+   }
+}
+
 operation
 Converter::preOperationNeeded(nir_op op)
 {
@@ -409,6 +422,17 @@ Converter::getSubOp(nir_op op)
}
 }
 
+int
+Converter::getSubOp(nir_intrinsic_op op)
+{
+   switch (op) {
+   default:
+  ERROR("couldn't get subop for nir_intrinsic_op %u\n", op);
+  assert(false);
+  return 0;
+   }
+}
+
 CondCode
 Converter::getCondCode(nir_op op)
 {
-- 
2.14.3



[Mesa-dev] [PATCH v3 25/30] nvir/nir: implement geometry shader nir_intrinsics

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 31 ++
 1 file changed, 31 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 82f388c2ac..905a18d9d9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -395,6 +395,10 @@ operation
 Converter::getOperation(nir_intrinsic_op op)
 {
switch (op) {
+   case nir_intrinsic_emit_vertex:
+  return OP_EMIT;
+   case nir_intrinsic_end_primitive:
+  return OP_RESTART;
default:
   ERROR("couldn't get operation for nir_intrinsic_op %u\n", op);
   assert(false);
@@ -1651,6 +1655,33 @@ Converter::visit(nir_intrinsic_instr *insn)
   mkOp3(OP_SHFL, dType, newDefs[0], getSrc(&insn->src[0], 0), tmp, mkImm(0x1f))->subOp = NV50_IR_SUBOP_SHFL_IDX;
   break;
}
+   case nir_intrinsic_load_per_vertex_input: {
+  const DataType dType = getDType(insn);
+  LValues &newDefs = convert(&insn->dest);
+  Value *indirectVertex;
+  Value *indirectOffset;
+  auto baseVertex = getIndirect(&insn->src[0], 0, &indirectVertex);
+  auto baseOffset = getIndirect(&insn->src[1], 0, &indirectOffset);
+
+  auto idx = nir_intrinsic_base(insn) + baseOffset;
+  Value *vtxBase = mkOp2v(OP_PFETCH, TYPE_U32, getSSA(4, FILE_ADDRESS), mkImm(baseVertex), indirectVertex);
+  if (indirectOffset)
+ indirectOffset = mkOp2v(OP_SHL, TYPE_U32, getSSA(4, FILE_ADDRESS), indirectOffset, mkImm(4));
+  for (auto i = 0u; i < insn->num_components; ++i) {
+ uint32_t address = info->in[idx].slot[nir_intrinsic_component(insn) + i] * 4;
+ Symbol *sym = mkSymbol(FILE_SHADER_INPUT, 0, dType, address);
+ Instruction *ld = mkLoad(dType, newDefs[i], sym, indirectOffset);
+ ld->setIndirect(0, 1, vtxBase);
+ ld->perPatch = info->in[idx].patch;
+  }
+  break;
+   }
+   case nir_intrinsic_emit_vertex:
+   case nir_intrinsic_end_primitive: {
+  auto idx = nir_intrinsic_stream_id(insn);
+  mkOp1(getOperation(op), TYPE_U32, NULL, mkImm(idx))->fixed = 1;
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.14.3



[Mesa-dev] [PATCH v3 15/30] nvir/nir: implement nir_intrinsic_load_uniform

2018-01-07 Thread Karol Herbst
v2: use new getIndirect helper
fixes symbols for 64 bit types

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 0b67499ff6..75d74a6379 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1323,6 +1323,22 @@ Converter::visit(nir_intrinsic_instr *insn)
nir_intrinsic_op op = insn->intrinsic;
 
switch (op) {
+   case nir_intrinsic_load_uniform: {
+  LValues &newDefs = convert(&insn->dest);
+  auto offset = nir_intrinsic_base(insn);
+  const DataType dType = getDType(insn);
+  auto dTypeSize = std::max(4u, typeSizeof(dType));
+  Value *indirect;
+  auto coffset = offset + getIndirect(&insn->src[0], 0, &indirect);
+  if (indirect)
+ // we have to multiply with 16
+ mkOp2(OP_MUL, TYPE_U32, indirect, indirect, loadImm(getScratch(), 16));
+
+  for (auto i = 0; i < insn->num_components; ++i) {
+ mkLoad(dType, newDefs[i], mkSymbol(FILE_MEMORY_CONST, 0, dType, (coffset * 4 + i * (dTypeSize / 4)) * 4), nullptr)->setIndirect(0, 0, indirect);
+  }
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.14.3
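
The symbol offset in the load_uniform hunk is dense enough that a worked example may help. The sketch below reproduces only the arithmetic `(coffset * 4 + i * (dTypeSize / 4)) * 4`, with `dTypeSize` clamped to at least 4 bytes as in the patch. `uniformByteOffset` is a hypothetical name, and reading `coffset` as a count of vec4 slots is my interpretation of the expression, not something the patch states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Byte offset of component i for a uniform starting at slot coffset.
// typeSize is the element size in bytes (for example 4 for F32, 8 for F64).
static uint32_t uniformByteOffset(uint32_t coffset, unsigned i,
                                  unsigned typeSize)
{
   unsigned dTypeSize = std::max(4u, typeSize);
   return (coffset * 4 + i * (dTypeSize / 4)) * 4;
}
```

For 64-bit types each component advances by two 32-bit words, which appears to be what the "fixes symbols for 64 bit types" note in the changelog refers to.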



[Mesa-dev] [PATCH v3 20/30] nvir/nir: implement nir_ssa_undef_instr

2018-01-07 Thread Karol Herbst
v2: use mkOp

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 21554bb50e..3dbe719a77 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -108,6 +108,7 @@ public:
bool visit(nir_jump_instr *);
bool visit(nir_load_const_instr*);
bool visit(nir_loop *);
+   bool visit(nir_ssa_undef_instr *);
 
bool run();
 
@@ -1276,6 +1277,10 @@ Converter::visit(nir_instr *insn)
   if (!visit(nir_instr_as_jump(insn)))
  return false;
   break;
+   case nir_instr_type_ssa_undef:
+  if (!visit(nir_instr_as_ssa_undef(insn)))
+ return false;
+  break;
default:
   ERROR("unknown nir_instr type %u\n", insn->type);
   return false;
@@ -1851,6 +1856,16 @@ Converter::visit(nir_alu_instr *insn)
 }
 #undef DEFAULT_CHECKS
 
+bool
+Converter::visit(nir_ssa_undef_instr *insn)
+{
+   LValues &newDefs = convert(&insn->def);
+   for (auto i = 0u; i < insn->def.num_components; ++i) {
+  mkOp(OP_NOP, TYPE_NONE, newDefs[i]);
+   }
+   return true;
+}
+
 bool
 Converter::run()
 {
-- 
2.14.3



[Mesa-dev] [PATCH v3 17/30] nvir/nir: implement nir_intrinsic_load_input

2018-01-07 Thread Karol Herbst
v3: and load_output

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 32 ++
 1 file changed, 32 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 74edec0c97..34dbe86551 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1393,6 +1393,38 @@ Converter::visit(nir_intrinsic_instr *insn)
   }
   break;
}
+   case nir_intrinsic_load_input:
+   case nir_intrinsic_load_output: {
+  const DataType dType = getDType(insn);
+  Value *indirect;
+  bool input = op == nir_intrinsic_load_input;
+
+  LValues &newDefs = convert(&insn->dest);
+  auto idx = nir_intrinsic_base(insn) + getIndirect(&insn->src[0], 0, &indirect);
+  uint8_t offset = insn->const_index[1];
+  nv50_ir_varying& vary = input ? info->in[idx] : info->out[idx];
+
+  if (indirect)
+ indirect = mkOp2v(OP_SHL, TYPE_U32, getSSA(4, FILE_ADDRESS), indirect, mkImm(4));
+
+  for (auto i = 0u; i < insn->num_components; ++i) {
+ assert(i + offset < 4);
+ uint32_t address = vary.slot[i + offset];
+ Symbol *sym = mkSymbol(input ? FILE_SHADER_INPUT : FILE_SHADER_OUTPUT, 0, dType, address * 4);
+ switch(prog->getType()) {
+ case Program::TYPE_FRAGMENT: {
+operation op;
+auto mode = translateInterpMode(&vary, op);
+mkOp2(op, TYPE_F32, newDefs[i], sym, op == OP_PINTERP ? fp.position : nullptr)->setInterpolate(mode);
+break;
+ }
+ default:
+mkLoad(dType, newDefs[i], sym, indirect)->perPatch = vary.patch;
+break;
+ }
+  }
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.14.3



[Mesa-dev] [PATCH v3 14/30] nvir/nir: implement nir_alu_instr handling

2018-01-07 Thread Karol Herbst
v2: use bitfield_insert instead of bfi
rework switch helper macros
remove some lowering code (LoweringHelper is now used for this)
v3: add pack_half_2x16_split
add unpack_half_2x16_split_x/y
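
For readers unfamiliar with the switch helper macros mentioned in the v2 note, the pattern is token pasting: one macro invocation expands to several `case` labels that share a return value. The sketch below shows it on a deliberately tiny, made-up opcode set. `MiniNirOp`, `MiniOp` and their enumerators are hypothetical and are not the real NIR or nv50_ir enums.

```cpp
#include <cassert>

// Hypothetical miniature opcode sets, for illustration only.
enum MiniNirOp {
   mini_op_fadd, mini_op_iadd,
   mini_op_fmax, mini_op_imax, mini_op_umax,
   mini_op_fceil
};
enum MiniOp { MINI_OP_NOP, MINI_OP_ADD, MINI_OP_MAX, MINI_OP_CEIL };

// Float and signed-int variants share one result.
#define CASE_OPFI_RET(ni, val) \
   case mini_op_f ## ni : \
   case mini_op_i ## ni : \
      return val
// Float, signed-int and unsigned-int variants share one result.
#define CASE_OPFIU_RET(ni, val) \
   case mini_op_f ## ni : \
   case mini_op_i ## ni : \
   case mini_op_u ## ni : \
      return val

static MiniOp getOperation(MiniNirOp op)
{
   switch (op) {
   CASE_OPFI_RET(add, MINI_OP_ADD);
   CASE_OPFIU_RET(max, MINI_OP_MAX);
   case mini_op_fceil:
      return MINI_OP_CEIL;
   default:
      return MINI_OP_NOP;
   }
}
#undef CASE_OPFI_RET
#undef CASE_OPFIU_RET
```

Undefining the macros after the switch keeps the short names from leaking into the rest of the translation unit.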
Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 486 -
 1 file changed, 485 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index d7fe813e65..0b67499ff6 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -31,6 +31,31 @@
 #include 
 #include 
 
+#define CASE_OPFI(ni) \
+   case nir_op_f ## ni : \
+   case nir_op_i ## ni
+#define CASE_OPFIU(ni) \
+   case nir_op_f ## ni : \
+   case nir_op_i ## ni : \
+   case nir_op_u ## ni
+#define CASE_OPIU(ni) \
+   case nir_op_i ## ni : \
+   case nir_op_u ## ni
+
+#define CASE_OPFI_RET(ni, val) \
+   case nir_op_f ## ni : \
+   case nir_op_i ## ni : \
+  return val
+#define CASE_OPFIU_RET(ni, val) \
+   case nir_op_f ## ni : \
+   case nir_op_i ## ni : \
+   case nir_op_u ## ni : \
+  return val
+#define CASE_OPIU_RET(ni, val) \
+   case nir_op_i ## ni : \
+   case nir_op_u ## ni : \
+  return val
+
 static int
 type_size(const struct glsl_type *type)
 {
@@ -72,6 +97,7 @@ public:
bool assignSlots();
bool parseNIR();
 
+   bool visit(nir_alu_instr *);
bool visit(nir_block *);
bool visit(nir_cf_node *);
bool visit(nir_function *);
@@ -94,6 +120,10 @@ public:
std::vector<DataType> getSTypes(nir_alu_instr*);
DataType getSType(nir_src&, bool isFloat, bool isSigned);
 
+   operation getOperation(nir_op);
+   operation preOperationNeeded(nir_op);
+   int getSubOp(nir_op);
+   CondCode getCondCode(nir_op);
 private:
nir_shader *nir;
 
@@ -103,6 +133,7 @@ private:
unsigned int curLoopDepth;
 
BasicBlock *exit;
+   Value *zero;
 
union {
   struct {
@@ -114,7 +145,10 @@ private:
 Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info)
: ConverterCommon(prog, info),
  nir(nir),
- curLoopDepth(0) {}
+ curLoopDepth(0)
+{
+   zero = mkImm((uint32_t)0);
+}
 
 BasicBlock *
 Converter::convert(nir_block *block)
@@ -232,6 +266,136 @@ Converter::getSType(nir_src &src, bool isFloat, bool isSigned)
return typeOfSize(bitSize / 8, isFloat, isSigned);
 }
 
+operation
+Converter::getOperation(nir_op op)
+{
+   switch (op) {
+   // basic ops with float and int variants
+   CASE_OPFI_RET(abs, OP_ABS);
+   CASE_OPFI_RET(add, OP_ADD);
+   CASE_OPFI_RET(and, OP_AND);
+   CASE_OPFIU_RET(div, OP_DIV);
+   CASE_OPIU_RET(find_msb, OP_BFIND);
+   CASE_OPFIU_RET(max, OP_MAX);
+   CASE_OPFIU_RET(min, OP_MIN);
+   CASE_OPFIU_RET(mod, OP_MOD);
+   CASE_OPFI_RET(mul, OP_MUL);
+   CASE_OPIU_RET(mul_high, OP_MUL);
+   CASE_OPFI_RET(neg, OP_NEG);
+   CASE_OPFI_RET(not, OP_NOT);
+   CASE_OPFI_RET(or, OP_OR);
+   CASE_OPFI_RET(eq, OP_SET);
+   CASE_OPFIU_RET(ge, OP_SET);
+   CASE_OPFIU_RET(lt, OP_SET);
+   CASE_OPFI_RET(ne, OP_SET);
+   CASE_OPIU_RET(shr, OP_SHR);
+   CASE_OPFI_RET(sub, OP_SUB);
+   CASE_OPFI_RET(xor, OP_XOR);
+   case nir_op_fceil:
+  return OP_CEIL;
+   case nir_op_fcos:
+  return OP_COS;
+   case nir_op_f2f32:
+   case nir_op_f2f64:
+   case nir_op_f2i32:
+   case nir_op_f2i64:
+   case nir_op_f2u32:
+   case nir_op_f2u64:
+   case nir_op_i2f32:
+   case nir_op_i2f64:
+   case nir_op_i2i32:
+   case nir_op_i2i64:
+   case nir_op_u2f32:
+   case nir_op_u2f64:
+   case nir_op_u2u32:
+   case nir_op_u2u64:
+  return OP_CVT;
+   case nir_op_fddx:
+   case nir_op_fddx_coarse:
+   case nir_op_fddx_fine:
+  return OP_DFDX;
+   case nir_op_fddy:
+   case nir_op_fddy_coarse:
+   case nir_op_fddy_fine:
+  return OP_DFDY;
+   case nir_op_fexp2:
+  return OP_EX2;
+   case nir_op_ffloor:
+  return OP_FLOOR;
+   case nir_op_ffma:
+  return OP_FMA;
+   case nir_op_flog2:
+  return OP_LG2;
+   case nir_op_pack_64_2x32_split:
+  return OP_MERGE;
+   case nir_op_frcp:
+  return OP_RCP;
+   case nir_op_frsq:
+  return OP_RSQ;
+   case nir_op_fsat:
+  return OP_SAT;
+   case nir_op_ishl:
+  return OP_SHL;
+   case nir_op_fsin:
+  return OP_SIN;
+   case nir_op_fsqrt:
+  return OP_SQRT;
+   case nir_op_ftrunc:
+  return OP_TRUNC;
+   default:
+  ERROR("couldn't get operation for op %s\n", nir_op_infos[op].name);
+  assert(false);
+  return OP_NOP;
+   }
+}
+
+operation
+Converter::preOperationNeeded(nir_op op)
+{
+   switch (op) {
+   case nir_op_fcos:
+   case nir_op_fsin:
+  return OP_PRESIN;
+   default:
+  return OP_NOP;
+   }
+}
+
+int
+Converter::getSubOp(nir_op op)
+{
+   switch (op) {
+   CASE_OPIU_RET(mul_high, NV50_IR_SUBOP_MUL_HIGH);
+   default:
+  return 0;
+   }
+}
+
+CondCode
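
Side note for readers unfamiliar with the CASE_OP* helpers in this patch:
they rely on preprocessor token pasting, so a single line covers the float,
signed and unsigned variants of an op. Below is a minimal standalone sketch
of the same trick. The demo_* enums and function names are made up for
illustration and are not Mesa's.

```cpp
#include <cassert>

// Hypothetical stand-ins for nir_op and nv50_ir::operation; the real enums
// live in Mesa and are much larger.
enum demo_op {
   demo_op_fadd,
   demo_op_iadd,
   demo_op_imax,
   demo_op_umax,
   demo_op_fsin
};
enum demo_operation { DEMO_NOP, DEMO_ADD, DEMO_MAX, DEMO_SIN };

// Same shape as CASE_OPFI_RET / CASE_OPIU_RET in the patch: paste the type
// prefix onto the base name and share one return value.
#define DEMO_CASE_FI_RET(ni, val) \
   case demo_op_f ## ni : \
   case demo_op_i ## ni : \
      return val
#define DEMO_CASE_IU_RET(ni, val) \
   case demo_op_i ## ni : \
   case demo_op_u ## ni : \
      return val

static demo_operation
demoGetOperation(demo_op op)
{
   switch (op) {
   DEMO_CASE_FI_RET(add, DEMO_ADD);   // demo_op_fadd, demo_op_iadd
   DEMO_CASE_IU_RET(max, DEMO_MAX);   // demo_op_imax, demo_op_umax
   case demo_op_fsin:
      return DEMO_SIN;
   default:
      // Like the patch: complain loudly about ops without a mapping.
      assert(false && "no operation for this op");
      return DEMO_NOP;
   }
}
```

The trailing semicolon at each use site terminates the return statement the
macro expands to, which is why the macro bodies themselves end without one.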

[Mesa-dev] [PATCH v3 23/30] nvir/nir: implement vote and ballot

2018-01-07 Thread Karol Herbst
v2: add vote_eq support
use the new subop intrinsic helper
add ballot
v3: add read_(first_)invocation

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 41 ++
 1 file changed, 41 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index ae9604bfc2..ede6ee0119 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -426,6 +426,12 @@ int
 Converter::getSubOp(nir_intrinsic_op op)
 {
switch (op) {
+   case nir_intrinsic_vote_all:
+  return NV50_IR_SUBOP_VOTE_ALL;
+   case nir_intrinsic_vote_any:
+  return NV50_IR_SUBOP_VOTE_ANY;
+   case nir_intrinsic_vote_eq:
+  return NV50_IR_SUBOP_VOTE_UNI;
default:
   ERROR("couldn't get subop for nir_intrinsic_op %u\n", op);
   assert(false);
@@ -1596,6 +1602,41 @@ Converter::visit(nir_intrinsic_instr *insn)
   }
   break;
}
+   case nir_intrinsic_vote_all:
+   case nir_intrinsic_vote_any:
+   case nir_intrinsic_vote_eq: {
+  LValues &newDefs = convert(&insn->dest);
+  Value *pred = new_LValue(func, FILE_PREDICATE);
+  mkCmp(OP_SET, CC_NE, TYPE_U32, pred, TYPE_U32, getSrc(&insn->src[0], 0), zero);
+  mkOp1(OP_VOTE, TYPE_U32, pred, pred)->subOp = getSubOp(op);
+  mkCvt(OP_CVT, TYPE_U32, newDefs[0], TYPE_U8, pred);
+  break;
+   }
+   case nir_intrinsic_ballot: {
+  LValues &newDefs = convert(&insn->dest);
+  Value *pred = new_LValue(func, FILE_PREDICATE);
+  mkCmp(OP_SET, CC_NE, TYPE_U32, pred, TYPE_U32, getSrc(&insn->src[0], 0), zero);
+  Instruction *ballot = mkOp1(OP_VOTE, TYPE_U32, getSSA(), pred);
+  ballot->subOp = NV50_IR_SUBOP_VOTE_ANY;
+  mkOp2(OP_MERGE, TYPE_U64, newDefs[0], ballot->getDef(0), loadImm(getSSA(), 0));
+  break;
+   }
+   case nir_intrinsic_read_first_invocation:
+   case nir_intrinsic_read_invocation: {
+  LValues &newDefs = convert(&insn->dest);
+  const DataType dType = getDType(insn);
+  Value *tmp = getScratch();
+
+  if (op == nir_intrinsic_read_first_invocation) {
+ mkOp1(OP_VOTE, TYPE_U32, tmp, mkImm(1))->subOp = NV50_IR_SUBOP_VOTE_ANY;
+ mkOp2(OP_EXTBF, TYPE_U32, tmp, tmp, mkImm(0x2000))->subOp = NV50_IR_SUBOP_EXTBF_REV;
+ mkOp1(OP_BFIND, TYPE_U32, tmp, tmp)->subOp = NV50_IR_SUBOP_BFIND_SAMT;
+  } else
+ tmp = getSrc(&insn->src[1], 0);
+
+  mkOp3(OP_SHFL, dType, newDefs[0], getSrc(&insn->src[0], 0), tmp, mkImm(0x1f))->subOp = NV50_IR_SUBOP_SHFL_IDX;
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.14.3
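
A note on the read_first_invocation sequence in this patch, as I read it:
the vote yields a ballot mask of active lanes, the EXTBF with the REV subop
reverses the bit order, and BFIND with the SAMT subop returns the left-shift
amount needed to bring the highest set bit to bit 31. Combined, that gives
the index of the lowest active lane. The CPU sketch below models this under
those assumptions about the subops; the demo* function names are invented
for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the bit order of a 32-bit word. This is what the EXTBF + REV step
// is assumed to do for offset 0 and width 32 (the 0x2000 immediate).
static uint32_t
demoReverseBits(uint32_t v)
{
   uint32_t r = 0;
   for (int i = 0; i < 32; ++i) {
      r = (r << 1) | (v & 1u);
      v >>= 1;
   }
   return r;
}

// Number of positions the highest set bit must be shifted left to reach
// bit 31. This is the assumed meaning of BFIND with the SAMT subop.
static uint32_t
demoShiftAmountToMsb(uint32_t v)
{
   assert(v != 0); // at least one lane must be active
   uint32_t n = 0;
   while (!(v & 0x80000000u)) {
      v <<= 1;
      ++n;
   }
   return n;
}

// Index of the lowest set bit of the ballot mask, computed the same way as
// the instruction sequence: reverse, then measure the distance to bit 31.
static uint32_t
demoFirstActiveLane(uint32_t ballot)
{
   return demoShiftAmountToMsb(demoReverseBits(ballot));
}
```

Reversing first turns "lowest set bit" into "highest set bit", which is the
question the find-MSB style instruction can answer directly.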



[Mesa-dev] [PATCH v3 19/30] nvir/nir: implement loading system values

2018-01-07 Thread Karol Herbst
v2: support more sys values
fixed a bug where for multi component reads all values ended up in x
v3: add load_patch_vertices_in

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 87 ++
 1 file changed, 87 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index fe4a561975..21554bb50e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -80,6 +80,7 @@ public:
LValues& convert(nir_alu_dest *);
BasicBlock* convert(nir_block *);
LValues& convert(nir_dest *);
+   SVSemantic convert(nir_intrinsic_op);
LValues& convert(nir_register *);
LValues& convert(nir_ssa_def *);
 
@@ -1324,6 +1325,57 @@ Converter::visit(nir_jump_instr *insn)
return true;
 }
 
+SVSemantic
+Converter::convert(nir_intrinsic_op intr)
+{
+   switch (intr) {
+   case nir_intrinsic_load_base_vertex:
+  return SV_BASEVERTEX;
+   case nir_intrinsic_load_base_instance:
+  return SV_BASEINSTANCE;
+   case nir_intrinsic_load_draw_id:
+  return SV_DRAWID;
+   case nir_intrinsic_load_front_face:
+  return SV_FACE;
+   case nir_intrinsic_load_instance_id:
+  return SV_INSTANCE_ID;
+   case nir_intrinsic_load_invocation_id:
+  return SV_INVOCATION_ID;
+   case nir_intrinsic_load_local_group_size:
+  return SV_NTID;
+   case nir_intrinsic_load_local_invocation_id:
+  return SV_TID;
+   case nir_intrinsic_load_num_work_groups:
+  return SV_NCTAID;
+   case nir_intrinsic_load_patch_vertices_in:
+  return SV_VERTEX_COUNT;
+   case nir_intrinsic_load_primitive_id:
+  return SV_PRIMITIVE_ID;
+   case nir_intrinsic_load_sample_id:
+  return SV_SAMPLE_INDEX;
+   case nir_intrinsic_load_sample_mask_in:
+  return SV_SAMPLE_MASK;
+   case nir_intrinsic_load_sample_pos:
+  return SV_SAMPLE_POS;
+   case nir_intrinsic_load_subgroup_invocation:
+  return SV_LANEID;
+   case nir_intrinsic_load_tess_coord:
+  return SV_TESS_COORD;
+   case nir_intrinsic_load_tess_level_inner:
+  return SV_TESS_INNER;
+   case nir_intrinsic_load_tess_level_outer:
+  return SV_TESS_OUTER;
+   case nir_intrinsic_load_vertex_id:
+  return SV_VERTEX_ID;
+   case nir_intrinsic_load_work_group_id:
+  return SV_CTAID;
+   default:
+  ERROR("unknown SVSemantic for nir_intrinsic_op %s\n", nir_intrinsic_infos[intr].name);
+  assert(false);
+  return SV_LAST;
+   }
+}
+
 bool
 Converter::visit(nir_intrinsic_instr *insn)
 {
@@ -1440,6 +1492,41 @@ Converter::visit(nir_intrinsic_instr *insn)
   mkOp(OP_DISCARD, TYPE_NONE, NULL)->setPredicate(CC_P, pred);
   break;
}
+   case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_base_instance:
+   case nir_intrinsic_load_draw_id:
+   case nir_intrinsic_load_front_face:
+   case nir_intrinsic_load_instance_id:
+   case nir_intrinsic_load_invocation_id:
+   case nir_intrinsic_load_local_group_size:
+   case nir_intrinsic_load_local_invocation_id:
+   case nir_intrinsic_load_num_work_groups:
+   case nir_intrinsic_load_patch_vertices_in:
+   case nir_intrinsic_load_primitive_id:
+   case nir_intrinsic_load_sample_id:
+   case nir_intrinsic_load_sample_mask_in:
+   case nir_intrinsic_load_sample_pos:
+   case nir_intrinsic_load_subgroup_invocation:
+   case nir_intrinsic_load_tess_coord:
+   case nir_intrinsic_load_tess_level_inner:
+   case nir_intrinsic_load_tess_level_outer:
+   case nir_intrinsic_load_vertex_id:
+   case nir_intrinsic_load_work_group_id: {
+  SVSemantic sv = convert(op);
+  LValues &newDefs = convert(&insn->dest);
+
+  for (auto i = 0u; i < insn->num_components; ++i) {
+ if (sv == SV_TID && info->prop.cp.numThreads[i] == 1) {
+loadImm(newDefs[i], 0u);
+continue;
+ }
+ Symbol *sym = mkSysVal(sv, i);
+ Instruction *rdsv = mkOp1(OP_RDSV, TYPE_U32, newDefs[i], sym);
+ if (sv == SV_TESS_OUTER || sv == SV_TESS_INNER)
+rdsv->perPatch = 1;
+  }
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.14.3
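
A note on the SV_TID special case in the loop above: when the work group has
size 1 along a dimension, the only possible local invocation id in that
dimension is 0, so the patch emits an immediate instead of an RDSV. A tiny
sketch of that decision follows. The DemoSource enum and demoTidSource are
hypothetical names, not part of codegen.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag for how a thread id component would be produced.
enum DemoSource { DEMO_IMM_ZERO, DEMO_READ_SYSVAL };

// numThreads plays the role of info->prop.cp.numThreads in the patch.
static DemoSource
demoTidSource(const uint16_t numThreads[3], unsigned component)
{
   assert(component < 3); // thread ids have x, y and z only
   // A dimension of size 1 has exactly one valid id, namely 0.
   return numThreads[component] == 1 ? DEMO_IMM_ZERO : DEMO_READ_SYSVAL;
}
```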



[Mesa-dev] [PATCH v3 13/30] nvir/nir: add skeleton for nir_intrinsic_instr

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp  | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index b130fc696b..d7fe813e65 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -77,6 +77,7 @@ public:
bool visit(nir_function *);
bool visit(nir_if *);
bool visit(nir_instr *);
+   bool visit(nir_intrinsic_instr *);
bool visit(nir_jump_instr *);
bool visit(nir_load_const_instr*);
bool visit(nir_loop *);
@@ -1087,6 +1088,10 @@ bool
 Converter::visit(nir_instr *insn)
 {
switch (insn->type) {
+   case nir_instr_type_intrinsic:
+  if (!visit(nir_instr_as_intrinsic(insn)))
+ return false;
+  break;
case nir_instr_type_load_const:
   if (!visit(nir_instr_as_load_const(insn)))
  return false;
@@ -1144,6 +1149,20 @@ Converter::visit(nir_jump_instr *insn)
return true;
 }
 
+bool
+Converter::visit(nir_intrinsic_instr *insn)
+{
+   nir_intrinsic_op op = insn->intrinsic;
+
+   switch (op) {
+   default:
+  ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
+  return false;
+   }
+
+   return true;
+}
+
 bool
 Converter::run()
 {
-- 
2.14.3



[Mesa-dev] [PATCH v3 02/30] nvir: print the shader type when dumping headers

2018-01-07 Thread Karol Herbst
this makes debugging the shader header a little easier

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
index e6157f550d..fd65859516 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
@@ -541,6 +541,7 @@ nvc0_program_dump(struct nvc0_program *prog)
unsigned pos;
 
if (prog->type != PIPE_SHADER_COMPUTE) {
+  debug_printf("dumping HDR for type %i\n", prog->type);
   for (pos = 0; pos < ARRAY_SIZE(prog->hdr); ++pos)
  debug_printf("HDR[%02"PRIxPTR"] = 0x%08x\n",
   pos * sizeof(prog->hdr[0]), prog->hdr[pos]);
-- 
2.14.3



[Mesa-dev] [PATCH v3 10/30] nvir/nir: parse NIR shader info

2018-01-07 Thread Karol Herbst
v2: parse a few more fields
v3: add special handling for GL_ISOLINES

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 58 ++
 1 file changed, 58 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 9f581cf3c9..44b208d3f8 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -68,6 +68,7 @@ public:
bool centroid,
unsigned semantics);
bool assignSlots();
+   bool parseNIR();
 
bool run();
 
@@ -825,6 +826,58 @@ bool Converter::assignSlots() {
return info->assignSlots(info) == 0;
 }
 
+bool
+Converter::parseNIR()
+{
+   info->io.clipDistances = nir->info.clip_distance_array_size;
+   info->io.cullDistances = nir->info.cull_distance_array_size;
+
+   switch(prog->getType()) {
+   case Program::TYPE_COMPUTE:
+  info->prop.cp.numThreads[0] = nir->info.cs.local_size[0];
+  info->prop.cp.numThreads[1] = nir->info.cs.local_size[1];
+  info->prop.cp.numThreads[2] = nir->info.cs.local_size[2];
+  info->bin.smemSize = nir->info.cs.shared_size;
+  break;
+   case Program::TYPE_FRAGMENT:
+  info->prop.fp.earlyFragTests = nir->info.fs.early_fragment_tests;
+  info->prop.fp.persampleInvocation =
+ (nir->info.system_values_read & SYSTEM_BIT_SAMPLE_ID) ||
+ (nir->info.system_values_read & SYSTEM_BIT_SAMPLE_POS);
+  info->prop.fp.postDepthCoverage = nir->info.fs.post_depth_coverage;
+  info->prop.fp.usesDiscard = nir->info.fs.uses_discard;
+  info->prop.fp.usesSampleMaskIn = !!(nir->info.system_values_read & SYSTEM_BIT_SAMPLE_MASK_IN);
+  break;
+   case Program::TYPE_GEOMETRY:
+  info->prop.gp.inputPrim = nir->info.gs.input_primitive;
+  info->prop.gp.instanceCount = nir->info.gs.invocations;
+  info->prop.gp.maxVertices = nir->info.gs.vertices_out;
+  info->prop.gp.outputPrim = nir->info.gs.output_primitive;
+  break;
+   case Program::TYPE_TESSELLATION_CONTROL:
+   case Program::TYPE_TESSELLATION_EVAL:
+  if (nir->info.tess.primitive_mode == GL_ISOLINES)
+ info->prop.tp.domain = GL_LINES;
+  else
+ info->prop.tp.domain = nir->info.tess.primitive_mode;
+  info->prop.tp.outputPatchSize = nir->info.tess.tcs_vertices_out;
+  info->prop.tp.outputPrim = nir->info.tess.point_mode ? PIPE_PRIM_POINTS : PIPE_PRIM_TRIANGLES;
+  info->prop.tp.partitioning = (nir->info.tess.spacing + 1) % 3;
+  info->prop.tp.winding = !nir->info.tess.ccw;
+  break;
+   case Program::TYPE_VERTEX:
+  info->prop.vp.usesDrawParameters =
+ (nir->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX)) ||
+ (nir->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE)) ||
+ (nir->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID));
+  break;
+   default:
+  break;
+   }
+
+   return true;
+}
+
 bool
 Converter::run()
 {
@@ -862,6 +915,11 @@ Converter::run()
if (prog->dbgFlags & NV50_IR_DEBUG_BASIC)
   nir_print_shader(nir, stderr);
 
+   if (!parseNIR()) {
+  ERROR("Couldn't parse NIR!\n");
+  return false;
+   }
+
if (!assignSlots()) {
   ERROR("Couldn't assign slots!\n");
   return false;
-- 
2.14.3
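
A note on the "(spacing + 1) % 3" expression in parseNIR(): it converts
between two enum orderings with plain arithmetic. The sketch below spells
out the mapping. The enum values are my assumption of how the GL-side and
gallium-side spacing enums are ordered, and all demo_* names are invented
here, so check the real headers before relying on it.

```cpp
#include <cassert>

// Assumed order of the shader-info side enum (values are an assumption).
enum demo_gl_tess_spacing {
   DEMO_TESS_SPACING_UNSPECIFIED,      // 0
   DEMO_TESS_SPACING_EQUAL,            // 1
   DEMO_TESS_SPACING_FRACTIONAL_ODD,   // 2
   DEMO_TESS_SPACING_FRACTIONAL_EVEN   // 3
};

// Assumed order of the gallium side enum (values are an assumption).
enum demo_pipe_tess_spacing {
   DEMO_PIPE_FRACTIONAL_ODD,    // 0
   DEMO_PIPE_FRACTIONAL_EVEN,   // 1
   DEMO_PIPE_EQUAL              // 2
};

static demo_pipe_tess_spacing
demoRemapSpacing(demo_gl_tess_spacing spacing)
{
   // The sketch rejects UNSPECIFIED; the patch does not special-case it.
   assert(spacing != DEMO_TESS_SPACING_UNSPECIFIED);
   // EQUAL (1) -> 2, FRACTIONAL_ODD (2) -> 0, FRACTIONAL_EVEN (3) -> 1
   return static_cast<demo_pipe_tess_spacing>((spacing + 1) % 3);
}
```

Under these assumed orderings an unspecified spacing (0) would come out as
fractional-even (1), which may or may not be what is wanted.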



[Mesa-dev] [PATCH v3 07/30] nvir/nir: track defs and provide easy access functions

2018-01-07 Thread Karol Herbst
v2: add helper function for indirects

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 125 +
 1 file changed, 125 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index f67d24cc4b..c0e733a67d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -28,6 +28,9 @@
 #include "codegen/nv50_ir_from_common.h"
 #include "codegen/nv50_ir_util.h"
 
+#include <unordered_map>
+#include <vector>
+
 static int
 type_size(const struct glsl_type *type)
 {
@@ -41,17 +44,139 @@ using namespace nv50_ir;
 class Converter : public ConverterCommon
 {
 public:
+   typedef std::vector<LValue*> LValues;
+   typedef decltype(nir_ssa_def().index) NirSSADefIdx;
+   typedef std::unordered_map<NirSSADefIdx, LValues> NirDefMap;
+
Converter(Program *, nir_shader *, nv50_ir_prog_info *);
 
+   LValues& convert(nir_alu_dest *);
+   LValues& convert(nir_dest *);
+   LValues& convert(nir_register *);
+   LValues& convert(nir_ssa_def *);
+
+   // nir_alu_src needs special handling due to neg and abs modifiers
+   Value* getSrc(nir_alu_src *, uint8_t component = 0);
+   Value* getSrc(nir_register *, uint8_t);
+   Value* getSrc(nir_src *, uint8_t, bool indirect = false);
+   Value* getSrc(nir_ssa_def *, uint8_t);
+   uint32_t getIndirect(nir_src *, uint8_t, Value**);
+
bool run();
 private:
nir_shader *nir;
+
+   NirDefMap ssaDefs;
+   NirDefMap regDefs;
 };
 
 Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info)
: ConverterCommon(prog, info),
  nir(nir) {}
 
+Converter::LValues&
+Converter::convert(nir_dest *dest)
+{
+   if (dest->is_ssa)
+  return convert(&dest->ssa);
+   if (dest->reg.indirect) {
+  ERROR("no support for indirects.");
+  assert(false);
+   }
+   return convert(dest->reg.reg);
+}
+
+Converter::LValues&
+Converter::convert(nir_register *reg)
+{
+   NirDefMap::iterator it = regDefs.find(reg->index);
+   if (it != regDefs.end())
+  return (*it).second;
+
+   LValues newDef(reg->num_components);
+   for (auto i = 0u; i < reg->num_components; i++)
+  newDef[i] = getScratch(reg->bit_size / 8);
+   return regDefs[reg->index] = newDef;
+}
+
+Converter::LValues&
+Converter::convert(nir_ssa_def *def)
+{
+   NirDefMap::iterator it = ssaDefs.find(def->index);
+   if (it != ssaDefs.end())
+  return (*it).second;
+
+   LValues newDef(def->num_components);
+   for (auto i = 0; i < def->num_components; i++)
+  newDef[i] = getScratch(def->bit_size / 8);
+   return ssaDefs[def->index] = newDef;
+}
+
+Value*
+Converter::getSrc(nir_alu_src *src, uint8_t component)
+{
+   if (src->abs || src->negate) {
+  ERROR("modifiers currently not supported on nir_alu_src\n");
+  assert(false);
+   }
+   return getSrc(&src->src, src->swizzle[component]);
+}
+
+Value*
+Converter::getSrc(nir_register *reg, uint8_t idx)
+{
+   NirDefMap::iterator it = regDefs.find(reg->index);
+   if (it == regDefs.end()) {
+  ERROR("Register %u not found\n", reg->index);
+  assert(false);
+  return nullptr;
+   }
+   return (*it).second[idx];
+}
+
+Value*
+Converter::getSrc(nir_src *src, uint8_t idx, bool indirect)
+{
+   if (src->is_ssa)
+  return getSrc(src->ssa, idx);
+
+   if (src->reg.indirect) {
+  if (indirect)
+ return getSrc(src->reg.indirect, idx);
+  ERROR("no support for indirects.");
+  assert(false);
+  return nullptr;
+   }
+
+   return getSrc(src->reg.reg, idx);
+}
+
+Value*
+Converter::getSrc(nir_ssa_def *src, uint8_t idx)
+{
+   NirDefMap::iterator it = ssaDefs.find(src->index);
+   if (it == ssaDefs.end()) {
+  ERROR("SSA value %u not found\n", src->index);
+  assert(false);
+  return nullptr;
+   }
+   return (*it).second[idx];
+}
+
+uint32_t
+Converter::getIndirect(nir_src *src, uint8_t idx, Value **indirect)
+{
+   nir_const_value *offset = nir_src_as_const_value(*src);
+
+   if (offset) {
+  *indirect = nullptr;
+  return offset->u32[0];
+   }
+
+   *indirect = getSrc(src, idx, true);
+   return 0;
+}
+
 bool
 Converter::run()
 {
-- 
2.14.3
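
For anyone skimming this patch, the def tracking boils down to a
find-or-create map from a NIR index to one value per component, plus a
lookup that fails when a source was never defined. A standalone sketch of
that pattern follows. DemoValue and DemoDefTracker are hypothetical
stand-ins for LValue and the Converter members, and the unique_ptr pool
only exists so the sketch owns its memory.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

struct DemoValue { unsigned size; }; // stand-in for nv50_ir::LValue

class DemoDefTracker {
public:
   typedef std::vector<DemoValue *> Values;

   // Return the values for 'index', allocating them on first use.
   Values &convert(uint32_t index, unsigned numComponents, unsigned bitSize)
   {
      auto it = defs.find(index);
      if (it != defs.end())
         return it->second;

      Values newDef(numComponents);
      for (unsigned i = 0; i < numComponents; ++i)
         newDef[i] = alloc(bitSize / 8);
      return defs[index] = newDef;
   }

   // Lookup only: returns nullptr when 'index' was never defined.
   DemoValue *getSrc(uint32_t index, uint8_t component) const
   {
      auto it = defs.find(index);
      if (it == defs.end())
         return nullptr;
      assert(component < it->second.size());
      return it->second[component];
   }

private:
   DemoValue *alloc(unsigned size)
   {
      pool.push_back(std::unique_ptr<DemoValue>(new DemoValue{size}));
      return pool.back().get();
   }

   std::unordered_map<uint32_t, Values> defs;
   std::vector<std::unique_ptr<DemoValue>> pool;
};
```

Returning a reference into the map is safe here because unordered_map keeps
references to its elements valid across insertions.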



[Mesa-dev] [PATCH v3 00/30] Nir support for Nouveau

2018-01-07 Thread Karol Herbst
significant changes to last series:
* disable support for 64 bit types
* fix tessellation shader bugs
* assume vec4 elements for variable index arrays (MemoryOpts workaround)

piglit run -x glx -x egl -x streaming-texture-leak -x max-texture-size tests/gpu.py:
[26010/26010] skip: 10410, pass: 15386, warn: 9, fail: 191, crash: 14

remaining issues:
* transform feedback with geometry shaders
* indirects in image_load/store
* interpolateAt
* getting 64 bit types to work. This is mainly limited by codegen RA not being
  able to handle those correctly; from_TGSI just generates merges and splits
  and so doesn't hit the faulty paths.

Karol Herbst (30):
  nir: fix st_nir_assign_var_locations for patch variables
  nvir: print the shader type when dumping headers
  nvir: move common converter code in base class
  nvc0: add support for NIR
  nvc0/debug: add env var to make nir default
  nvir/nir: run some passes to make the conversion easier
  nvir/nir: track defs and provide easy access functions
  nvir/nir: add nir type helper functions
  nvir/nir: run assignSlots
  nvir/nir: parse NIR shader info
  nvir/nir: implement CFG handling
  nvir/nir: implement nir_load_const_instr
  nvir/nir: add skeleton for nir_intrinsic_instr
  nvir/nir: implement nir_alu_instr handling
  nvir/nir: implement nir_intrinsic_load_uniform
  nvir/nir: implement nir_intrinsic_store_(per_vertex_)output
  nvir/nir: implement nir_intrinsic_load_input
  nvir/nir: implement intrinsic_discard(_if)
  nvir/nir: implement loading system values
  nvir/nir: implement nir_ssa_undef_instr
  nvir/nir: implement nir_instr_type_tex
  nvir/nir: add getOperation for intrinsics
  nvir/nir: implement vote and ballot
  nvir/nir: implement variable indexing
  nvir/nir: implement geometry shader nir_intrinsics
  nvir/nir: implement nir_intrinsic_load_ubo
  nvir/nir: implement ssbo intrinsics
  nvir/nir: implement images
  nvir/nir: add memory barriers
  nvir/nir: implement load_per_vertex_output

 src/gallium/drivers/nouveau/Makefile.sources   |3 +
 src/gallium/drivers/nouveau/codegen/nv50_ir.cpp|3 +
 src/gallium/drivers/nouveau/codegen/nv50_ir.h  |1 +
 .../nouveau/codegen/nv50_ir_from_common.cpp|  107 +
 .../drivers/nouveau/codegen/nv50_ir_from_common.h  |   58 +
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 2716 
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  106 +-
 src/gallium/drivers/nouveau/meson.build|   12 +-
 src/gallium/drivers/nouveau/nouveau_screen.c   |4 +
 src/gallium/drivers/nouveau/nouveau_screen.h   |2 +
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c|   19 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |   57 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c  |   27 +-
 src/mesa/state_tracker/st_glsl_to_nir.cpp  |8 +-
 14 files changed, 3003 insertions(+), 120 deletions(-)
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.cpp
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.h
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp

-- 
2.14.3



[Mesa-dev] [PATCH v3 03/30] nvir: move common converter code in base class

2018-01-07 Thread Karol Herbst
v2: remove TGSI related bits

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/Makefile.sources   |   2 +
 .../nouveau/codegen/nv50_ir_from_common.cpp| 107 +
 .../drivers/nouveau/codegen/nv50_ir_from_common.h  |  58 +++
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 106 +---
 src/gallium/drivers/nouveau/meson.build|   2 +
 5 files changed, 172 insertions(+), 103 deletions(-)
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.cpp
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.h

diff --git a/src/gallium/drivers/nouveau/Makefile.sources b/src/gallium/drivers/nouveau/Makefile.sources
index 65f08c7d8d..fee5e59522 100644
--- a/src/gallium/drivers/nouveau/Makefile.sources
+++ b/src/gallium/drivers/nouveau/Makefile.sources
@@ -115,6 +115,8 @@ NV50_CODEGEN_SOURCES := \
codegen/nv50_ir_build_util.h \
codegen/nv50_ir_driver.h \
codegen/nv50_ir_emit_nv50.cpp \
+   codegen/nv50_ir_from_common.cpp \
+   codegen/nv50_ir_from_common.h \
codegen/nv50_ir_from_tgsi.cpp \
codegen/nv50_ir_graph.cpp \
codegen/nv50_ir_graph.h \
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.cpp
new file mode 100644
index 0000000000..58e9ab311b
--- /dev/null
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_common.cpp
@@ -0,0 +1,107 @@
+/*
+ * Copyright 2011 Christoph Bumiller
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include "codegen/nv50_ir_from_common.h"
+
+namespace nv50_ir {
+
+ConverterCommon::ConverterCommon(Program *prog, nv50_ir_prog_info *info)
+   :  BuildUtil(prog),
+  info(info) {}
+
+ConverterCommon::Subroutine *
+ConverterCommon::getSubroutine(unsigned ip)
+{
+   std::map<unsigned, Subroutine>::iterator it = sub.map.find(ip);
+
+   if (it == sub.map.end())
+  it = sub.map.insert(std::make_pair(
+  ip, Subroutine(new Function(prog, "SUB", ip)))).first;
+
+   return &it->second;
+}
+
+ConverterCommon::Subroutine *
+ConverterCommon::getSubroutine(Function *f)
+{
+   unsigned ip = f->getLabel();
+   std::map<unsigned, Subroutine>::iterator it = sub.map.find(ip);
+
+   if (it == sub.map.end())
+  it = sub.map.insert(std::make_pair(ip, Subroutine(f))).first;
+
+   return &it->second;
+}
+
+uint8_t
+ConverterCommon::translateInterpMode(const nv50_ir_varying *var, operation& op)
+{
+   uint8_t mode = NV50_IR_INTERP_PERSPECTIVE;
+
+   if (var->flat)
+  mode = NV50_IR_INTERP_FLAT;
+   else
+   if (var->linear)
+  mode = NV50_IR_INTERP_LINEAR;
+   else
+   if (var->sc)
+  mode = NV50_IR_INTERP_SC;
+
+   op = (mode == NV50_IR_INTERP_PERSPECTIVE || mode == NV50_IR_INTERP_SC)
+  ? OP_PINTERP : OP_LINTERP;
+
+   if (var->centroid)
+  mode |= NV50_IR_INTERP_CENTROID;
+
+   return mode;
+}
+
+void
+ConverterCommon::handleUserClipPlanes()
+{
+   Value *res[8];
+   int n, i, c;
+
+   for (c = 0; c < 4; ++c) {
+  for (i = 0; i < info->io.genUserClip; ++i) {
+ Symbol *sym = mkSymbol(FILE_MEMORY_CONST, info->io.auxCBSlot,
+TYPE_F32, info->io.ucpBase + i * 16 + c * 4);
+ Value *ucp = mkLoadv(TYPE_F32, sym, NULL);
+ if (c == 0)
+res[i] = mkOp2v(OP_MUL, TYPE_F32, getScratch(), clipVtx[c], ucp);
+ else
+mkOp3(OP_MAD, TYPE_F32, res[i], clipVtx[c], ucp, res[i]);
+  }
+   }
+
+   const int first = info->numOutputs - (info->io.genUserClip + 3) / 4;
+
+   for (i = 0; i < info->io.genUserClip; ++i) {
+  n = i / 4 + first;
+  c = i % 4;
+  Symbol *sym =
+ mkSymbol(FILE_SHADER_OUTPUT, 0, TYPE_F32, info->out[n].slot[c] * 4);
+  mkStore(OP_EXPORT, TYPE_F32, sym, NULL, res[i]);
+   }
+}
+
+} // nv50_ir
diff --git 
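
A note on handleUserClipPlanes() above: each clip distance is the 4D dot
product of the clip vertex with one user clip plane, built as one MUL
followed by three MADs, with the loops ordered component-major. A CPU sketch
of the same accumulation follows. DemoVec4 and demoClipDistances are
hypothetical names, and the planes are passed in directly instead of being
loaded from the aux constant buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::array<float, 4> DemoVec4;

// One distance per plane: dot(clipVtx, planes[i]).
static std::vector<float>
demoClipDistances(const DemoVec4 &clipVtx, const std::vector<DemoVec4> &planes)
{
   assert(planes.size() <= 8); // the patch keeps at most 8 results
   std::vector<float> res(planes.size(), 0.0f);

   for (int c = 0; c < 4; ++c) {
      for (std::size_t i = 0; i < planes.size(); ++i) {
         if (c == 0)
            res[i] = clipVtx[c] * planes[i][c];          // like OP_MUL
         else
            res[i] = clipVtx[c] * planes[i][c] + res[i]; // like OP_MAD
      }
   }
   return res;
}
```

Iterating over components in the outer loop means each clip vertex
component is combined with all planes before moving on to the next one.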

[Mesa-dev] [PATCH v3 06/30] nvir/nir: run some passes to make the conversion easier

2018-01-07 Thread Karol Herbst
v2: add constant_folding

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 40 ++
 1 file changed, 40 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 6bccd14bce..f67d24cc4b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -28,6 +28,12 @@
 #include "codegen/nv50_ir_from_common.h"
 #include "codegen/nv50_ir_util.h"
 
+static int
+type_size(const struct glsl_type *type)
+{
+   return glsl_count_attribute_slots(type, false);
+}
+
 namespace {
 
 using namespace nv50_ir;
@@ -49,6 +55,40 @@ Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info)
 bool
 Converter::run()
 {
+   bool progress;
+
+   if (prog->dbgFlags & NV50_IR_DEBUG_BASIC)
+  nir_print_shader(nir, stderr);
+
+   // converts intrinsic load_var to intrinsic load_uniform
+   NIR_PASS_V(nir, nir_lower_io, nir_var_all, type_size, (nir_lower_io_options)0);
+
+   NIR_PASS_V(nir, nir_lower_regs_to_ssa);
+   NIR_PASS_V(nir, nir_lower_load_const_to_scalar);
+
+   do {
+  progress = false;
+  // we need this to_ssa otherwise the later opts are less effective
+  NIR_PASS_V(nir, nir_lower_vars_to_ssa);
+  NIR_PASS(progress, nir, nir_lower_alu_to_scalar);
+  NIR_PASS(progress, nir, nir_lower_phis_to_scalar);
+  // some ops depend on having constant as sources, but those can also
+  // point to expressions made from constants like 0 + 1
+  NIR_PASS(progress, nir, nir_opt_constant_folding);
+  NIR_PASS(progress, nir, nir_copy_prop);
+  NIR_PASS(progress, nir, nir_opt_dce);
+  NIR_PASS(progress, nir, nir_opt_dead_cf);
+   } while (progress);
+
+   NIR_PASS_V(nir, nir_remove_dead_variables, nir_var_local);
+   NIR_PASS_V(nir, nir_convert_from_ssa, true);
+
+   /* Garbage collect dead instructions */
+   nir_sweep(nir);
+
+   if (prog->dbgFlags & NV50_IR_DEBUG_BASIC)
+  nir_print_shader(nir, stderr);
+
return false;
 }
 
-- 
2.14.3
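
A note on the do/while loop in run(): it is the usual "repeat the
optimization passes until nothing changes" pattern, where each pass reports
whether it made progress. A toy model of that loop follows. The "IR" is
just a vector of ints and both passes are invented for illustration; only
the loop shape corresponds to the patch.

```cpp
#include <cassert>
#include <functional>
#include <vector>

typedef std::function<bool(std::vector<int> &)> DemoPass;

// Run all passes repeatedly until a full round reports no progress.
// Returns the number of rounds that were executed.
static unsigned
demoRunToFixedPoint(std::vector<int> &ir, const std::vector<DemoPass> &passes)
{
   unsigned rounds = 0;
   bool progress;
   do {
      progress = false;
      for (const DemoPass &pass : passes)
         progress |= pass(ir);
      ++rounds;
      assert(rounds < 1000); // guard against passes that never converge
   } while (progress);
   return rounds;
}

// Toy "folding" pass: merge the last two entries into their sum.
static bool
demoFoldLastPair(std::vector<int> &ir)
{
   if (ir.size() < 2)
      return false;
   int last = ir.back();
   ir.pop_back();
   ir.back() += last;
   return true;
}

// Toy "dead code" pass: drop a trailing zero.
static bool
demoDropTrailingZero(std::vector<int> &ir)
{
   if (ir.empty() || ir.back() != 0)
      return false;
   ir.pop_back();
   return true;
}
```

The loop always runs one extra round in which no pass changes anything,
which is what terminates it.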



[Mesa-dev] [PATCH v3 05/30] nvc0/debug: add env var to make nir default

2018-01-07 Thread Karol Herbst
v2: allow for non debug builds as well
v3: move reading of the env var to a more global place
disable tg4 with multiple offsets with nir
disable caps for 64 bit types

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/nouveau_screen.c   |  4 
 src/gallium/drivers/nouveau/nouveau_screen.h   |  2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 16 ++--
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_screen.c b/src/gallium/drivers/nouveau/nouveau_screen.c
index c144b39b2d..6c52f9e40c 100644
--- a/src/gallium/drivers/nouveau/nouveau_screen.c
+++ b/src/gallium/drivers/nouveau/nouveau_screen.c
@@ -175,6 +175,7 @@ nouveau_screen_init(struct nouveau_screen *screen, struct nouveau_device *dev)
void *data;
union nouveau_bo_config mm_config;
 
+   char *use_nir = getenv("NV50_PROG_USE_NIR");
char *nv_dbg = getenv("NOUVEAU_MESA_DEBUG");
if (nv_dbg)
   nouveau_mesa_debug = atoi(nv_dbg);
@@ -261,6 +262,9 @@ nouveau_screen_init(struct nouveau_screen *screen, struct nouveau_device *dev)
NOUVEAU_BO_GART | NOUVEAU_BO_MAP,
&mm_config);
screen->mm_VRAM = nouveau_mm_create(dev, NOUVEAU_BO_VRAM, &mm_config);
+
+   screen->prefer_nir = use_nir && strtol(use_nir, NULL, 0) == 1;
+
return 0;
 }
 
diff --git a/src/gallium/drivers/nouveau/nouveau_screen.h b/src/gallium/drivers/nouveau/nouveau_screen.h
index e4fbae99ca..1229b66b26 100644
--- a/src/gallium/drivers/nouveau/nouveau_screen.h
+++ b/src/gallium/drivers/nouveau/nouveau_screen.h
@@ -62,6 +62,8 @@ struct nouveau_screen {
 
struct disk_cache *disk_shader_cache;
 
+   bool prefer_nir;
+
 #ifdef NOUVEAU_ENABLE_DRIVER_STATISTICS
union {
   uint64_t v[29];
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index e88171bd61..22d40c7a02 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -112,7 +112,8 @@ static int
 nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
 {
const uint16_t class_3d = nouveau_screen(pscreen)->class_3d;
-   struct nouveau_device *dev = nouveau_screen(pscreen)->device;
+   const struct nouveau_screen *screen = nouveau_screen(pscreen);
+   struct nouveau_device *dev = screen->device;
 
switch (param) {
/* non-boolean caps */
@@ -219,7 +220,6 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_USER_VERTEX_BUFFERS:
case PIPE_CAP_TEXTURE_QUERY_LOD:
case PIPE_CAP_SAMPLE_SHADING:
-   case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
case PIPE_CAP_TEXTURE_GATHER_SM5:
case PIPE_CAP_TGSI_FS_FINE_DERIVATIVE:
case PIPE_CAP_CONDITIONAL_RENDER_INVERTED:
@@ -251,14 +251,17 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
case PIPE_CAP_TGSI_MUL_ZERO_WINS:
-   case PIPE_CAP_DOUBLES:
-   case PIPE_CAP_INT64:
case PIPE_CAP_TGSI_TEX_TXF_LZ:
case PIPE_CAP_TGSI_CLOCK:
case PIPE_CAP_COMPUTE:
case PIPE_CAP_CAN_BIND_CONST_BUFFER_AS_VERTEX:
case PIPE_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTION:
   return 1;
+   case PIPE_CAP_DOUBLES:
+   case PIPE_CAP_INT64:
+   case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
+  /* TODO: nir doesn't support tg4 with multiple offsets */
+  return screen->prefer_nir ? 0 : 1;
case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
   return nouveau_screen(pscreen)->vram_domain & NOUVEAU_BO_VRAM ? 1 : 0;
case PIPE_CAP_TGSI_FS_FBFETCH:
@@ -340,7 +343,8 @@ nvc0_screen_get_shader_param(struct pipe_screen *pscreen,
  enum pipe_shader_type shader,
  enum pipe_shader_cap param)
 {
-   const uint16_t class_3d = nouveau_screen(pscreen)->class_3d;
+   const struct nouveau_screen *screen = nouveau_screen(pscreen);
+   const uint16_t class_3d = screen->class_3d;
 
switch (shader) {
case PIPE_SHADER_VERTEX:
@@ -356,7 +360,7 @@ nvc0_screen_get_shader_param(struct pipe_screen *pscreen,
 
switch (param) {
case PIPE_SHADER_CAP_PREFERRED_IR:
-  return PIPE_SHADER_IR_TGSI;
+  return screen->prefer_nir ? PIPE_SHADER_IR_NIR : PIPE_SHADER_IR_TGSI;
case PIPE_SHADER_CAP_SUPPORTED_IRS:
   return 1 << PIPE_SHADER_IR_TGSI |
  1 << PIPE_SHADER_IR_NIR;
-- 
2.14.3
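
For readers following the env var handling in this patch, here is a minimal
sketch of the check `use_nir && strtol(use_nir, NULL, 0) == 1`. The helper
names below are hypothetical and not part of the patch; only the variable name
NV50_PROG_USE_NIR and the strtol expression come from the diff above.

```cpp
#include <cstdlib>

// Hypothetical helper (not in the patch): same test as the expression
// assigned to screen->prefer_nir in nouveau_screen_init().
bool env_value_enables_nir(const char *value)
{
   // An unset variable (NULL) keeps the TGSI default.
   if (!value)
      return false;
   // Base 0 lets strtol accept decimal ("1"), octal ("01") and hex ("0x1").
   // Text that does not parse as a number yields 0, so it does not enable NIR.
   return std::strtol(value, nullptr, 0) == 1;
}

// Hypothetical wrapper: reads the variable named in the patch.
bool prefer_nir_from_env()
{
   return env_value_enables_nir(std::getenv("NV50_PROG_USE_NIR"));
}
```

Note that only the exact value 1 enables NIR under this check; values such as
2 or "yes" leave TGSI as the preferred IR.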



[Mesa-dev] [PATCH v3 08/30] nvir/nir: add nir type helper functions

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 116 +
 1 file changed, 116 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index c0e733a67d..b0de3b7d64 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -46,6 +46,7 @@ class Converter : public ConverterCommon
 public:
typedef std::vector LValues;
typedef decltype(nir_ssa_def().index) NirSSADefIdx;
+   typedef decltype(nir_ssa_def().bit_size) NirSSADefBitSize;
typedef std::unordered_map NirDefMap;
 
Converter(Program *, nir_shader *, nv50_ir_prog_info *);
@@ -63,6 +64,17 @@ public:
uint32_t getIndirect(nir_src *, uint8_t, Value**);
 
bool run();
+
+   bool isFloatType(nir_alu_type);
+   bool isSignedType(nir_alu_type);
+   bool isResultFloat(nir_op);
+   bool isResultSigned(nir_op);
+   DataType getDType(nir_alu_instr*);
+   DataType getDType(nir_intrinsic_instr*);
+   DataType getDType(nir_op, NirSSADefBitSize);
+   std::vector<DataType> getSTypes(nir_alu_instr*);
+   DataType getSType(nir_src&, bool isFloat, bool isSigned);
+
 private:
nir_shader *nir;
 
@@ -74,6 +86,110 @@ Converter::Converter(Program *prog, nir_shader *nir, 
nv50_ir_prog_info *info)
: ConverterCommon(prog, info),
  nir(nir) {}
 
+bool
+Converter::isFloatType(nir_alu_type type)
+{
+   return !!(nir_alu_type_get_base_type(type) == nir_type_float);
+}
+
+bool
+Converter::isSignedType(nir_alu_type type)
+{
+   return !!(nir_alu_type_get_base_type(type) == nir_type_int);
+}
+
+bool
+Converter::isResultFloat(nir_op op)
+{
+   const nir_op_info &info = nir_op_infos[op];
+   if (info.output_type != nir_type_invalid)
+  return isFloatType(info.output_type);
+
+   switch (op) {
+   default:
+  ERROR("isResultFloat not implemented for %s\n", nir_op_infos[op].name);
+  assert(false);
+  return true;
+   }
+}
+
+bool
+Converter::isResultSigned(nir_op op)
+{
+   const nir_op_info &info = nir_op_infos[op];
+   if (info.output_type != nir_type_invalid)
+  return isSignedType(info.output_type);
+
+   switch (op) {
+   default:
+  ERROR("isResultSigned not implemented for %s\n", nir_op_infos[op].name);
+  assert(false);
+  return true;
+   }
+}
+
+DataType
+Converter::getDType(nir_alu_instr *insn)
+{
+   if (insn->dest.dest.is_ssa)
+  return getDType(insn->op, insn->dest.dest.ssa.bit_size);
+   else
+  return getDType(insn->op, insn->dest.dest.reg.reg->bit_size);
+}
+
+DataType
+Converter::getDType(nir_intrinsic_instr *insn)
+{
+   if (insn->dest.is_ssa)
+  return typeOfSize(insn->dest.ssa.bit_size / 8, false, false);
+   else
+  return typeOfSize(insn->dest.reg.reg->bit_size / 8, false, false);
+}
+
+DataType
+Converter::getDType(nir_op op, Converter::NirSSADefBitSize bitSize)
+{
+   DataType ty = typeOfSize(bitSize / 8, isResultFloat(op), 
isResultSigned(op));
+   if (ty == TYPE_NONE) {
+  ERROR("couldn't get Type for op %s with bitSize %u\n", 
nir_op_infos[op].name, bitSize);
+  assert(false);
+   }
+   return ty;
+}
+
+std::vector<DataType>
+Converter::getSTypes(nir_alu_instr *insn)
+{
+   const nir_op_info &info = nir_op_infos[insn->op];
+   std::vector<DataType> res(info.num_inputs);
+
+   for (auto i = 0u; i < info.num_inputs; ++i) {
+  if (info.input_types[i] != nir_type_invalid)
+ res[i] = getSType(insn->src[i].src, isFloatType(info.input_types[i]), 
isSignedType(info.input_types[i]));
+  else switch (insn->op) {
+ default:
+ERROR("getSType not implemented for %s idx %u\n", info.name, i);
+assert(false);
+res[i] = TYPE_NONE;
+break;
+  }
+   }
+
+   return res;
+}
+
+DataType
+Converter::getSType(nir_src &src, bool isFloat, bool isSigned)
+{
+   NirSSADefBitSize bitSize;
+   if (src.is_ssa)
+  bitSize = src.ssa->bit_size;
+   else
+  bitSize = src.reg.reg->bit_size;
+
+   return typeOfSize(bitSize / 8, isFloat, isSigned);
+}
+
 Converter::LValues&
 Converter::convert(nir_dest *dest)
 {
-- 
2.14.3
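
The helpers in this patch all reduce to one question: given a bit size plus
"is float" and "is signed" flags, which data type results? The sketch below
illustrates that mapping in isolation. SketchType and sketchTypeOfBits are
hypothetical stand-ins, not the nv50_ir DataType enum or typeOfSize(), and the
choice to return None for unsupported combinations is an assumption made here
so that the caller's error path (as in getDType above) has something to test.

```cpp
// Hypothetical stand-in for nv50_ir::DataType, reduced to a few cases.
enum class SketchType {
   None, U8, S8, U16, S16, F16, U32, S32, F32, U64, S64, F64
};

// Assumed mapping: the bit size is converted to bytes first, as the patch
// does with `bit_size / 8`; the float flag takes precedence over the sign.
SketchType sketchTypeOfBits(unsigned bitSize, bool isFloat, bool isSigned)
{
   switch (bitSize / 8) {
   case 1:
      if (isFloat)
         return SketchType::None; // no 8-bit float in this sketch
      return isSigned ? SketchType::S8 : SketchType::U8;
   case 2:
      return isFloat ? SketchType::F16
                     : (isSigned ? SketchType::S16 : SketchType::U16);
   case 4:
      return isFloat ? SketchType::F32
                     : (isSigned ? SketchType::S32 : SketchType::U32);
   case 8:
      return isFloat ? SketchType::F64
                     : (isSigned ? SketchType::S64 : SketchType::U64);
   default:
      return SketchType::None; // unsupported size, caller reports an error
   }
}
```

In the patch the two flags never conflict, since isFloatType() and
isSignedType() compare the same base type against nir_type_float and
nir_type_int respectively.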



[Mesa-dev] [PATCH v3 04/30] nvc0: add support for NIR

2018-01-07 Thread Karol Herbst
not all those nir options are actually required, it just made the work a
little easier.

v2: fix asserts
parse compute shaders
don't lower bitfield_insert
v3: fix memory leak

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/Makefile.sources   |  1 +
 src/gallium/drivers/nouveau/codegen/nv50_ir.cpp|  3 +
 src/gallium/drivers/nouveau/codegen/nv50_ir.h  |  1 +
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 69 ++
 src/gallium/drivers/nouveau/meson.build| 10 ++--
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c| 18 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 41 -
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c  | 27 -
 8 files changed, 161 insertions(+), 9 deletions(-)
 create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp

diff --git a/src/gallium/drivers/nouveau/Makefile.sources 
b/src/gallium/drivers/nouveau/Makefile.sources
index fee5e59522..e31413a2f3 100644
--- a/src/gallium/drivers/nouveau/Makefile.sources
+++ b/src/gallium/drivers/nouveau/Makefile.sources
@@ -117,6 +117,7 @@ NV50_CODEGEN_SOURCES := \
codegen/nv50_ir_emit_nv50.cpp \
codegen/nv50_ir_from_common.cpp \
codegen/nv50_ir_from_common.h \
+   codegen/nv50_ir_from_nir.cpp \
codegen/nv50_ir_from_tgsi.cpp \
codegen/nv50_ir_graph.cpp \
codegen/nv50_ir_graph.h \
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
index 6f12df70a1..b95ba8e4e9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
@@ -1231,6 +1231,9 @@ nv50_ir_generate_code(struct nv50_ir_prog_info *info)
prog->optLevel = info->optLevel;
 
switch (info->bin.sourceRep) {
+   case PIPE_SHADER_IR_NIR:
+  ret = prog->makeFromNIR(info) ? 0 : -2;
+  break;
case PIPE_SHADER_IR_TGSI:
   ret = prog->makeFromTGSI(info) ? 0 : -2;
   break;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
index f4f3c70888..e5b4592a61 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
@@ -1255,6 +1255,7 @@ public:
inline void del(Function *fn, int& id) { allFuncs.remove(id); }
inline void add(Value *rval, int& id) { allRValues.insert(rval, id); }
 
+   bool makeFromNIR(struct nv50_ir_prog_info *);
bool makeFromTGSI(struct nv50_ir_prog_info *);
bool convertToSSA();
bool optimizeSSA(int level);
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
new file mode 100644
index 00..6bccd14bce
--- /dev/null
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -0,0 +1,69 @@
+/*
+ * Copyright 2017 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Karol Herbst 
+ */
+
+#include "compiler/nir/nir.h"
+
+#include "codegen/nv50_ir.h"
+#include "codegen/nv50_ir_from_common.h"
+#include "codegen/nv50_ir_util.h"
+
+namespace {
+
+using namespace nv50_ir;
+
+class Converter : public ConverterCommon
+{
+public:
+   Converter(Program *, nir_shader *, nv50_ir_prog_info *);
+
+   bool run();
+private:
+   nir_shader *nir;
+};
+
+Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info)
+   : ConverterCommon(prog, info),
+ nir(nir) {}
+
+bool
+Converter::run()
+{
+   return false;
+}
+
+} // unnamed namespace
+
+namespace nv50_ir {
+
+bool
+Program::makeFromNIR(struct nv50_ir_prog_info *info)
+{
+   nir_shader *nir = (nir_shader*)info->bin.source;
+   Converter converter(this, nir, info);
+   bool result = converter.run();
+   tlsSize = info->bin.tlsSpace;
+   return result;
+}
+
+} // namespace nv50_ir
diff --git a/src/gallium/drivers/nouveau/meson.build 

[Mesa-dev] [PATCH v3 01/30] nir: fix st_nir_assign_var_locations for patch variables

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
Reviewed-by: Kenneth Graunke 
Reviewed-by: Timothy Arceri 
---
 src/mesa/state_tracker/st_glsl_to_nir.cpp | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp 
b/src/mesa/state_tracker/st_glsl_to_nir.cpp
index 5683df..1c5de3d5de 100644
--- a/src/mesa/state_tracker/st_glsl_to_nir.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp
@@ -139,8 +139,12 @@ st_nir_assign_var_locations(struct exec_list *var_list, 
unsigned *size,
   }
 
   bool processed = false;
-  if (var->data.patch) {
- unsigned patch_loc = var->data.location - VARYING_SLOT_VAR0;
+  if (var->data.patch &&
+  var->data.location != VARYING_SLOT_TESS_LEVEL_INNER &&
+  var->data.location != VARYING_SLOT_TESS_LEVEL_OUTER &&
+  var->data.location != VARYING_SLOT_BOUNDING_BOX0 &&
+  var->data.location != VARYING_SLOT_BOUNDING_BOX1) {
+ unsigned patch_loc = var->data.location - VARYING_SLOT_PATCH0;
  if (processed_patch_locs & (1 << patch_loc))
 processed = true;
 
-- 
2.14.3
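
The hunk above tests a bit in processed_patch_locs after turning the slot into
a zero-based index. A minimal sketch of that bookkeeping follows.
kSketchPatchBase is a hypothetical placeholder for VARYING_SLOT_PATCH0 (the
real value comes from Mesa's headers), and the step that sets the bit is an
assumption, since only the test is visible in the quoted context.

```cpp
#include <cstdint>

// Hypothetical placeholder for VARYING_SLOT_PATCH0; not the real value.
constexpr unsigned kSketchPatchBase = 64;

// Returns true if this patch location was seen before. Otherwise records it
// in the mask and returns false. Assumes the location lies within
// [kSketchPatchBase, kSketchPatchBase + 32) so the shift stays in range.
bool patchLocAlreadyProcessed(uint32_t &processedMask, unsigned location)
{
   const unsigned patchLoc = location - kSketchPatchBase; // zero-based index
   const uint32_t bit = UINT32_C(1) << patchLoc;

   if (processedMask & bit)
      return true;

   processedMask |= bit; // assumed marking step, not shown in the hunk
   return false;
}
```

Subtracting the base of the patch range, rather than the base of another
range, is what keeps the index small enough for a 32-bit mask in this sketch.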



[Mesa-dev] [PATCH v3 09/30] nvir/nir: run assignSlots

2018-01-07 Thread Karol Herbst
v2: add support for geometry shaders
set idx
add some missing mappings
fix for 64bit inputs/outputs
fix up some FP color output index messup
parse centroid flag
v3: fix arrays in outputs as well
fix input/output size calculation for tessellation shaders

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 537 +
 1 file changed, 537 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index b0de3b7d64..9f581cf3c9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -63,6 +63,12 @@ public:
Value* getSrc(nir_ssa_def *, uint8_t);
uint32_t getIndirect(nir_src *, uint8_t, Value**);
 
+   void setInterpolate(nv50_ir_varying *,
+   decltype(nir_variable().data.interpolation),
+   bool centroid,
+   unsigned semantics);
+   bool assignSlots();
+
bool run();
 
bool isFloatType(nir_alu_type);
@@ -293,6 +299,532 @@ Converter::getIndirect(nir_src *src, uint8_t idx, Value 
**indirect)
return 0;
 }
 
+static void
+vert_attrib_to_tgsi_semantic(unsigned slot, unsigned *name, unsigned *index)
+{
+   if (slot >= VERT_ATTRIB_GENERIC0) {
+  *name = TGSI_SEMANTIC_GENERIC;
+  *index = slot - VERT_ATTRIB_GENERIC0;
+  return;
+   }
+
+   if (slot == VERT_ATTRIB_POINT_SIZE) {
+  ERROR("unknown vert attrib slot %u\n", slot);
+  assert(false);
+  return;
+   }
+
+   if (slot >= VERT_ATTRIB_TEX0) {
+  *name = TGSI_SEMANTIC_TEXCOORD;
+  *index = slot - VERT_ATTRIB_TEX0;
+  return;
+   }
+
+   switch (slot) {
+   case VERT_ATTRIB_COLOR0:
+  *name = TGSI_SEMANTIC_COLOR;
+  *index = 0;
+  break;
+   case VERT_ATTRIB_COLOR1:
+  *name = TGSI_SEMANTIC_COLOR;
+  *index = 1;
+  break;
+   case VERT_ATTRIB_EDGEFLAG:
+  *name = TGSI_SEMANTIC_EDGEFLAG;
+  *index = 0;
+  break;
+   case VERT_ATTRIB_FOG:
+  *name = TGSI_SEMANTIC_FOG;
+  *index = 0;
+  break;
+   case VERT_ATTRIB_NORMAL:
+  *name = TGSI_SEMANTIC_NORMAL;
+  *index = 0;
+  break;
+   case VERT_ATTRIB_POS:
+  *name = TGSI_SEMANTIC_POSITION;
+  *index = 0;
+  break;
+   default:
+  ERROR("unknown vert attrib slot %u\n", slot);
+  assert(false);
+   }
+}
+
+static void
+varying_slot_to_tgsi_semantic(unsigned slot, unsigned *name, unsigned *index)
+{
+   if (slot >= VARYING_SLOT_PATCH0) {
+  *name = TGSI_SEMANTIC_PATCH;
+  *index = slot - VARYING_SLOT_PATCH0;
+  return;
+   }
+
+   if (slot >= VARYING_SLOT_VAR0) {
+  *name = TGSI_SEMANTIC_GENERIC;
+  *index = slot - VARYING_SLOT_VAR0;
+  return;
+   }
+
+   if (slot >= VARYING_SLOT_TEX0 && slot <= VARYING_SLOT_TEX7) {
+  *name = TGSI_SEMANTIC_TEXCOORD;
+  *index = slot - VARYING_SLOT_TEX0;
+  return;
+   }
+
+   switch (slot) {
+   case VARYING_SLOT_BFC0:
+  *name = TGSI_SEMANTIC_BCOLOR;
+  *index = 0;
+  break;
+   case VARYING_SLOT_BFC1:
+  *name = TGSI_SEMANTIC_BCOLOR;
+  *index = 1;
+  break;
+   case VARYING_SLOT_CLIP_DIST0:
+  *name = TGSI_SEMANTIC_CLIPDIST;
+  *index = 0;
+  break;
+   case VARYING_SLOT_CLIP_DIST1:
+  *name = TGSI_SEMANTIC_CLIPDIST;
+  *index = 1;
+  break;
+   case VARYING_SLOT_CLIP_VERTEX:
+  *name = TGSI_SEMANTIC_CLIPVERTEX;
+  *index = 0;
+  break;
+   case VARYING_SLOT_COL0:
+  *name = TGSI_SEMANTIC_COLOR;
+  *index = 0;
+  break;
+   case VARYING_SLOT_COL1:
+  *name = TGSI_SEMANTIC_COLOR;
+  *index = 1;
+  break;
+   case VARYING_SLOT_EDGE:
+  *name = TGSI_SEMANTIC_EDGEFLAG;
+  *index = 0;
+  break;
+   case VARYING_SLOT_FACE:
+  *name = TGSI_SEMANTIC_FACE;
+  *index = 0;
+  break;
+   case VARYING_SLOT_FOGC:
+  *name = TGSI_SEMANTIC_FOG;
+  *index = 0;
+  break;
+   case VARYING_SLOT_LAYER:
+  *name = TGSI_SEMANTIC_LAYER;
+  *index = 0;
+  break;
+   case VARYING_SLOT_PNTC:
+  *name = TGSI_SEMANTIC_PCOORD;
+  *index = 0;
+  break;
+   case VARYING_SLOT_POS:
+  *name = TGSI_SEMANTIC_POSITION;
+  *index = 0;
+  break;
+   case VARYING_SLOT_PRIMITIVE_ID:
+  *name = TGSI_SEMANTIC_PRIMID;
+  *index = 0;
+  break;
+   case VARYING_SLOT_PSIZ:
+  *name = TGSI_SEMANTIC_PSIZE;
+  *index = 0;
+  break;
+   case VARYING_SLOT_TESS_LEVEL_INNER:
+  *name = TGSI_SEMANTIC_TESSINNER;
+  *index = 0;
+  break;
+   case VARYING_SLOT_TESS_LEVEL_OUTER:
+  *name = TGSI_SEMANTIC_TESSOUTER;
+  *index = 0;
+  break;
+   case VARYING_SLOT_VIEWPORT:
+  *name = TGSI_SEMANTIC_VIEWPORT_INDEX;
+  *index = 0;
+  break;
+   default:
+  ERROR("unknown varying slot %u\n", slot);
+  assert(false);
+   }
+}
+
+static void

[Mesa-dev] [PATCH v3 11/30] nvir/nir: implement CFG handling

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 255 -
 1 file changed, 253 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 44b208d3f8..0d18dd030a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -48,10 +48,12 @@ public:
typedef decltype(nir_ssa_def().index) NirSSADefIdx;
typedef decltype(nir_ssa_def().bit_size) NirSSADefBitSize;
typedef std::unordered_map NirDefMap;
+   typedef std::unordered_map 
NirBlockMap;
 
Converter(Program *, nir_shader *, nv50_ir_prog_info *);
 
LValues& convert(nir_alu_dest *);
+   BasicBlock* convert(nir_block *);
LValues& convert(nir_dest *);
LValues& convert(nir_register *);
LValues& convert(nir_ssa_def *);
@@ -70,6 +72,14 @@ public:
bool assignSlots();
bool parseNIR();
 
+   bool visit(nir_block *);
+   bool visit(nir_cf_node *);
+   bool visit(nir_function *);
+   bool visit(nir_if *);
+   bool visit(nir_instr *);
+   bool visit(nir_jump_instr *);
+   bool visit(nir_loop *);
+
bool run();
 
bool isFloatType(nir_alu_type);
@@ -87,11 +97,34 @@ private:
 
NirDefMap ssaDefs;
NirDefMap regDefs;
+   NirBlockMap blocks;
+   unsigned int curLoopDepth;
+
+   BasicBlock *exit;
+
+   union {
+  struct {
+ Value *position;
+  } fp;
+   };
 };
 
 Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info)
: ConverterCommon(prog, info),
- nir(nir) {}
+ nir(nir),
+ curLoopDepth(0) {}
+
+BasicBlock *
+Converter::convert(nir_block *block)
+{
+   NirBlockMap::iterator it = blocks.find(block->index);
+   if (it != blocks.end())
+  return (*it).second;
+
+   BasicBlock *bb = new BasicBlock(func);
+   blocks[block->index] = bb;
+   return bb;
+}
 
 bool
 Converter::isFloatType(nir_alu_type type)
@@ -878,6 +911,219 @@ Converter::parseNIR()
return true;
 }
 
+bool
+Converter::visit(nir_function *function)
+{
+   // we only support emitting the main function for now
+   assert(!strcmp(function->name, "main"));
+   assert(function->impl);
+
+   // usually the blocks will set everything up, but main is special
+   BasicBlock *entry = new BasicBlock(prog->main);
+   exit = new BasicBlock(prog->main);
+   blocks[nir_start_block(function->impl)->index] = entry;
+   prog->main->setEntry(entry);
+   prog->main->setExit(exit);
+
+   setPosition(entry, true);
+
+   switch (prog->getType()) {
+   case Program::TYPE_TESSELLATION_CONTROL:
+  outBase = mkOp2v(
+ OP_SUB, TYPE_U32, getSSA(),
+ mkOp1v(OP_RDSV, TYPE_U32, getSSA(), mkSysVal(SV_LANEID, 0)),
+ mkOp1v(OP_RDSV, TYPE_U32, getSSA(), mkSysVal(SV_INVOCATION_ID, 0)));
+  break;
+   case Program::TYPE_FRAGMENT: {
+  Symbol *sv = mkSysVal(SV_POSITION, 3);
+  fragCoord[3] = mkOp1v(OP_RDSV, TYPE_F32, getSSA(), sv);
+  fp.position = mkOp1v(OP_RCP, TYPE_F32, fragCoord[3], fragCoord[3]);
+  break;
+   }
+   default:
+  break;
+   }
+
+   nir_index_ssa_defs(function->impl);
+   foreach_list_typed(nir_cf_node, node, node, &function->impl->body) {
+  if (!visit(node))
+ return false;
+   }
+
+   bb->cfg.attach(&exit->cfg, Graph::Edge::TREE);
+   setPosition(exit, true);
+
+   // TODO: for non main function this needs to be a OP_RETURN
+   mkOp(OP_EXIT, TYPE_NONE, NULL)->terminator = 1;
+   return true;
+}
+
+bool
+Converter::visit(nir_cf_node *node)
+{
+   switch (node->type) {
+   case nir_cf_node_block:
+  if (!visit(nir_cf_node_as_block(node)))
+ return false;
+  break;
+   case nir_cf_node_if:
+  if (!visit(nir_cf_node_as_if(node)))
+ return false;
+  break;
+   case nir_cf_node_loop:
+  if (!visit(nir_cf_node_as_loop(node)))
+ return false;
+  break;
+   default:
+  ERROR("unknown nir_cf_node type %u\n", node->type);
+  return false;
+   }
+   return true;
+}
+
+bool
+Converter::visit(nir_block *block)
+{
+   BasicBlock *bb = convert(block);
+
+   setPosition(bb, true);
+   nir_foreach_instr(insn, block) {
+  if (!visit(insn))
+ return false;
+   }
+   return true;
+}
+
+bool
+Converter::visit(nir_if *nif)
+{
+   DataType sType = getSType(nif->condition, false, false);
+   Value *src = getSrc(&nif->condition, 0);
+
+   nir_block *lastThen = nir_if_last_then_block(nif);
+   nir_block *lastElse = nir_if_last_else_block(nif);
+
+   assert(!lastThen->successors[1]);
+   assert(!lastElse->successors[1]);
+
+   BasicBlock *ifBB = convert(nir_if_first_then_block(nif));
+   BasicBlock *elseBB = convert(nir_if_first_else_block(nif));
+
+   bb->cfg.attach(&ifBB->cfg, Graph::Edge::TREE);
+   bb->cfg.attach(&elseBB->cfg, Graph::Edge::TREE);
+
+   // we only insert joinats, if both nodes end up at the end of the if again.
+   

[Mesa-dev] [PATCH v3 12/30] nvir/nir: implement nir_load_const_instr

2018-01-07 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 20 
 1 file changed, 20 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 0d18dd030a..b130fc696b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -78,6 +78,7 @@ public:
bool visit(nir_if *);
bool visit(nir_instr *);
bool visit(nir_jump_instr *);
+   bool visit(nir_load_const_instr*);
bool visit(nir_loop *);
 
bool run();
@@ -1086,6 +1087,10 @@ bool
 Converter::visit(nir_instr *insn)
 {
switch (insn->type) {
+   case nir_instr_type_load_const:
+  if (!visit(nir_instr_as_load_const(insn)))
+ return false;
+  break;
case nir_instr_type_jump:
   if (!visit(nir_instr_as_jump(insn)))
  return false;
@@ -1097,6 +1102,21 @@ Converter::visit(nir_instr *insn)
return true;
 }
 
+bool
+Converter::visit(nir_load_const_instr *insn)
+{
+   assert(insn->def.bit_size <= 64);
+
+   LValues &newDefs = convert(&insn->def);
+   for (int i = 0; i < insn->def.num_components; i++) {
+  if (insn->def.bit_size > 32)
+ loadImm(newDefs[i], insn->value.u64[i]);
+  else
+ loadImm(newDefs[i], insn->value.u32[i]);
+   }
+   return true;
+}
+
 bool
 Converter::visit(nir_jump_instr *insn)
 {
-- 
2.14.3
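
The load_const visitor above picks the 64-bit or 32-bit view of the constant
depending on the bit size of the definition. A small self-contained sketch of
that selection follows. SketchConst and readComponents are hypothetical and
only mimic the u32/u64 members used in the diff; they are not the NIR types.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the constant storage: per-component arrays that
// share memory, with room for up to four components.
union SketchConst {
   uint32_t u32[4];
   uint64_t u64[4];
};

// Collects the components as 64-bit immediates. Sizes above 32 bits are read
// through the u64 view, everything else through the u32 view, matching the
// `bit_size > 32` test in the patch. The caller must pass numComponents <= 4
// and must have written the member that matches bitSize.
std::vector<uint64_t>
readComponents(const SketchConst &value, unsigned bitSize,
               unsigned numComponents)
{
   std::vector<uint64_t> out;
   out.reserve(numComponents);
   for (unsigned i = 0; i < numComponents; ++i) {
      if (bitSize > 32)
         out.push_back(value.u64[i]);
      else
         out.push_back(value.u32[i]);
   }
   return out;
}
```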



[Mesa-dev] [Bug 103999] 4x MSAA with RG32F shows garbage on triangle edges

2018-01-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103999

Bas Nieuwenhuizen  changed:

   What|Removed |Added

 Status|NEEDINFO|NEW

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 104529] Add OpenCL information to docs/features.txt

2018-01-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104529

Bug ID: 104529
   Summary: Add OpenCL information to docs/features.txt
   Product: Mesa
   Version: git
  Hardware: All
OS: All
Status: NEW
  Severity: minor
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: ved...@miletic.net
QA Contact: mesa-dev@lists.freedesktop.org

Inspired in part by bug 104478 and a recent discussion on IRC: it would be nice
if docs/features.txt would also contain information about OpenCL.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 104529] Add OpenCL information to docs/features.txt

2018-01-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104529

Vedran Miletić  changed:

   What|Removed |Added

   Assignee|mesa-dev@lists.freedesktop. |ved...@miletic.net
   |org |

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 00/15] RadeonSI 32-bit GPU pointers

2018-01-07 Thread Christian König

On 07.01.2018 at 17:42, Marek Olšák wrote:
> On Sun, Jan 7, 2018 at 9:50 AM, Christian König
>  wrote:
>> On 07.01.2018 at 01:48, Marek Olšák wrote:
>>> On Sat, Jan 6, 2018 at 5:51 PM, Christian König
>>>  wrote:
>>>> Hi Marek,
>>>>
>>>> actually I was on the verge to remove the 32bit VM support in libdrm
>>>> because
>>>> it clashes with HMM and SVM in general.
>>>>
>>>> Is it possible to set the upper 32bit of the 64bit address to some fixed
>>>> value instead?
>>>
>>> Yes, but not on radeon. radeon only has 8GB of virtual address space and
>>> 4GB on older kernels. I would have to change LLVM to set the high bits
>>> differently on amdgpu but keep the high bits 0 on radeon.
>>
>> That only matters on Vega10/Raven anyway.
>>
>> But in general being able to define what LLVM uses for the upper bits sounds
>> like a good idea to me, this way we can keep the handling in Mesa.
>>
>> Going to send updated libdrm patches which keeps the 32bit range usable on
>> Vega10/Raven as well.
>
> What is wrong with Vega10/Raven? The 32bit range works fine on Raven here.


Well there is nothing "wrong" with them, it's just that Vega10/Raven are
the first generation with recoverable page faults and we probably want
to use this for SVM.

This way you can basically use any CPU pointer on the GPU without
further overhead. Should be pretty neat for uploads and quite a few
other use cases in OpenGL, not to mention OpenCL and Vulkan.

The crux is that we need to disallow any manual VA allocation when this
is active, otherwise CPU and GPU address space handling could clash.

The address space on Vega10/Raven is divided in the same way as it is on
x86 CPUs. In other words you get a low address range from
0x0-0x8000 and a high address range from
0x8000-0x.

So the idea is that we free up the low range to be able to have a 1 to 1
address mapping with the userspace CPU address space and either
translate that using the ATC or HMM. And then use the high range for all
manual VA allocations through libdrm.

See the patch "[PATCH libdrm 3/4] amdgpu: use the high VA range if
possible v2" on the amd-gfx mailing list. This way the 32bit range
now starts at 0x8000 instead of 0x0. Please take a look
and/or review them.


Regards,
Christian.
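
As a rough illustration of the "fixed upper 32 bits" idea raised earlier in
this thread (a sketch only, not code from Mesa, LLVM or libdrm): a 32-bit GPU
pointer can be widened to a 64-bit address by placing a constant in the high
half. Both expandPointer and kSketchHighBits are hypothetical names, and the
constant is an arbitrary example value, not an address taken from the thread.

```cpp
#include <cstdint>

// Arbitrary example value for the fixed upper half; purely illustrative.
constexpr uint32_t kSketchHighBits = 0x1234;

// Builds a 64-bit address from a 32-bit pointer and a fixed upper half.
// With high == 0 this gives the plain zero-extended pointer, which is the
// behaviour described for radeon above.
uint64_t expandPointer(uint32_t low, uint32_t high)
{
   return (static_cast<uint64_t>(high) << 32) | low;
}
```

With such a scheme only the constant would need to differ between drivers,
while the shader-visible pointers stay 32 bits wide.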



> Marek




Re: [Mesa-dev] [PATCH 00/15] RadeonSI 32-bit GPU pointers

2018-01-07 Thread Marek Olšák
On Sun, Jan 7, 2018 at 9:50 AM, Christian König
 wrote:
> On 07.01.2018 at 01:48, Marek Olšák wrote:
>>
>> On Sat, Jan 6, 2018 at 5:51 PM, Christian König
>>  wrote:
>>>
>>> Hi Marek,
>>>
>>> actually I was on the verge to remove the 32bit VM support in libdrm
>>> because
>>> it clashes with HMM and SVM in general.
>>>
>>> Is it possible to set the upper 32bit of the 64bit address to some fixed
>>> value instead?
>>
>> Yes, but not on radeon. radeon only has 8GB of virtual address space and
>> 4GB on older kernels. I would have to change LLVM to set the high bits
>> differently on amdgpu but keep the high bits 0 on radeon.
>
>
> That only matters on Vega10/Raven anyway.
>
> But in general being able to define what LLVM uses for the upper bits sounds
> like a good idea to me, this way we can keep the handling in Mesa.
>
> Going to send updated libdrm patches which keeps the 32bit range usable on
> Vega10/Raven as well.

What is wrong with Vega10/Raven? The 32bit range works fine on Raven here.

Marek


Re: [Mesa-dev] [PATCH v3 3/4] meson: build clover

2018-01-07 Thread Jan Vesely
On Sat, 2018-01-06 at 15:48 -0800, Dylan Baker wrote:
> Quoting Jan Vesely (2018-01-06 15:18:54)
> > On Fri, 2018-01-05 at 15:26 -0800, Dylan Baker wrote:
> > > Quoting Jan Vesely (2018-01-05 14:16:41)
> > > > Hi,
> > > > 
> > > > 
> > > > sorry for the delay. I was mostly traveling during the holidays.
> > > 
> > > No worries, I was also away over the holidays and didn't look at this 
> > > until
> > > today.
> > > 
> > > > 
> > > > On Fri, 2017-12-15 at 10:54 -0800, Dylan Baker wrote:
> > > > > This has only been compile tested.
> > > > > 
> > > > > v2: - Have a single option for opencl (Eric E)
> > > > > - fix typo "tgis" -> "tgsi" (Curro)
> > > > > - Don't add "lib" to pipe loader libraries, which matches the
> > > > >   autotools behavior
> > > > > v3: - Remove trailing whitespace
> > > > > - Make PIPE_SEARCH_DIR an absolute path
> > > > > 
> > > > > cc: Curro Jerez 
> > > > > cc: Jan Vesely 
> > > > > cc: Aaron Watry 
> > > > > Signed-off-by: Dylan Baker 
> > > > > ---
> > > > >  include/meson.build   |  19 
> > > > >  meson.build   |  29 +-
> > > > >  meson_options.txt |   7 ++
> > > > >  src/gallium/auxiliary/pipe-loader/meson.build |   3 +-
> > > > >  src/gallium/meson.build   |  12 ++-
> > > > >  src/gallium/state_trackers/clover/meson.build | 122 
> > > > > ++
> > > > >  src/gallium/targets/opencl/meson.build|  73 +++
> > > > >  src/gallium/targets/pipe-loader/meson.build   |  77 
> > > > >  8 files changed, 336 insertions(+), 6 deletions(-)
> > > > >  create mode 100644 src/gallium/state_trackers/clover/meson.build
> > > > >  create mode 100644 src/gallium/targets/opencl/meson.build
> > > > >  create mode 100644 src/gallium/targets/pipe-loader/meson.build
> > > > > 
> > > > > diff --git a/include/meson.build b/include/meson.build
> > > > > index e4dae91cede..a2e7ce6580e 100644
> > > > > --- a/include/meson.build
> > > > > +++ b/include/meson.build
> > > > > @@ -78,3 +78,22 @@ if with_gallium_st_nine
> > > > >  subdir : 'd3dadapter',
> > > > >)
> > > > >  endif
> > > > > +
> > > > > +# Only install the headers if we are building a stand alone 
> > > > > implementation and
> > > > > +# not an ICD enabled implementation
> > > > > +if with_gallium_opencl and not with_opencl_icd
> > > > > +  install_headers(
> > > > > +'CL/cl.h',
> > > > > +'CL/cl.hpp',
> > > > > +'CL/cl_d3d10.h',
> > > > > +'CL/cl_d3d11.h',
> > > > > +'CL/cl_dx9_media_sharing.h',
> > > > > +'CL/cl_egl.h',
> > > > > +'CL/cl_ext.h',
> > > > > +'CL/cl_gl.h',
> > > > > +'CL/cl_gl_ext.h',
> > > > > +'CL/cl_platform.h',
> > > > > +'CL/opencl.h',
> > > > > +subdir: 'CL'
> > > > > +  )
> > > > > +endif
> > > > > diff --git a/meson.build b/meson.build
> > > > > index 842d441199e..74b2d5c49dc 100644
> > > > > --- a/meson.build
> > > > > +++ b/meson.build
> > > > > @@ -583,6 +583,22 @@ if with_gallium_st_nine
> > > > >endif
> > > > >  endif
> > > > >  
> > > > > +_opencl = get_option('gallium-opencl')
> > > > > > +if _opencl != 'disabled'
> > > > > +  if not with_gallium
> > > > > > +    error('OpenCL Clover implementation requires at least one gallium driver.')
> > > > > +  endif
> > > > > +
> > > > > +  # TODO: alitvec?
> > > > > +  dep_clc = dependency('libclc')
> > > > > +  with_gallium_opencl = true
> > > > > +  with_opencl_icd = _opencl == 'icd'
> > > > > +else
> > > > > +  dep_clc = []
> > > > > +  with_gallium_opencl = false
> > > > > > +  with_opencl_icd = false
> > > > > +endif
> > > > > +
> > > > >  gl_pkgconfig_c_flags = []
> > > > >  if with_platform_x11
> > > > > >    if with_any_vk or (with_glx == 'dri' and with_dri_platform == 'drm')
> > > > > @@ -930,7 +946,7 @@ dep_thread = dependency('threads')
> > > > >  if dep_thread.found() and host_machine.system() != 'windows'
> > > > >pre_args += '-DHAVE_PTHREAD'
> > > > >  endif
> > > > > > -if with_amd_vk or with_gallium_radeonsi or with_gallium_r600 # TODO: clover
> > > > > > +if with_amd_vk or with_gallium_radeonsi or with_gallium_r600 or with_gallium_opencl
> > > > >dep_elf = dependency('libelf', required : false)
> > > > >if not dep_elf.found()
> > > > >  dep_elf = cc.find_library('elf')
> > > > > @@ -972,12 +988,19 @@ if with_amd_vk or with_gallium_radeonsi or 
> > > > > with_gallium_r600
> > > > >  llvm_modules += 'asmparser'
> > > > >endif
> > > > >  endif
> > > > > +if with_gallium_opencl
> > > > > +  llvm_modules += [
> > > > > > +    'all-targets', 'linker', 'coverage', 'instrumentation', 'ipo', 'irreader',
> > > > > > +    'lto', 'option', 'objcarcopts', 'profiledata',
> > > > > +  ]
> > > > > +  # TODO: optional modules
> > > > > +endif
> > > > >  
> > > > >  _llvm = 

Re: [Mesa-dev] [PATCH] glsl: Respect std430 layout in lower_buffer_access

2018-01-07 Thread Anonymer Kommentator
Hi Timothy,

Thanks for taking a look.

> These changes seem reasonable. Are you able to create a piglit test that
> exercises the bug also?

Thanks to your pointer to the basic.shader_test for SSBOs in piglit, I
was able to create a new piglit test for this bug. Its result changes
from fail to pass after applying my patch. I'll send it to the piglit
mailing list for review. It's a nice and easy way to write OpenGL
tests!

PS: This is my first attempt to send a plain text e-mail in gmail. I
hope it works.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 103538] vkDestroySwapchain causes deadlock on Wayland compositor with X11

2018-01-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103538

--- Comment #3 from mais...@archlinux.us ---
I have now also seen this issue on Xorg. It also occurs with the new AMD open
driver, so could this be an Xorg bug?

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 103999] 4x MSAA with RG32F shows garbage on triangle edges

2018-01-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103999

--- Comment #5 from mais...@archlinux.us ---
Still buggy on latest git.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 00/15] RadeonSI 32-bit GPU pointers

2018-01-07 Thread Christian König

On 07.01.2018 at 01:48, Marek Olšák wrote:
> On Sat, Jan 6, 2018 at 5:51 PM, Christian König wrote:
> > Hi Marek,
> >
> > actually I was on the verge of removing the 32bit VM support in libdrm
> > because it clashes with HMM and SVM in general.
> >
> > Is it possible to set the upper 32bit of the 64bit address to some fixed
> > value instead?
>
> Yes, but not on radeon. radeon only has 8GB of virtual address space and
> 4GB on older kernels. I would have to change LLVM to set the high bits
> differently on amdgpu but keep the high bits 0 on radeon.


That only matters on Vega10/Raven anyway.

But in general being able to define what LLVM uses for the upper bits 
sounds like a good idea to me, this way we can keep the handling in Mesa.
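
As a rough illustration of the idea discussed here, widening a 32-bit GPU pointer by attaching a fixed upper half could look like the C sketch below. This is not code from Mesa, LLVM, or libdrm: the function names and the example high-bits values are hypothetical, and the choice of zero high bits for radeon simply follows Marek's remark above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a 64-bit GPU address from a 32-bit pointer
 * and a fixed value for the upper 32 bits. With fixed_hi == 0 this is the
 * plain zero-extension that radeon would keep.
 */
static uint64_t widen_gpu_pointer(uint32_t lo, uint32_t fixed_hi)
{
    return ((uint64_t)fixed_hi << 32) | (uint64_t)lo;
}

/*
 * Hypothetical inverse: drop the upper half again. The assert documents
 * the assumption that every address really lives in the fixed 4GB window.
 */
static uint32_t narrow_gpu_pointer(uint64_t addr, uint32_t fixed_hi)
{
    assert((uint32_t)(addr >> 32) == fixed_hi);
    return (uint32_t)addr;
}
```

With helpers like these, the only driver-specific piece is the value passed as fixed_hi, which is the part that would need to be configurable.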


Going to send updated libdrm patches which keep the 32bit range usable 
on Vega10/Raven as well.


Christian.



> Marek


[Mesa-dev] [Bug 104214] Dota crashes when switching from game to desktop

2018-01-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104214

--- Comment #22 from Evangelos Foutras  ---
(In reply to Cyril from comment #11)
> Used git bisect (between tag mesa-17.3.1 and mesa-17.2.6) and tested if i
> could launch the game or not for each iteration. Got this commit at the end :
> 
> 15e208c4ccdd94582a459d0066b587f91caf270c is the first bad commit

I reached the same commit as the first commit that triggers segfaults with mpv
(see bug 104376 and its "see also" bugs).

The patch from comment 12 does *not* fix my mpv issue (testing on top of
master), so perhaps the actual issue is with commit 15e208c4cc? [1]

[1] https://cgit.freedesktop.org/mesa/mesa/commit/?id=15e208c4cc

-- 
You are receiving this mail because:
You are the QA Contact for the bug.