Re: [Mesa-dev] [PATCH 1/3] android: anv/extensions: fix generated sources build
On 02/04/2018 11:57 PM, Mauro Rossi wrote:
> Building rules are aligned to automake ones
>
> The correct script to build anv_extensions.{c,h} is anv_extensions_gen.py
> Generation rules for anv_extensions.c requires --out-c option
> Generation rules for anv_extensions.h were missing
> Necessary include paths are added to avoid following build errors:
>
> cp: cannot stat '.../gen/STATIC_LIBRARIES/libmesa_vulkan_common_intermediates/vulkan/anv_extensions.c': No such file or directory
> failed to build some targets (01:24 (mm:ss))
>
> In file included from external/mesa/src/intel/vulkan/anv_gem.c:32:
> external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found
> #include "anv_extensions.h"
>          ^~
> 1 error generated.
>
> In file included from external/mesa/src/intel/vulkan/anv_batch_chain.c:30:
> external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found
> #include "anv_extensions.h"
>          ^~
> 1 error generated.
>
> Fixes: ca6237e ("android: anv_extensions.c is generated to libmesa_vulkan_common")

It does not fix this commit because back then '--out-c' or anv_extensions.h did not exist. Those were introduced later by commit dd088d4bec which this commit is fixing.

With that changed;
Reviewed-by: Tapani Pälli

> Cc: "18.0"
> ---
>  src/intel/Android.vulkan.mk | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/src/intel/Android.vulkan.mk b/src/intel/Android.vulkan.mk
> index 32b4892e17..5c8c947136 100644
> --- a/src/intel/Android.vulkan.mk
> +++ b/src/intel/Android.vulkan.mk
> @@ -25,7 +25,7 @@ include $(LOCAL_PATH)/Makefile.sources
>  VK_ENTRYPOINTS_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/vulkan/anv_entrypoints_gen.py
> -VK_EXTENSIONS_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/vulkan/anv_extensions.py
> +VK_EXTENSIONS_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/vulkan/anv_extensions_gen.py
>  VULKAN_COMMON_INCLUDES := \
>  	$(MESA_TOP)/include \
> @@ -82,6 +82,7 @@ ANV_INCLUDES := \
>  	$(VULKAN_COMMON_INCLUDES) \
>  	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_anv_entrypoints,,)/vulkan \
>  	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_nir,,)/nir \
> +	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_vulkan_common,,)/vulkan \
>  	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_vulkan_util,,)/util
>  #
> @@ -212,6 +213,7 @@ LOCAL_WHOLE_STATIC_LIBRARIES := \
>  LOCAL_GENERATED_SOURCES += $(intermediates)/vulkan/anv_entrypoints.c
>  LOCAL_GENERATED_SOURCES += $(intermediates)/vulkan/anv_extensions.c
> +LOCAL_GENERATED_SOURCES += $(intermediates)/vulkan/anv_extensions.h
>  $(intermediates)/vulkan/anv_entrypoints.c:
>  	@mkdir -p $(dir $@)
> @@ -225,7 +227,14 @@ $(intermediates)/vulkan/anv_extensions.c:
>  	$(VK_EXTENSIONS_SCRIPT) \
>  		--xml $(MESA_TOP)/src/vulkan/registry/vk.xml \
>  		--xml $(MESA_TOP)/src/vulkan/registry/vk_android_native_buffer.xml \
> -		--out $@
> +		--out-c $@
> +
> +$(intermediates)/vulkan/anv_extensions.h:
> +	@mkdir -p $(dir $@)
> +	$(VK_EXTENSIONS_SCRIPT) \
> +		--xml $(MESA_TOP)/src/vulkan/registry/vk.xml \
> +		--xml $(MESA_TOP)/src/vulkan/registry/vk_android_native_buffer.xml \
> +		--out-h $@
>  LOCAL_SHARED_LIBRARIES := libdrm
> @@ -252,7 +261,8 @@ LOCAL_SRC_FILES := \
>  LOCAL_C_INCLUDES := \
>  	$(VULKAN_COMMON_INCLUDES) \
> -	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_anv_entrypoints,,)/vulkan
> +	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_anv_entrypoints,,)/vulkan \
> +	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_vulkan_common,,)/vulkan
>  LOCAL_EXPORT_C_INCLUDE_DIRS := $(MESA_TOP)/src/intel/vulkan

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default
Reviewed-by: Tapani Pälli

On 02/03/2018 11:58 PM, Jordan Justen wrote:
> Signed-off-by: Jordan Justen
> Reviewed-by: Timothy Arceri
> ---
>  docs/relnotes/18.1.0.html                  | 1 +
>  src/mesa/drivers/dri/i965/brw_disk_cache.c | 3 ---
>  2 files changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/docs/relnotes/18.1.0.html b/docs/relnotes/18.1.0.html
> index b8a0cd0d02c..0a5878ea41f 100644
> --- a/docs/relnotes/18.1.0.html
> +++ b/docs/relnotes/18.1.0.html
> @@ -46,6 +46,7 @@ Note: some of the new features are only available with certain drivers.
>  GL_EXT_semaphore on radeonsi
>  GL_EXT_semaphore_fd on radeonsi
> +Disk shader cache support for i965 enabled by default
>  Bug fixes
> diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> index f989456bcde..41f742e858f 100644
> --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
> +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> @@ -407,9 +407,6 @@ void
>  brw_disk_cache_init(struct intel_screen *screen)
>  {
>  #ifdef ENABLE_SHADER_CACHE
> -   if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", true))
> -      return;
> -
>     char renderer[10];
>     MAYBE_UNUSED int len = snprintf(renderer, sizeof(renderer), "i965_%04x",
>                                     screen->deviceID);

[Mesa-dev] [PATCH v2] glsl/tests: changes to test_disk_cache_create test
The next patch will allow a disk_cache instance to be created without a path set for it, so modify the test cases that assume disk_cache creation fails with an invalid path. Creation should succeed, but a simple put/get test should fail.

v2: leave tests as is but check that both cache struct exists and try
    simple put/get that should fail with invalid path set (Emil)

Signed-off-by: Tapani Pälli
Reviewed-by: Jordan Justen (v1)
---
 src/compiler/glsl/tests/cache_test.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/src/compiler/glsl/tests/cache_test.c b/src/compiler/glsl/tests/cache_test.c
index dd11fd5944..3edd88b06a 100644
--- a/src/compiler/glsl/tests/cache_test.c
+++ b/src/compiler/glsl/tests/cache_test.c
@@ -182,6 +182,20 @@ wait_until_file_written(struct disk_cache *cache, const cache_key key)
    }
 }
 
+static void *
+cache_exists(struct disk_cache *cache)
+{
+   uint8_t dummy_key[20];
+   char *data = "some test data";
+
+   if (!cache)
+      return NULL;
+
+   disk_cache_put(cache, dummy_key, data, sizeof(data), NULL);
+   wait_until_file_written(cache, dummy_key);
+   return disk_cache_get(cache, dummy_key, NULL);
+}
+
 #define CACHE_TEST_TMP "./cache-test-tmp"
 
 static void
@@ -213,12 +227,13 @@ test_disk_cache_create(void)
    /* Test with XDG_CACHE_HOME set */
    setenv("XDG_CACHE_HOME", CACHE_TEST_TMP "/xdg-cache-home", 1);
    cache = disk_cache_create("test", "make_check", 0);
-   expect_null(cache, "disk_cache_create with XDG_CACHE_HOME set with"
-               "a non-existing parent directory");
+   expect_null(cache_exists(cache), "disk_cache_create with XDG_CACHE_HOME set "
+               "with a non-existing parent directory");
 
    mkdir(CACHE_TEST_TMP, 0755);
    cache = disk_cache_create("test", "make_check", 0);
-   expect_non_null(cache, "disk_cache_create with XDG_CACHE_HOME set");
+   expect_non_null(cache_exists(cache), "disk_cache_create with XDG_CACHE_HOME "
+                   "set");
 
    check_directories_created(CACHE_TEST_TMP "/xdg-cache-home/"
                              CACHE_DIR_NAME);
@@ -231,12 +246,13 @@ test_disk_cache_create(void)
 
    setenv("MESA_GLSL_CACHE_DIR", CACHE_TEST_TMP "/mesa-glsl-cache-dir", 1);
    cache = disk_cache_create("test", "make_check", 0);
-   expect_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR set with"
-               "a non-existing parent directory");
+   expect_null(cache_exists(cache), "disk_cache_create with MESA_GLSL_CACHE_DIR"
+               " set with a non-existing parent directory");
 
    mkdir(CACHE_TEST_TMP, 0755);
    cache = disk_cache_create("test", "make_check", 0);
-   expect_non_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR set");
+   expect_non_null(cache_exists(cache), "disk_cache_create with "
+                   "MESA_GLSL_CACHE_DIR set");
 
    check_directories_created(CACHE_TEST_TMP "/mesa-glsl-cache-dir/"
                              CACHE_DIR_NAME);
-- 
2.14.3

Re: [Mesa-dev] [PATCH] spirv: split constant initializers on in/out structs
This is still unreviewed.

Jason, since you reviewed the other patch I sent related to output initializers, could you have a look at this one too?

Iago

On Tue, 2018-01-23 at 14:11 +0100, Iago Toral Quiroga wrote:
> The SPIR-V parser splits in/out struct variables and creates
> a separate variable for each first-level member of the struct.
> When the struct variable has an initializer this means that we also
> need to split the initializer.
> ---
>  src/compiler/spirv/vtn_variables.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
> index eb306d0c4a..ead68b4784 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -1837,7 +1837,15 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
>              interface_type->members[i]->type;
>           var->members[i]->data.mode = nir_mode;
>           var->members[i]->data.patch = var->patch;
> +
> +         if (initializer) {
> +            assert(i < initializer->num_elements);
> +            var->members[i]->constant_initializer =
> +               nir_constant_clone(initializer->elements[i], var->members[i]);
> +         }
>        }
> +
> +      initializer = NULL;
>     } else {
>        var->var = rzalloc(b->shader, nir_variable);
>        var->var->name = ralloc_strdup(var->var, val->name);

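For readers following the thread, the splitting described in the commit message can be sketched roughly as below. This is only an illustration under stated assumptions: `struct constant`, `struct member_var` and `split_initializer` are hypothetical names, not Mesa types, and the sketch shares child pointers where the patch clones them with nir_constant_clone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a constant tree: an aggregate constant holds
 * one child constant per first-level member. */
struct constant {
   int value;                              /* payload of a leaf constant */
   unsigned num_elements;                  /* number of children */
   const struct constant *const *elements; /* one child per member */
};

/* Hypothetical stand-in for one of the per-member variables that the
 * struct variable was split into. */
struct member_var {
   const char *name;
   const struct constant *initializer;     /* NULL when there is none */
};

/* Hand element i of the struct initializer to member i. Unlike the patch,
 * this shares the child pointer instead of cloning it. */
static void
split_initializer(const struct constant *initializer,
                  struct member_var *members, unsigned num_members)
{
   for (unsigned i = 0; i < num_members; i++) {
      if (initializer) {
         assert(i < initializer->num_elements);
         members[i].initializer = initializer->elements[i];
      } else {
         members[i].initializer = NULL;
      }
   }
}
```

After the call the whole-struct initializer is no longer needed, which matches the patch resetting `initializer` to NULL once the members have been set up.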
[Mesa-dev] [PATCH v3] egl: add support for EGL_ANDROID_blob_cache
v2: cleanup, move callbacks to _egl_display struct (Emil Velikov) adapt to earlier ctx->screen changes v3: remove useless checking, add _eglSetFuncName (Emil Velikov) Signed-off-by: Tapani PälliReviewed-by: Jordan Justen (v2) --- src/egl/drivers/dri2/egl_dri2.c | 16 src/egl/drivers/dri2/egl_dri2.h | 1 + src/egl/main/eglapi.c | 42 + src/egl/main/eglapi.h | 4 src/egl/main/egldisplay.h | 4 src/egl/main/eglentrypoint.h| 1 + 6 files changed, 68 insertions(+) diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index d5a4f72e86..e9b556ec5f 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -458,6 +458,7 @@ static const struct dri2_extension_match optional_core_extensions[] = { { __DRI2_INTEROP, 1, offsetof(struct dri2_egl_display, interop) }, { __DRI_IMAGE, 1, offsetof(struct dri2_egl_display, image) }, { __DRI2_FLUSH_CONTROL, 1, offsetof(struct dri2_egl_display, flush_control) }, + { __DRI2_BLOB, 1, offsetof(struct dri2_egl_display, blob) }, { NULL, 0, 0 } }; @@ -727,6 +728,9 @@ dri2_setup_screen(_EGLDisplay *disp) } } + if (dri2_dpy->blob) + disp->Extensions.ANDROID_blob_cache = EGL_TRUE; + disp->Extensions.KHR_reusable_sync = EGL_TRUE; if (dri2_dpy->image) { @@ -3016,6 +3020,17 @@ dri2_dup_native_fence_fd(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSync *sync) return dup(sync->SyncFd); } +static void +dri2_set_blob_cache_funcs(_EGLDriver *drv, _EGLDisplay *dpy, + EGLSetBlobFuncANDROID set, + EGLGetBlobFuncANDROID get) +{ + struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy); + dri2_dpy->blob->set_cache_funcs(dri2_dpy->dri_screen, + dpy->BlobCacheSet, + dpy->BlobCacheGet); +} + static EGLint dri2_client_wait_sync(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSync *sync, EGLint flags, EGLTime timeout) @@ -3234,6 +3249,7 @@ _eglBuiltInDriver(void) dri2_drv->API.GLInteropQueryDeviceInfo = dri2_interop_query_device_info; dri2_drv->API.GLInteropExportObject = dri2_interop_export_object; 
dri2_drv->API.DupNativeFenceFDANDROID = dri2_dup_native_fence_fd; + dri2_drv->API.SetBlobCacheFuncsANDROID = dri2_set_blob_cache_funcs; dri2_drv->Name = "DRI2"; diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h index cc76c73eab..c49156fbb6 100644 --- a/src/egl/drivers/dri2/egl_dri2.h +++ b/src/egl/drivers/dri2/egl_dri2.h @@ -171,6 +171,7 @@ struct dri2_egl_display const __DRInoErrorExtension*no_error; const __DRI2configQueryExtension *config; const __DRI2fenceExtension *fence; + const __DRI2blobExtension *blob; const __DRI2rendererQueryExtension *rendererQuery; const __DRI2interopExtension *interop; int fd; diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c index 5110688f2d..2d2a6bce3f 100644 --- a/src/egl/main/eglapi.c +++ b/src/egl/main/eglapi.c @@ -476,6 +476,7 @@ _eglCreateExtensionsString(_EGLDisplay *dpy) char *exts = dpy->ExtensionsString; /* Please keep these sorted alphabetically. */ + _EGL_CHECK_EXTENSION(ANDROID_blob_cache); _EGL_CHECK_EXTENSION(ANDROID_framebuffer_target); _EGL_CHECK_EXTENSION(ANDROID_image_native_buffer); _EGL_CHECK_EXTENSION(ANDROID_native_fence_sync); @@ -2522,6 +2523,47 @@ eglQueryDmaBufModifiersEXT(EGLDisplay dpy, EGLint format, EGLint max_modifiers, RETURN_EGL_EVAL(disp, ret); } +static void EGLAPIENTRY +eglSetBlobCacheFuncsANDROID(EGLDisplay *dpy, EGLSetBlobFuncANDROID set, +EGLGetBlobFuncANDROID get) +{ + /* This function does not return anything so we cannot +* utilize the helper macros _EGL_FUNC_START or _EGL_CHECK_DISPLAY. 
+*/ + _EGLDisplay *disp = _eglLockDisplay(dpy); + if (!_eglSetFuncName(__func__, disp, EGL_OBJECT_DISPLAY_KHR, NULL)) { + if (disp) { + _eglUnlockDisplay(disp); + return; + } + } + + _EGLDriver *drv = _eglCheckDisplay(disp, __func__); + if (!drv) + return; + + if (!set || !get) { + _eglError(EGL_BAD_PARAMETER, +"eglSetBlobCacheFuncsANDROID: NULL handler given"); + _eglUnlockDisplay(disp); + return; + } + + if (disp->BlobCacheSet) { + _eglError(EGL_BAD_PARAMETER, +"eglSetBlobCacheFuncsANDROID: functions already set"); + _eglUnlockDisplay(disp); + return; + } + + disp->BlobCacheSet = set; + disp->BlobCacheGet = get; + + drv->API.SetBlobCacheFuncsANDROID(drv, disp, set, get); + + _eglUnlockDisplay(disp); +} + __eglMustCastToProperFunctionPointerType EGLAPIENTRY eglGetProcAddress(const char *procname) { diff --git a/src/egl/main/eglapi.h
Re: [Mesa-dev] [PATCH 1/2] intel/compiler: fix first_component for 64-bit types on vertex inputs
This series is still waiting for a review, any takers?

On Fri, 2018-01-19 at 09:17 +0100, Iago Toral Quiroga wrote:
> Divide it by two as we do for other stages. This is because the
> component layout qualifier is always in 32-bit units.
>
> Fixes issues in a new CTS test (still WIP):
> KHR-GL45.enhanced_layouts.varying_double_components
> ---
>  src/intel/compiler/brw_fs_nir.cpp | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
> index 0d775649303..7a6346a4b5d 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -2420,6 +2420,9 @@ fs_visitor::nir_emit_vs_intrinsic(const fs_builder &bld,
>        assert(const_offset && "Indirect input loads not allowed");
>        src = offset(src, bld, const_offset->u32[0]);
>  
> +      if (type_sz(type) == 8)
> +         first_component /= 2;
> +
>        for (unsigned j = 0; j < num_components; j++) {
>           bld.MOV(offset(dest, bld, j), offset(src, bld, j + first_component));
>        }

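The unit conversion behind "divide it by two" can be sketched as a small standalone helper. The function name and signature are hypothetical and not part of the i965 compiler. The only assumptions are that the `component` qualifier counts 32-bit units and that a 64-bit value can only start at an even component.

```c
#include <assert.h>

/* Hypothetical helper: convert a `component` layout qualifier, counted in
 * 32-bit units, into an index counted in elements of the variable's type. */
static unsigned
first_component_in_type_units(unsigned component, unsigned type_size_bytes)
{
   assert(type_size_bytes == 4 || type_size_bytes == 8);

   if (type_size_bytes == 8) {
      /* A 64-bit element spans two 32-bit components, so it has to start
       * on an even component. */
      assert(component % 2 == 0);
      return component / 2;
   }

   return component;
}
```

For example, a double declared with `component = 2` is element 1 of a 64-bit source register, while a float with `component = 3` stays at element 3.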
[Mesa-dev] [PATCH] r600/atomic: fix ATOMCAS instruction.
From: Dave Airlie

This has 3 srcs.

This fixes:
KHR-GL45.shader_atomic_counter_ops_tests.ShaderAtomicCounterOpsExchangeTestCase

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/r600_shader.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 33eb5accea..4c0d554d1a 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -8698,6 +8698,33 @@ static int tgsi_atomic_op_gds(struct r600_shader_ctx *ctx)
 	if (r)
 		return r;
+	if (gds_op == FETCH_OP_GDS_CMP_XCHG_RET) {
+		if (inst->Src[3].Register.File == TGSI_FILE_IMMEDIATE) {
+			int value = (ctx->literals[4 * inst->Src[3].Register.Index + inst->Src[3].Register.SwizzleX]);
+			memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+			alu.op = ALU_OP1_MOV;
+			alu.dst.sel = ctx->temp_reg;
+			alu.dst.chan = is_cm ? 2 : 1;
+			alu.src[0].sel = V_SQ_ALU_SRC_LITERAL;
+			alu.src[0].value = value;
+			alu.last = 1;
+			alu.dst.write = 1;
+			r = r600_bytecode_add_alu(ctx->bc, &alu);
+			if (r)
+				return r;
+		} else {
+			memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+			alu.op = ALU_OP1_MOV;
+			alu.dst.sel = ctx->temp_reg;
+			alu.dst.chan = is_cm ? 2 : 1;
+			r600_bytecode_src(&alu.src[0], &ctx->src[3], 0);
+			alu.last = 1;
+			alu.dst.write = 1;
+			r = r600_bytecode_add_alu(ctx->bc, &alu);
+			if (r)
+				return r;
+		}
+	}
 	if (inst->Src[2].Register.File == TGSI_FILE_IMMEDIATE) {
 		int value = (ctx->literals[4 * inst->Src[2].Register.Index + inst->Src[2].Register.SwizzleX]);
 		int abs_value = abs(value);
@@ -8737,7 +8764,10 @@ static int tgsi_atomic_op_gds(struct r600_shader_ctx *ctx)
 	gds.src_gpr2 = 0;
 	gds.src_sel_x = is_cm ? 0 : 4;
 	gds.src_sel_y = is_cm ? 1 : 0;
-	gds.src_sel_z = 7;
+	if (gds_op == FETCH_OP_GDS_CMP_XCHG_RET)
+		gds.src_sel_z = is_cm ? 2 : 1;
+	else
+		gds.src_sel_z = 7;
 	gds.dst_sel_x = 0;
 	gds.dst_sel_y = 7;
 	gds.dst_sel_z = 7;
-- 
2.14.3

[Mesa-dev] [PATCH] r600/sb/cayman: fix indirect ubo access on cayman
From: Dave Airlie

With sb enabled on cayman, this was overwriting the proper cf index value with random ones if the dst gpr was 2 or 3. Only save the value for a MOVA instruction.

Fixes: KHR-GL45.gpu_shader5.uniform_blocks_array_indexing (on cayman with sb)

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/sb/sb_bc_parser.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/sb/sb_bc_parser.cpp b/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
index 970e4141d5..87035ee2a6 100644
--- a/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
+++ b/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
@@ -567,7 +567,7 @@ int bc_parser::prepare_alu_group(cf_node* cf, alu_group_node *g) {
 				n->src.push_back(get_cf_index_value(1));
 			}
 
-			if ((n->bc.dst_gpr == CM_V_SQ_MOVA_DST_CF_IDX0 || n->bc.dst_gpr == CM_V_SQ_MOVA_DST_CF_IDX1) &&
+			if ((flags & AF_MOVA) && (n->bc.dst_gpr == CM_V_SQ_MOVA_DST_CF_IDX0 || n->bc.dst_gpr == CM_V_SQ_MOVA_DST_CF_IDX1) &&
 			    ctx.is_cayman())
 				// Move CF_IDX value into tex instruction operands, scheduler will later re-emit setting of CF_IDX
 				save_set_cf_index(n->src[0], n->bc.dst_gpr == CM_V_SQ_MOVA_DST_CF_IDX1);
-- 
2.14.3

[Mesa-dev] [PATCH] nvc0: collapse output slots to have adjacent registers
The hardware skips over unallocated slots, so we have to make sure those registers are packed together.

Fixes KHR-GL45.enhanced_layouts.fragment_data_location_api

Signed-off-by: Ilia Mirkin
---

Tested on GK208. Needs testing on Fermi, as I seem to recall it had slightly different semantics from Kepler. They might need slightly different logic.

 src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
index e6157f550d6..9520d984bb3 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
@@ -134,10 +134,20 @@ nvc0_fp_assign_output_slots(struct nv50_ir_prog_info *info)
    unsigned count = info->prop.fp.numColourResults * 4;
    unsigned i, c;
 
+   /* Compute the relative position of each color output, since skipped MRT
+    * positions will not have registers allocated to them.
+    */
+   unsigned colors[8] = {0};
+   for (i = 0; i < info->numOutputs; ++i)
+      if (info->out[i].sn == TGSI_SEMANTIC_COLOR)
+         colors[info->out[i].si] = 1;
+   for (i = 0, c = 0; i < 8; i++)
+      if (colors[i])
+         colors[i] = c++;
    for (i = 0; i < info->numOutputs; ++i)
       if (info->out[i].sn == TGSI_SEMANTIC_COLOR)
          for (c = 0; c < 4; ++c)
-            info->out[i].slot[c] = info->out[i].si * 4 + c;
+            info->out[i].slot[c] = colors[info->out[i].si] * 4 + c;
 
    if (info->io.sampleMask < PIPE_MAX_SHADER_OUTPUTS)
       info->out[info->io.sampleMask].slot[0] = count++;
@@ -474,7 +484,7 @@ nvc0_fp_gen_header(struct nvc0_program *fp, struct nv50_ir_prog_info *info)
 
    for (i = 0; i < info->numOutputs; ++i) {
       if (info->out[i].sn == TGSI_SEMANTIC_COLOR)
-         fp->hdr[18] |= 0xf << info->out[i].slot[0];
+         fp->hdr[18] |= 0xf << (4 * info->out[i].si);
    }
    /* There are no "regular" attachments, but the shader still needs to be
-- 
2.13.6

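As a standalone illustration of the packing described above, the mapping from sparse color indices to adjacent register slots could look like the sketch below. `pack_color_slots` and its parameters are hypothetical names, not nouveau code. It assumes at most 8 color outputs with 4 registers each, as in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_COLOR_OUTPUTS 8

/* Hypothetical sketch: for each color index that the shader writes, compute
 * the first register slot once skipped indices are collapsed away.
 * Unwritten indices get -1. Returns the number of written outputs. */
static unsigned
pack_color_slots(const bool written[MAX_COLOR_OUTPUTS],
                 int base_slot[MAX_COLOR_OUTPUTS])
{
   unsigned packed = 0;

   assert(written != NULL && base_slot != NULL);

   for (unsigned i = 0; i < MAX_COLOR_OUTPUTS; i++) {
      if (written[i])
         base_slot[i] = (int)(packed++ * 4); /* 4 registers per output */
      else
         base_slot[i] = -1;
   }

   return packed;
}
```

With outputs 0, 2 and 5 written, the base slots come out as 0, 4 and 8, so the registers are adjacent even though the color indices are not. The header mask in the second hunk still uses the original index, because it describes which render targets are written rather than where the registers live.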
[Mesa-dev] [PATCH] r600: fixup sparse color exports.
From: Dave Airlie

If we have gaps in the shader mask, we have to put 0x1 in them according to a comment in radeonsi, and this is required to fix the test at least on cayman.

We also need to record the highest export written, so it can be written to the ps exports reg.

This fixes:
KHR-GL45.enhanced_layouts.fragment_data_location_api

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/evergreen_state.c |  2 +-
 src/gallium/drivers/r600/r600_shader.c     | 10 ++++++++++
 src/gallium/drivers/r600/r600_shader.h     |  1 +
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 4c9163c2a7..742ca5babb 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -3369,7 +3369,7 @@ void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader
 		exports_ps |= 1;
 	}
 
-	num_cout = rshader->nr_ps_color_exports;
+	num_cout = rshader->ps_export_highest + 1;
 
 	exports_ps |= S_02884C_EXPORT_COLORS(num_cout);
 	if (!exports_ps) {
diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 72e3063804..33eb5accea 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -3876,6 +3876,16 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
 					output[j].type = V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL;
 					shader->nr_ps_color_exports++;
 					shader->ps_color_export_mask |= (0xf << (shader->output[i].sid * 4));
+
+					/* If the i-th target format is set, all previous target formats must
+					 * be non-zero to avoid hangs. - from radeonsi, seems to apply to eg as well.
+					 */
+					if (shader->output[i].sid > 0)
+						for (unsigned x = 0; x < shader->output[i].sid; x++)
+							shader->ps_color_export_mask |= (1 << (x*4));
+
+					if (shader->output[i].sid > shader->ps_export_highest)
+						shader->ps_export_highest = shader->output[i].sid;
 					if (shader->fs_write_all && (rscreen->b.chip_class >= EVERGREEN)) {
 						for (k = 1; k < max_color_exports; k++) {
 							j++;
diff --git a/src/gallium/drivers/r600/r600_shader.h b/src/gallium/drivers/r600/r600_shader.h
index 7fca3f455e..4b23facf6f 100644
--- a/src/gallium/drivers/r600/r600_shader.h
+++ b/src/gallium/drivers/r600/r600_shader.h
@@ -85,6 +85,7 @@ struct r600_shader {
 	/* Real number of ps color exports compiled in the bytecode */
 	unsigned		nr_ps_color_exports;
 	unsigned		ps_color_export_mask;
+	unsigned		ps_export_highest;
 	/* bit n is set if the shader writes gl_ClipDistance[n] */
 	unsigned		cc_dist_mask;
 	unsigned		clip_dist_write;
-- 
2.14.3

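To make the gap-filling rule concrete, the mask computation from the r600_shader.c hunk can be reproduced in isolation as in the sketch below. `struct export_info` and `build_export_mask` are hypothetical names, not driver code. The sketch assumes 4 mask bits per color target and at most 8 targets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the export mask plus the highest target written. */
struct export_info {
   uint32_t mask;
   unsigned highest;
};

/* Build the color export mask for a list of written target indices. Every
 * target below a written one gets at least bit 0 of its nibble set, which
 * mirrors the "all previous target formats must be non-zero" rule quoted
 * in the patch. */
static struct export_info
build_export_mask(const unsigned *targets, unsigned count)
{
   struct export_info info = { 0, 0 };

   for (unsigned n = 0; n < count; n++) {
      unsigned t = targets[n];
      assert(t < 8);

      info.mask |= 0xfu << (t * 4);      /* all four channels of target t */

      for (unsigned x = 0; x < t; x++)   /* fill any gaps below target t */
         info.mask |= 1u << (x * 4);

      if (t > info.highest)
         info.highest = t;
   }

   return info;
}
```

For a shader that writes targets 0 and 2, this gives a mask of 0xf1f and a highest export of 2, so the export count programmed in the first hunk becomes 3 rather than 2.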
[Mesa-dev] [Bug 104928] libglvnd_1.0.0 disables amdgpu direct rendering
https://bugs.freedesktop.org/show_bug.cgi?id=104928

fin4...@hotmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |VERIFIED
         Resolution|INVALID                     |FIXED

--- Comment #4 from fin4...@hotmail.com ---
Installing the libegl-mesa0 package from the experimental repository fixes the bug.
https://packages.debian.org/experimental/libegl-mesa0

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [PATCH 9/9] r600: work out target mask at framebuffer bind.
From: Dave Airlie

If we only get 1,2,3,6 framebuffers we want a sparse target mask.

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/evergreen_state.c | 10 +++++++---
 src/gallium/drivers/r600/r600_pipe.h       |  1 +
 src/gallium/drivers/r600/r600_state.c      |  2 +-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index f8042c21c0..4c9163c2a7 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -1436,7 +1436,7 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
 	struct r600_surface *surf;
 	struct r600_texture *rtex;
 	uint32_t i, log_samples;
-
+	uint32_t target_mask = 0;
 	/* Flush TC when changing the framebuffer state, because the only
 	 * client not using TC that can change textures is the framebuffer.
 	 * Other places don't typically have to flush TC.
@@ -1463,6 +1463,8 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
 		if (!surf)
 			continue;
 
+		target_mask |= (0xf << (i * 4));
+
 		rtex = (struct r600_texture*)surf->base.texture;
 
 		r600_context_add_resource_size(ctx, state->cbufs[i]->texture);
@@ -1528,7 +1530,9 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
 		r600_mark_atom_dirty(rctx, &rctx->db_misc_state.atom);
 	}
 
-	if (rctx->cb_misc_state.nr_cbufs != state->nr_cbufs) {
+	if (rctx->cb_misc_state.nr_cbufs != state->nr_cbufs ||
+	    rctx->cb_misc_state.bound_cbufs_target_mask != target_mask) {
+		rctx->cb_misc_state.bound_cbufs_target_mask = target_mask;
 		rctx->cb_misc_state.nr_cbufs = state->nr_cbufs;
 		r600_mark_atom_dirty(rctx, &rctx->cb_misc_state.atom);
 	}
@@ -2025,7 +2029,7 @@ static void evergreen_emit_cb_misc_state(struct r600_context *rctx, struct r600_
 {
 	struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
 	struct r600_cb_misc_state *a = (struct r600_cb_misc_state*)atom;
-	unsigned fb_colormask = (1ULL << ((unsigned)a->nr_cbufs * 4)) - 1;
+	unsigned fb_colormask = a->bound_cbufs_target_mask;
 	unsigned ps_colormask = a->ps_color_export_mask;
 	unsigned rat_colormask = evergreen_construct_rat_mask(rctx, a, a->nr_cbufs);
 	radeon_set_context_reg_seq(cs, R_028238_CB_TARGET_MASK, 2);
diff --git a/src/gallium/drivers/r600/r600_pipe.h b/src/gallium/drivers/r600/r600_pipe.h
index 9b94f3654c..9caf3b8512 100644
--- a/src/gallium/drivers/r600/r600_pipe.h
+++ b/src/gallium/drivers/r600/r600_pipe.h
@@ -152,6 +152,7 @@ struct r600_cb_misc_state {
 	unsigned cb_color_control; /* this comes from blend state */
 	unsigned blend_colormask; /* 8*4 bits for 8 RGBA colorbuffers */
 	unsigned nr_cbufs;
+	unsigned bound_cbufs_target_mask;
 	unsigned nr_ps_color_outputs;
 	unsigned ps_color_export_mask;
 	unsigned image_rat_enabled_mask;
diff --git a/src/gallium/drivers/r600/r600_state.c b/src/gallium/drivers/r600/r600_state.c
index 6ff8037d9c..5cf99c18b6 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -1525,7 +1525,7 @@ static void r600_emit_cb_misc_state(struct r600_context *rctx, struct r600_atom
 		}
 		radeon_set_context_reg(cs, R_028808_CB_COLOR_CONTROL, a->cb_color_control);
 	} else {
-		unsigned fb_colormask = (1ULL << ((unsigned)a->nr_cbufs * 4)) - 1;
+		unsigned fb_colormask = a->bound_cbufs_target_mask;
 		unsigned ps_colormask = a->ps_color_export_mask;
 		unsigned multiwrite = a->multiwrite && a->nr_cbufs > 1;
 
-- 
2.14.3

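The difference between the old dense mask and the sparse mask computed at framebuffer bind can be shown with a small standalone sketch. `dense_target_mask` and `sparse_target_mask` are hypothetical helpers, not driver functions. The `bound` array stands in for "state->cbufs[i] is non-NULL".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old behaviour: every target below nr_cbufs is enabled. */
static uint32_t
dense_target_mask(unsigned nr_cbufs)
{
   assert(nr_cbufs <= 8);
   return (uint32_t)((1ULL << (nr_cbufs * 4)) - 1);
}

/* New behaviour: only targets that actually have a surface bound are
 * enabled, 4 mask bits per target. */
static uint32_t
sparse_target_mask(const bool bound[8], unsigned nr_cbufs)
{
   uint32_t mask = 0;

   assert(nr_cbufs <= 8);

   for (unsigned i = 0; i < nr_cbufs; i++)
      if (bound[i])
         mask |= 0xfu << (i * 4);

   return mask;
}
```

With three colorbuffer slots where slot 1 is unbound, the dense mask is 0xfff while the sparse mask is 0xf0f. Because the mask can change while nr_cbufs stays the same, the patch also compares the stored mask before deciding whether to mark the atom dirty.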
[Mesa-dev] [PATCH 3/9] r600: overhaul buffer resource query.
From: Dave Airlie

This cleans up and fixes the previous fix even more.

Buffers from textures start at max const; buffers from buffers/images come in from the 168 offset.

This fixes a bunch of:
KHR-GL45.shader_storage_buffer_object*

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/r600_shader.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 8c4460a5d5..32f24c071d 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -7007,7 +7007,7 @@ static int do_vtx_fetch_inst(struct r600_shader_ctx *ctx, boolean src_requires_l
 	return 0;
 }
 
-static int r600_do_buffer_txq(struct r600_shader_ctx *ctx, int reg_idx, int offset)
+static int r600_do_buffer_txq(struct r600_shader_ctx *ctx, int reg_idx, int offset, int eg_buffer_base)
 {
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	int r;
@@ -7033,7 +7033,7 @@ static int r600_do_buffer_txq(struct r600_shader_ctx *ctx, int reg_idx, int offs
 	struct r600_bytecode_vtx vtx;
 	memset(&vtx, 0, sizeof(vtx));
 	vtx.op = FETCH_OP_GET_BUFFER_RESINFO;
-	vtx.buffer_id = id + R600_MAX_CONST_BUFFERS;
+	vtx.buffer_id = id + eg_buffer_base;
 	vtx.fetch_type = SQ_VTX_FETCH_NO_INDEX_OFFSET;
 	vtx.src_gpr = 0;
 	vtx.mega_fetch_count = 16; /* no idea here really... */
@@ -7107,7 +7107,7 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
 		if (inst->Instruction.Opcode == TGSI_OPCODE_TXQ) {
 			if (ctx->bc->chip_class < EVERGREEN)
 				ctx->shader->uses_tex_buffers = true;
-			return r600_do_buffer_txq(ctx, 1, 0);
+			return r600_do_buffer_txq(ctx, 1, 0, R600_MAX_CONST_BUFFERS);
 		}
 		else if (inst->Instruction.Opcode == TGSI_OPCODE_TXF) {
 			if (ctx->bc->chip_class < EVERGREEN)
@@ -8821,10 +8821,11 @@ static int tgsi_resq(struct r600_shader_ctx *ctx)
 	    (inst->Src[0].Register.File == TGSI_FILE_IMAGE && inst->Memory.Texture == TGSI_TEXTURE_BUFFER)) {
 		if (ctx->bc->chip_class < EVERGREEN)
 			ctx->shader->uses_tex_buffers = true;
-		unsigned offset = 0;
-		if (inst->Src[0].Register.File == TGSI_FILE_IMAGE)
-			offset += R600_IMAGE_REAL_RESOURCE_OFFSET - R600_MAX_CONST_BUFFERS + ctx->shader->image_size_const_offset;
-		return r600_do_buffer_txq(ctx, 0, offset);
+		unsigned eg_buffer_base = 0;
+		eg_buffer_base = R600_IMAGE_REAL_RESOURCE_OFFSET;
+		if (inst->Src[0].Register.File == TGSI_FILE_BUFFER)
+			eg_buffer_base += ctx->info.file_count[TGSI_FILE_IMAGE];
+		return r600_do_buffer_txq(ctx, 0, ctx->shader->image_size_const_offset, eg_buffer_base);
 	}
 
 	if (inst->Memory.Texture == TGSI_TEXTURE_CUBE_ARRAY &&
-- 
2.14.3

[Mesa-dev] [PATCH 1/9] r600/images: set offset for compute shaders with number of declared samplers
From: Dave Airlie

For frag shaders we get a value in the key; I expect I need to make compute work better.
---
 src/gallium/drivers/r600/r600_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 37c90b8b7d..8c4460a5d5 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -3247,7 +3247,7 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
 		break;
 	case PIPE_SHADER_COMPUTE:
 		shader->rat_base = 0;
-		shader->image_size_const_offset = 0;
+		shader->image_size_const_offset = ctx.info.file_count[TGSI_FILE_SAMPLER];
 		break;
 	default:
 		break;
-- 
2.14.3

[Mesa-dev] [PATCH 8/9] r600: work out shader export mask at shader build time
From: Dave Airlie

Since enhanced layouts allow setting specific MRT outputs, we can get
sparse outputs, so we have to calculate the shader mask earlier.

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/evergreen_state.c   | 3 ++-
 src/gallium/drivers/r600/r600_pipe.h         | 1 +
 src/gallium/drivers/r600/r600_shader.c       | 3 +++
 src/gallium/drivers/r600/r600_shader.h       | 3 +++
 src/gallium/drivers/r600/r600_state.c        | 2 +-
 src/gallium/drivers/r600/r600_state_common.c | 1 +
 6 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 11e473d604..f8042c21c0 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2026,7 +2026,7 @@ static void evergreen_emit_cb_misc_state(struct r600_context *rctx, struct r600_
 	struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
 	struct r600_cb_misc_state *a = (struct r600_cb_misc_state*)atom;
 	unsigned fb_colormask = (1ULL << ((unsigned)a->nr_cbufs * 4)) - 1;
-	unsigned ps_colormask = (1ULL << ((unsigned)a->nr_ps_color_outputs * 4)) - 1;
+	unsigned ps_colormask = a->ps_color_export_mask;
 	unsigned rat_colormask = evergreen_construct_rat_mask(rctx, a, a->nr_cbufs);
 	radeon_set_context_reg_seq(cs, R_028238_CB_TARGET_MASK, 2);
 	radeon_emit(cs, (a->blend_colormask & fb_colormask) | rat_colormask); /* R_028238_CB_TARGET_MASK */
@@ -3373,6 +3373,7 @@ void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader
 		exports_ps = 2;
 	}
 	shader->nr_ps_color_outputs = num_cout;
+	shader->ps_color_export_mask = rshader->ps_color_export_mask;
 	if (ninterp == 0) {
 		ninterp = 1;
 		have_perspective = TRUE;
diff --git a/src/gallium/drivers/r600/r600_pipe.h b/src/gallium/drivers/r600/r600_pipe.h
index 0b772b2599..9b94f3654c 100644
--- a/src/gallium/drivers/r600/r600_pipe.h
+++ b/src/gallium/drivers/r600/r600_pipe.h
@@ -153,6 +153,7 @@ struct r600_cb_misc_state {
 	unsigned			blend_colormask; /* 8*4 bits for 8 RGBA colorbuffers */
 	unsigned			nr_cbufs;
 	unsigned			nr_ps_color_outputs;
+	unsigned			ps_color_export_mask;
 	unsigned			image_rat_enabled_mask;
 	unsigned			buffer_rat_enabled_mask;
 	bool				multiwrite;
diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 893a71b915..9984e783b5 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -3875,6 +3875,7 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
 				output[j].array_base = shader->output[i].sid;
 				output[j].type = V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL;
 				shader->nr_ps_color_exports++;
+				shader->ps_color_export_mask |= (0xf << (shader->output[i].sid * 4));
 				if (shader->fs_write_all && (rscreen->b.chip_class >= EVERGREEN)) {
 					for (k = 1; k < max_color_exports; k++) {
 						j++;
@@ -3890,6 +3891,7 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
 						output[j].op = CF_OP_EXPORT;
 						output[j].type = V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL;
 						shader->nr_ps_color_exports++;
+						shader->ps_color_export_mask |= (0xf << (j * 4));
 					}
 				}
 			} else if (shader->output[i].name == TGSI_SEMANTIC_POSITION) {
@@ -3978,6 +3980,7 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
 		output[j].op = CF_OP_EXPORT;
 		j++;
 		shader->nr_ps_color_exports++;
+		shader->ps_color_export_mask = 0xf;
 	}
 
 	noutput = j;
diff --git a/src/gallium/drivers/r600/r600_shader.h b/src/gallium/drivers/r600/r600_shader.h
index da96688e54..7fca3f455e 100644
--- a/src/gallium/drivers/r600/r600_shader.h
+++ b/src/gallium/drivers/r600/r600_shader.h
@@ -84,6 +84,7 @@ struct r600_shader {
 	unsigned		nr_ps_max_color_exports;
 	/* Real number of ps color exports compiled in the bytecode */
 	unsigned		nr_ps_color_exports;
+	unsigned		ps_color_export_mask;
 	/* bit n is set if the shader writes gl_ClipDistance[n] */
 	unsigned		cc_dist_mask;
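To illustrate why a count-based mask goes wrong once outputs are sparse, here is a minimal standalone sketch. It is not driver code: the helper names `mask_from_count` and `mask_from_slots` are hypothetical, and the limit of 8 color slots is an assumption taken from the "8*4 bits for 8 RGBA colorbuffers" comment in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Count-based mask: assumes the N exported outputs occupy slots 0..N-1.
 * (Hypothetical helper mirroring the expression the patch removes.) */
static uint32_t mask_from_count(unsigned nr_outputs)
{
	assert(nr_outputs <= 8);
	return (uint32_t)((1ULL << (nr_outputs * 4)) - 1);
}

/* Slot-based mask: OR in four bits (RGBA) for every slot actually written.
 * (Hypothetical helper mirroring the "|= 0xf << (slot * 4)" lines added.) */
static uint32_t mask_from_slots(const unsigned *slots, unsigned nr_slots)
{
	uint32_t mask = 0;

	for (unsigned i = 0; i < nr_slots; i++) {
		assert(slots[i] < 8);
		mask |= 0xfu << (slots[i] * 4);
	}
	return mask;
}
```

With two outputs at locations 0 and 2, `mask_from_count(2)` covers slots 0 and 1, whereas `mask_from_slots` covers slots 0 and 2, which is the case the commit message describes.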
[Mesa-dev] [PATCH 5/9] r600/compute: add render cond support.
From: Dave Airlie

Set render cond and emit atom.

Fixes: KHR-GL45.compute_shader.conditional-dispatching

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/evergreen_compute.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c
index 8a771cb8a6..6cb82122b1 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -577,6 +577,7 @@ static void evergreen_emit_dispatch(struct r600_context *rctx,
 	int i;
 	struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
 	struct r600_pipe_compute *shader = rctx->cs_shader_state.shader;
+	bool render_cond_bit = rctx->b.render_cond && !rctx->b.render_cond_force_off;
 	unsigned num_waves;
 	unsigned num_pipes = rctx->screen->b.info.r600_max_quad_pipes;
 	unsigned wave_divisor = (16 * num_pipes);
@@ -632,14 +633,14 @@ static void evergreen_emit_dispatch(struct r600_context *rctx,
 		lds_size | (num_waves << 14));
 
 	if (info->indirect) {
-		radeon_emit(cs, PKT3C(PKT3_DISPATCH_DIRECT, 3, 0));
+		radeon_emit(cs, PKT3C(PKT3_DISPATCH_DIRECT, 3, render_cond_bit));
 		radeon_emit(cs, indirect_grid[0]);
 		radeon_emit(cs, indirect_grid[1]);
 		radeon_emit(cs, indirect_grid[2]);
 		radeon_emit(cs, 1);
 	} else {
 		/* Dispatch packet */
-		radeon_emit(cs, PKT3C(PKT3_DISPATCH_DIRECT, 3, 0));
+		radeon_emit(cs, PKT3C(PKT3_DISPATCH_DIRECT, 3, render_cond_bit));
 		radeon_emit(cs, info->grid[0]);
 		radeon_emit(cs, info->grid[1]);
 		radeon_emit(cs, info->grid[2]);
@@ -789,6 +790,8 @@ static void compute_emit_cs(struct r600_context *rctx,
 			rat_mask);
 	}
 
+	r600_emit_atom(rctx, &rctx->b.render_cond_atom);
+
 	/* Emit constant buffer state */
 	r600_emit_atom(rctx, &rctx->constbuf_state[PIPE_SHADER_COMPUTE].atom);
 
-- 
2.14.3

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/9] r600/eg: fix buffer sizing.
From: Dave Airlie

For buffers we want the size in bytes; for images we want it in elements.

This fixes:
KHR-GL45.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-pad
---
 src/gallium/drivers/r600/evergreen_state.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 90f05c06d3..0999cc5de8 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -616,6 +616,7 @@ struct eg_buf_res_params {
 	unsigned char swizzle[4];
 	bool uncached;
 	bool force_swizzle;
+	bool size_in_bytes;
 };
 
 static void evergreen_fill_buffer_resource_words(struct r600_context *rctx,
@@ -658,7 +659,7 @@ static void evergreen_fill_buffer_resource_words(struct r600_context *rctx,
 	 * albeit the amd gpu shader analyser
 	 * uses a const buffer to store the element sizes for buffer txq
 	 */
-	tex_resource_words[4] = params->size / stride;
+	tex_resource_words[4] = params->size_in_bytes ? params->size : (params->size / stride);
 	tex_resource_words[5] = tex_resource_words[6] = 0;
 	tex_resource_words[7] = S_03001C_TYPE(V_03001C_SQ_TEX_VTX_VALID_BUFFER);
 
@@ -4041,6 +4042,7 @@ static void evergreen_set_shader_buffers(struct pipe_context *ctx,
 		buf_params.swizzle[3] = PIPE_SWIZZLE_W;
 		buf_params.force_swizzle = true;
 		buf_params.uncached = 1;
+		buf_params.size_in_bytes = true;
 
 		evergreen_fill_buffer_resource_words(rctx, >b.b, &buf_params, >skip_mip_address_reloc,
-- 
2.14.3
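The bytes-versus-elements distinction can be shown in isolation. The following is a minimal sketch, not the driver function: `resource_size_word` is a hypothetical name, and it only mirrors the ternary expression the patch introduces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value to store in the size word of the resource descriptor:
 * the raw byte size for shader buffers, the element count otherwise.
 * (Hypothetical helper; stride is the element size in bytes.) */
static uint32_t resource_size_word(uint32_t size_bytes, uint32_t stride,
				   bool size_in_bytes)
{
	assert(stride != 0);
	return size_in_bytes ? size_bytes : size_bytes / stride;
}
```

For a 64-byte range with a 16-byte stride this yields 64 when `size_in_bytes` is set and 4 otherwise, which is presumably why an unsized-array length query read the wrong value before the change.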
[Mesa-dev] [PATCH 4/9] r600: fix not-very indirect compute
From: Dave Airlie

We need to get the grid sizes earlier to fill in to the const buffer.

Fixes: KHR-GL45.compute_shader.built-in-variables and
KHR-GL45.compute_shader.dispatch-indirect

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/evergreen_compute.c | 30 +--
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c
index b2c724feb0..8a771cb8a6 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -571,7 +571,8 @@ static void evergreen_compute_upload_input(struct pipe_context *ctx,
 }
 
 static void evergreen_emit_dispatch(struct r600_context *rctx,
-				    const struct pipe_grid_info *info)
+				    const struct pipe_grid_info *info,
+				    uint32_t indirect_grid[3])
 {
 	int i;
 	struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
@@ -631,15 +632,11 @@ static void evergreen_emit_dispatch(struct r600_context *rctx,
 		lds_size | (num_waves << 14));
 
 	if (info->indirect) {
-		struct r600_resource *indirect_resource = (struct r600_resource *)info->indirect;
-		unsigned *data = r600_buffer_map_sync_with_rings(&rctx->b, indirect_resource, PIPE_TRANSFER_READ);
-		if (data) {
-			radeon_emit(cs, PKT3C(PKT3_DISPATCH_DIRECT, 3, 0));
-			radeon_emit(cs, data[0]);
-			radeon_emit(cs, data[1]);
-			radeon_emit(cs, data[2]);
-			radeon_emit(cs, 1);
-		}
+		radeon_emit(cs, PKT3C(PKT3_DISPATCH_DIRECT, 3, 0));
+		radeon_emit(cs, indirect_grid[0]);
+		radeon_emit(cs, indirect_grid[1]);
+		radeon_emit(cs, indirect_grid[2]);
+		radeon_emit(cs, 1);
 	} else {
 		/* Dispatch packet */
 		radeon_emit(cs, PKT3C(PKT3_DISPATCH_DIRECT, 3, 0));
@@ -703,6 +700,7 @@ static void compute_emit_cs(struct r600_context *rctx,
 	struct r600_pipe_shader *current;
 	struct r600_shader_atomic combined_atomics[8];
 	uint8_t atomic_used_mask;
+	uint32_t indirect_grid[3] = { 0, 0, 0 };
 
 	/* make sure that the gfx ring is only one active */
 	if (radeon_emitted(rctx->b.dma.cs, 0)) {
@@ -729,9 +727,17 @@ static void compute_emit_cs(struct r600_context *rctx,
 	bool need_buf_const = current->shader.uses_tex_buffers ||
 		current->shader.has_txq_cube_array_z_comp;
 
+	if (info->indirect) {
+		struct r600_resource *indirect_resource = (struct r600_resource *)info->indirect;
+		unsigned *data = r600_buffer_map_sync_with_rings(&rctx->b, indirect_resource, PIPE_TRANSFER_READ);
+		unsigned offset = info->indirect_offset / 4;
+		indirect_grid[0] = data[offset];
+		indirect_grid[1] = data[offset + 1];
+		indirect_grid[2] = data[offset + 2];
+	}
 	for (int i = 0; i < 3; i++) {
 		rctx->cs_block_grid_sizes[i] = info->block[i];
-		rctx->cs_block_grid_sizes[i + 4] = info->grid[i];
+		rctx->cs_block_grid_sizes[i + 4] = info->indirect ? indirect_grid[i] : info->grid[i];
 	}
 	rctx->cs_block_grid_sizes[3] = rctx->cs_block_grid_sizes[7] = 0;
 	rctx->driver_consts[PIPE_SHADER_COMPUTE].cs_block_grid_size_dirty = true;
@@ -802,7 +808,7 @@ static void compute_emit_cs(struct r600_context *rctx,
 	r600_emit_atom(rctx, &rctx->cs_shader_state.atom);
 
 	/* Emit dispatch state and dispatch packet */
-	evergreen_emit_dispatch(rctx, info);
+	evergreen_emit_dispatch(rctx, info, indirect_grid);
 
 	/* XXX evergreen_flush_emit() hardcodes the CP_COHER_SIZE to 0x
-- 
2.14.3
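The indexing step in this patch (a byte offset converted to a dword index before reading three grid dimensions) can be sketched on its own. This is an illustrative helper under stated assumptions, not the driver code: `read_indirect_grid` is a hypothetical name, the buffer is assumed to be already CPU-mapped, and the offset is assumed to be 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the three grid dimensions out of a mapped indirect buffer.
 * `data` is the CPU mapping, `byte_offset` the offset of the first
 * dimension in bytes (hypothetical helper for illustration only). */
static void read_indirect_grid(const uint32_t *data, size_t byte_offset,
			       uint32_t grid[3])
{
	assert(data != NULL);
	assert(byte_offset % 4 == 0);

	size_t idx = byte_offset / 4; /* bytes -> dword index */

	for (int i = 0; i < 3; i++)
		grid[i] = data[idx + i];
}
```

Unlike the patch, the sketch asserts on a failed mapping; how the driver should react to a NULL mapping is a separate design question that the patch itself does not address.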
[Mesa-dev] [PATCH 7/9] r600: fix xfb stream check.
From: Dave Airlie

This fixes:
KHR-GL45.enhanced_layouts.xfb_vertex_streams

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/r600_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 32f24c071d..893a71b915 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -2213,7 +2213,7 @@ static int emit_streamout(struct r600_shader_ctx *ctx, struct pipe_stream_output
 	for (i = 0; i < so->num_outputs; i++) {
 		struct r600_bytecode_output output;
 
-		if (stream != -1 && stream != so->output[i].output_buffer)
+		if (stream != -1 && stream != so->output[i].stream)
 			continue;
 
 		memset(&output, 0, sizeof(struct r600_bytecode_output));
-- 
2.14.3
[Mesa-dev] [PATCH 6/9] r600/compute: only mark buffer/image state dirty for fragment shaders
From: Dave Airlie

The compute emission path always emits this currently, and emitting it
on the fragment path breaks the blitter.

This fixes gpu hangs in KHR-GL45.compute_shader.resource-texture

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/evergreen_state.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 0999cc5de8..11e473d604 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -4062,7 +4062,8 @@ static void evergreen_set_shader_buffers(struct pipe_context *ctx,
 		r600_mark_atom_dirty(rctx, &rctx->cb_misc_state.atom);
 	}
 
-	r600_mark_atom_dirty(rctx, >atom);
+	if (shader == PIPE_SHADER_FRAGMENT)
+		r600_mark_atom_dirty(rctx, >atom);
 }
 
 static void evergreen_set_shader_images(struct pipe_context *ctx,
@@ -4238,7 +4239,8 @@ static void evergreen_set_shader_images(struct pipe_context *ctx,
 		r600_mark_atom_dirty(rctx, &rctx->cb_misc_state.atom);
 	}
 
-	r600_mark_atom_dirty(rctx, >atom);
+	if (shader == PIPE_SHADER_FRAGMENT)
+		r600_mark_atom_dirty(rctx, >atom);
 }
 
 static void evergreen_get_pipe_constant_buffer(struct r600_context *rctx,
-- 
2.14.3
Re: [Mesa-dev] [PATCH v2 1/3] st/glsl_to_tgsi: move nir detection earlier - bisected
On 02.02.2018 10:24, Timothy Arceri wrote:
> On 02/02/18 19:26, Dieter Nützel wrote:
>> Hello Tim,
>>
>> _this_ version breaks UH, UV, mpv, blender 2.79 (some test files, not all).
>> Must be something with the cache file(s).
>
> The cache currently needs to be deleted when switching between nir and
> tgsi. I'm not sure if I should try to avoid this or not ... I guess it
> will probably save some bug reports so I'll try to send a follow up patch.

Hi Tim,

it is NOT your fault.
I tracked it down to Marek's commit be973ed21f6e456ebd753f26a99151d9ea6e765c:

/opt/mesa> git bisect bad
be973ed21f6e456ebd753f26a99151d9ea6e765c is the first bad commit
commit be973ed21f6e456ebd753f26a99151d9ea6e765c
Author: Marek Olšák
Date:   Tue Jan 30 18:34:25 2018 +0100

    radeonsi: load the right number of components for VS inputs and TBOs

    The supported counts are 1, 2, 4. (3=4)

    The following snippet loads float, vec2, vec3, and vec4:

    Before:
        buffer_load_format_x v9, v4, s[0:3], 0 idxen ; E0002000 8904
        buffer_load_format_xyzw v[0:3], v5, s[8:11], 0 idxen ; E00C2000 80020005
        s_waitcnt vmcnt(0); BF8C0F70
        buffer_load_format_xyzw v[2:5], v6, s[12:15], 0 idxen ; E00C2000 80030206
        s_waitcnt vmcnt(0); BF8C0F70
        buffer_load_format_xyzw v[5:8], v7, s[4:7], 0 idxen ; E00C2000 80010507

    After:
        buffer_load_format_x v10, v4, s[0:3], 0 idxen ; E0002000 8A04
        buffer_load_format_xy v[8:9], v5, s[8:11], 0 idxen; E0042000 80020805
        buffer_load_format_xyzw v[0:3], v6, s[12:15], 0 idxen ; E00C2000 80030006
        s_waitcnt vmcnt(0); BF8C0F70
        buffer_load_format_xyzw v[3:6], v7, s[4:7], 0 idxen ; E00C2000 80010307

    Reviewed-by: Samuel Pitoiset

:040000 040000 262b88d9e9f462b32595d6f15eddc0c6be4b997d cf45e1bd87a8a0e12553f6476d51a750e114ea10 M	src

But I can't revert it cleanly for the time being.

Another week, another night,...

Cheers,
Dieter
Re: [Mesa-dev] [PATCH] r600: fix VERTEX_ATTRIB_STRIDE to be 2048
On 05.02.2018 at 03:04, Roland Scheidegger wrote:
> On 04.02.2018 at 20:13, Dave Airlie wrote:
>> On 2 February 2018 at 18:02, Roland Scheidegger wrote:
>>> Are you sure of that? You only get 11 stride bits to program, and they
>>> are in bytes. Therefore I can't see how you could program 2048 (unless
>>> the hw would interpret 0 as 2048 but I think stride 0 is valid there?).
>>>
>>
>> Hmm so GL 4.4 defines the minimum for this to be 2048.
>>
>> Could be a bit of a blocker :-)
>>
>> Dave.
>>
>
> I'm wondering what the blob did?
> I suppose you could always just lie - if the tests only care about the
> query and never actually test it you could get away with it...
> I mean, before GL 4.4, it was not even queryable and was supposed to
> always work no matter the stride, but in reality it would just silently
> fail...
> I was merely pointing out the limit there was probably intentional (so,
> if lying is the answer, I'd suggest a comment at least). (And I didn't
> verify what the hw really does with stride 0.)
> I guess alternatively you could fix it up in the vertex fetch shader
> (so, program stride 1024 in this case and lshift the id-to-fetch by 2 -
> albeit I do not actually understand how the VTX_FETCH_VERTEX_DATA and
> VTX_FETCH_INSTANCE_DATA work, but nevertheless I suppose it should be
> fixable to patch it up somehow).
>

Forgot to mention, I believe d3d11 also has a limit of 2048 - there is
no direct limit on strides, but there is
D3D11_REQ_MULTI_ELEMENT_STRUCTURE_SIZE_IN_BYTES (2048 bytes) which
probably implies strides of 2048 should be working (and WHCK tests tend
to test such things). So not sure how it's done...

FWIW mesa's version calculation considers near zero numeric limits, only
extensions. I've recently added the min ubo count of 14 (starting with
gl 4.3) but that's about it. It will still happily announce higher
versions if you support for instance just 1 draw buffer (the extension
doesn't require more, gl starting from 3.0 or so would require 8, gles
for some versions 4), or as you noticed smaller max vertex strides.

Roland
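The arithmetic behind the objection can be checked with a short sketch. It assumes, as stated in the thread, an 11-bit stride field measured in bytes; `STRIDE_BITS` and the helper names are hypothetical, and the second half only verifies the address arithmetic of a "smaller stride, scaled index" workaround, not how the fetch shader would actually implement it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STRIDE_BITS 11u /* assumption taken from the discussion */

/* True if the stride (in bytes) is representable in the stride field. */
static bool stride_fits(uint32_t stride_bytes)
{
	return stride_bytes < (1u << STRIDE_BITS);
}

/* Byte offset of vertex `index` when the real stride is programmed. */
static uint64_t offset_direct(uint32_t index, uint32_t stride)
{
	return (uint64_t)index * stride;
}

/* Same offset obtained by programming half the stride and doubling the
 * index (a multiply by 2). Only valid for even strides. */
static uint64_t offset_halved(uint32_t index, uint32_t stride)
{
	assert(stride % 2 == 0);
	return ((uint64_t)index * 2) * (stride / 2);
}
```

An 11-bit field holds 0 to 2047, so 2048 itself does not fit, while 1024 does; for any index, `offset_halved(index, 2048)` equals `offset_direct(index, 2048)`.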
Re: [Mesa-dev] [PATCH] r600: fix VERTEX_ATTRIB_STRIDE to be 2048
On 04.02.2018 at 20:13, Dave Airlie wrote:
> On 2 February 2018 at 18:02, Roland Scheidegger wrote:
>> Are you sure of that? You only get 11 stride bits to program, and they
>> are in bytes. Therefore I can't see how you could program 2048 (unless
>> the hw would interpret 0 as 2048 but I think stride 0 is valid there?).
>>
>
> Hmm so GL 4.4 defines the minimum for this to be 2048.
>
> Could be a bit of a blocker :-)
>
> Dave.
>

I'm wondering what the blob did?
I suppose you could always just lie - if the tests only care about the
query and never actually test it you could get away with it...
I mean, before GL 4.4, it was not even queryable and was supposed to
always work no matter the stride, but in reality it would just silently
fail...
I was merely pointing out the limit there was probably intentional (so,
if lying is the answer, I'd suggest a comment at least). (And I didn't
verify what the hw really does with stride 0.)
I guess alternatively you could fix it up in the vertex fetch shader
(so, program stride 1024 in this case and lshift the id-to-fetch by 2 -
albeit I do not actually understand how the VTX_FETCH_VERTEX_DATA and
VTX_FETCH_INSTANCE_DATA work, but nevertheless I suppose it should be
fixable to patch it up somehow).

Roland
[Mesa-dev] [PATCH 4/4] r600: partly fix sampleMaskIn value
From: Roland Scheidegger

The hw gives us coverage for pixel, not for individual fragment shader
invocations, in case execution isn't per pixel (note eg, unlike cm, actually
cannot do "real" minSampleShading, it's either per-pixel or per-fragment,
but it doesn't really make a difference here).
Also, with msaa disabled, the hw still gives us a mask corresponding to the
number of samples, where GL requires this to be 1.
Fix this up by masking the sampleMaskIn bits with the bit corresponding
to the sampleID, if we know this shader is always executed at per-sample
granularity. (In case of a per-sample frequency shader and msaa disabled,
the sampleID will always be 0, so this works just fine there.)
Fixing this for the minSampleShading case will require a shader key
(radeonsi uses the prolog part for this) (for eg, could get away with a
single bit, cm would need either more bits depending on sample/invocation
ratio, or read the bits from a uniform), unless we'd want to always use a
sample mask uniform (which is probably not a good idea, as it would make
the ordinary common msaa case slower for no good reason).

This fixes some parts of piglit arb_sample_shading-samplemask (needs fixed
test), in particular those which use a sampleID, while still failing others
as expected.
---
 src/gallium/drivers/r600/r600_shader.c | 54 ++
 1 file changed, 54 insertions(+)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 1009411c62..8779f166aa 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -1138,6 +1138,11 @@ static int allocate_system_value_inputs(struct r600_shader_ctx *ctx, int gpr_off
 
 	tgsi_parse_free(&parse);
 
+	if (ctx->info.reads_samplemask &&
+	    (ctx->info.uses_linear_sample || ctx->info.uses_linear_sample)) {
+		inputs[1].enabled = true;
+	}
+
 	if (ctx->bc->chip_class >= EVERGREEN) {
 		int num_baryc = 0;
 		/* assign gpr to each interpolator according to priority */
@@ -3503,8 +3508,57 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
 			r = eg_load_helper_invocation(&ctx);
 			if (r)
 				return r;
+		}
+
+		/*
+		 * XXX this relies on fixed_pt_position_gpr only being present when
+		 * this shader should be executed per sample. Should be the case for now...
+		 */
+		if (ctx.fixed_pt_position_gpr != -1 && ctx.info.reads_samplemask) {
+			/*
+			 * Fix up sample mask. The hw always gives us coverage mask for
+			 * the pixel. However, for per-sample shading, we need the
+			 * coverage for the shader invocation only.
+			 * Also, with disabled msaa, only the first bit should be set
+			 * (luckily the same fixup works for both problems).
+			 * For now, we can only do it if we know this shader is always
+			 * executed per sample (due to usage of bits in the shader
+			 * forcing per-sample execution).
+			 * If the fb is not multisampled, we'd do unnecessary work but
+			 * it should still be correct.
+			 * It will however do nothing for sample shading according
+			 * to MinSampleShading.
+			 */
+			struct r600_bytecode_alu alu;
+			int tmp = r600_get_temp(&ctx);
+			assert(ctx.face_gpr != -1);
+			memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+
+			alu.op = ALU_OP2_LSHL_INT;
+			alu.src[0].sel = V_SQ_ALU_SRC_LITERAL;
+			alu.src[0].value = 0x1;
+			alu.src[1].sel = ctx.fixed_pt_position_gpr;
+			alu.src[1].chan = 3;
+			alu.dst.sel = tmp;
+			alu.dst.chan = 0;
+			alu.dst.write = 1;
+			alu.last = 1;
+			if ((r = r600_bytecode_add_alu(ctx.bc, &alu)))
+				return r;
+			memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+			alu.op = ALU_OP2_AND_INT;
+			alu.src[0].sel = tmp;
+			alu.src[1].sel = ctx.face_gpr;
+			alu.src[1].chan = 2;
+			alu.dst.sel = ctx.face_gpr;
+			alu.dst.chan = 2;
+			alu.dst.write = 1;
+			alu.last = 1;
+			if ((r = r600_bytecode_add_alu(ctx.bc, &alu)))
+				return r;
 		}
+
 		if (ctx.fragcoord_input >= 0) {
 			if (ctx.bc->chip_class == CAYMAN) {
 				for (j = 0 ; j < 4; j++) {
-- 
2.12.3
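The two ALU instructions emitted by this patch (a left shift of 1 by the sample ID, then an AND with the coverage mask) amount to the following scalar computation. This is a minimal CPU-side sketch for illustration; `per_sample_mask` is a hypothetical name and nothing here is driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a per-pixel coverage mask to the bit that belongs to the
 * current per-sample invocation: mask & (1 << sample_id).
 * (Hypothetical helper mirroring the LSHL_INT + AND_INT pair.) */
static uint32_t per_sample_mask(uint32_t coverage_mask, unsigned sample_id)
{
	assert(sample_id < 32);
	return coverage_mask & (1u << sample_id);
}
```

With a fully covered 4x pixel (mask 0xf), the invocation for sample 2 sees 0x4. With msaa disabled the sample ID is always 0, so the result is at most 1, which matches the single-bit value the commit message says GL requires.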
[Mesa-dev] [PATCH 1/4] r600/cm: (trivial) code cleanup for emitting msaa state
From: Roland Scheidegger

No functional change (compile tested only).
---
 src/gallium/drivers/r600/cayman_msaa.c      | 14 ++
 src/gallium/drivers/r600/evergreen_state.c  | 10 ++
 src/gallium/drivers/r600/r600_pipe_common.h |  6 ++
 3 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/r600/cayman_msaa.c b/src/gallium/drivers/r600/cayman_msaa.c
index 6bc307a4bc..f97924ac22 100644
--- a/src/gallium/drivers/r600/cayman_msaa.c
+++ b/src/gallium/drivers/r600/cayman_msaa.c
@@ -141,7 +141,7 @@ void cayman_init_msaa(struct pipe_context *ctx)
 		cayman_get_sample_position(ctx, 16, i, rctx->sample_locations_16x[i]);
 }
 
-void cayman_emit_msaa_sample_locs(struct radeon_winsys_cs *cs, int nr_samples)
+static void cayman_emit_msaa_sample_locs(struct radeon_winsys_cs *cs, int nr_samples)
 {
 	switch (nr_samples) {
 	default:
@@ -202,9 +202,8 @@ void cayman_emit_msaa_sample_locs(struct radeon_winsys_cs *cs, int nr_samples)
 	}
 }
 
-void cayman_emit_msaa_config(struct radeon_winsys_cs *cs, int nr_samples,
-			     int ps_iter_samples, int overrast_samples,
-			     unsigned sc_mode_cntl_1)
+void cayman_emit_msaa_state(struct radeon_winsys_cs *cs, int nr_samples,
+			    int ps_iter_samples, int overrast_samples)
 {
 	int setup_samples = nr_samples > 1 ? nr_samples :
 			    overrast_samples > 1 ? overrast_samples : 0;
@@ -216,6 +215,13 @@ void cayman_emit_msaa_config(struct radeon_winsys_cs *cs, int nr_samples,
 	 * endcaps.
 	 */
 	unsigned sc_line_cntl = S_028BDC_DX10_DIAMOND_TEST_ENA(1);
+	unsigned sc_mode_cntl_1 =
+		EG_S_028A4C_FORCE_EOV_CNTDWN_ENABLE(1) |
+		EG_S_028A4C_FORCE_EOV_REZ_ENABLE(1);
+
+	if (nr_samples > 1) {
+		cayman_emit_msaa_sample_locs(cs, nr_samples);
+	}
 
 	if (setup_samples > 1) {
 		/* indexed by log2(nr_samples) */
diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 385d017840..9620fa9e7a 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -1948,14 +1948,8 @@ static void evergreen_emit_framebuffer_state(struct r600_context *rctx, struct r
 	if (rctx->b.chip_class == EVERGREEN) {
 		evergreen_emit_msaa_state(rctx, rctx->framebuffer.nr_samples, rctx->ps_iter_samples);
 	} else {
-		unsigned sc_mode_cntl_1 =
-			EG_S_028A4C_FORCE_EOV_CNTDWN_ENABLE(1) |
-			EG_S_028A4C_FORCE_EOV_REZ_ENABLE(1);
-
-		if (rctx->framebuffer.nr_samples > 1)
-			cayman_emit_msaa_sample_locs(cs, rctx->framebuffer.nr_samples);
-		cayman_emit_msaa_config(cs, rctx->framebuffer.nr_samples,
-					rctx->ps_iter_samples, 0, sc_mode_cntl_1);
+		cayman_emit_msaa_state(cs, rctx->framebuffer.nr_samples,
+				       rctx->ps_iter_samples, 0);
 	}
 }
 
diff --git a/src/gallium/drivers/r600/r600_pipe_common.h b/src/gallium/drivers/r600/r600_pipe_common.h
index 86a20f8639..ee8eb54920 100644
--- a/src/gallium/drivers/r600/r600_pipe_common.h
+++ b/src/gallium/drivers/r600/r600_pipe_common.h
@@ -799,10 +799,8 @@ extern const unsigned eg_max_dist_4x;
 void cayman_get_sample_position(struct pipe_context *ctx, unsigned sample_count,
 				unsigned sample_index, float *out_value);
 void cayman_init_msaa(struct pipe_context *ctx);
-void cayman_emit_msaa_sample_locs(struct radeon_winsys_cs *cs, int nr_samples);
-void cayman_emit_msaa_config(struct radeon_winsys_cs *cs, int nr_samples,
-			     int ps_iter_samples, int overrast_samples,
-			     unsigned sc_mode_cntl_1);
+void cayman_emit_msaa_state(struct radeon_winsys_cs *cs, int nr_samples,
+			    int ps_iter_samples, int overrast_samples);
 
 
 /* Inline helpers. */
-- 
2.12.3
[Mesa-dev] [PATCH 3/4] r600: clean up fragment shader input scan code
From: Roland Scheidegger

For some reason, we were iterating through the code twice (first just for
instructions needing barycentrics, then for instructions and input dcls).
Move things around slightly so this is no longer necessary.
There was also an unneeded enabling of the fixed_pt_position_gpr - this is
only needed if the per-sample interpolation comes from an input, not from
an instruction (just move the assert where it belongs) (since the sample id
to sample from comes from a tgsi src in this case, and isn't sampleID).
Otherwise there should be no functional change.
---
 src/gallium/drivers/r600/r600_shader.c | 75 +++---
 1 file changed, 23 insertions(+), 52 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 13aa681049..1009411c62 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -,7 +,6 @@ static int allocate_system_value_inputs(struct r600_shader_ctx *ctx, int gpr_off
 				if (inst->Instruction.Opcode == TGSI_OPCODE_INTERP_SAMPLE) {
 					location = TGSI_INTERPOLATE_LOC_CENTER;
-					inputs[1].enabled = true; /* needs SAMPLEID */
 				} else if (inst->Instruction.Opcode == TGSI_OPCODE_INTERP_OFFSET) {
 					location = TGSI_INTERPOLATE_LOC_CENTER;
 					/* Needs sample positions, currently those are always available */
@@ -1139,6 +1138,19 @@ static int allocate_system_value_inputs(struct r600_shader_ctx *ctx, int gpr_off
 
 	tgsi_parse_free(&parse);
 
+	if (ctx->bc->chip_class >= EVERGREEN) {
+		int num_baryc = 0;
+		/* assign gpr to each interpolator according to priority */
+		for (i = 0; i < ARRAY_SIZE(ctx->eg_interpolators); i++) {
+			if (ctx->eg_interpolators[i].enabled) {
+				ctx->eg_interpolators[i].ij_index = num_baryc;
+				num_baryc++;
+			}
+		}
+		num_baryc = (num_baryc + 1) >> 1;
+		gpr_offset += num_baryc;
+	}
+
 	for (i = 0; i < ARRAY_SIZE(inputs); i++) {
 		boolean enabled = inputs[i].enabled;
 		int *reg = inputs[i].reg;
@@ -1165,18 +1177,21 @@ static int allocate_system_value_inputs(struct r600_shader_ctx *ctx, int gpr_off
  * for evergreen we need to scan the shader to find the number of GPRs we need to
  * reserve for interpolation and system values
  *
- * we need to know if we are going to emit
- * any sample or centroid inputs
+ * we need to know if we are going to emit any sample or centroid inputs
  * if perspective and linear are required
  */
 static int evergreen_gpr_count(struct r600_shader_ctx *ctx)
 {
 	unsigned i;
-	int num_baryc;
-	struct tgsi_parse_context parse;
 
 	memset(&ctx->eg_interpolators, 0, sizeof(ctx->eg_interpolators));
 
+	/*
+	 * Could get this information from the shader info. But right now
+	 * we interpolate all declared inputs, whereas the shader info will
+	 * only contain the bits if the inputs are actually used, so it might
+	 * not be safe...
+	 */
 	for (i = 0; i < ctx->info.num_inputs; i++) {
 		int k;
 		/* skip position/face/mask/sampleid */
@@ -1193,53 +1208,9 @@ static int evergreen_gpr_count(struct r600_shader_ctx *ctx)
 			ctx->eg_interpolators[k].enabled = TRUE;
 	}
 
-	if (tgsi_parse_init(&parse, ctx->tokens) != TGSI_PARSE_OK) {
-		return 0;
-	}
-
-	/* need to scan shader for system values and interpolateAtSample/Offset/Centroid */
-	while (!tgsi_parse_end_of_tokens(&parse)) {
-		tgsi_parse_token(&parse);
-
-		if (parse.FullToken.Token.Type == TGSI_TOKEN_TYPE_INSTRUCTION) {
-			const struct tgsi_full_instruction *inst = &parse.FullToken.FullInstruction;
-			if (inst->Instruction.Opcode == TGSI_OPCODE_INTERP_SAMPLE ||
-				inst->Instruction.Opcode == TGSI_OPCODE_INTERP_OFFSET ||
-				inst->Instruction.Opcode == TGSI_OPCODE_INTERP_CENTROID)
-			{
-				int interpolate, location, k;
-
-				if (inst->Instruction.Opcode == TGSI_OPCODE_INTERP_SAMPLE) {
-					location = TGSI_INTERPOLATE_LOC_CENTER;
-				} else if (inst->Instruction.Opcode == TGSI_OPCODE_INTERP_OFFSET) {
-					location = TGSI_INTERPOLATE_LOC_CENTER;
-				} else {
-					location = TGSI_INTERPOLATE_LOC_CENTROID;
-				}
-
-
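The interpolator numbering that this patch moves into allocate_system_value_inputs() can be sketched independently of the driver. The struct and function names below are hypothetical; the round-up division by two is copied from the patch and suggests that two (i, j) pairs share one four-channel GPR, which is an inference rather than something the patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for one entry of the interpolator table. */
struct interp {
	bool enabled;
	int ij_index;
};

/* Number every enabled interpolator in table order and return how many
 * GPRs must be reserved for them: (count + 1) >> 1, as in the patch. */
static int assign_interpolators(struct interp *interp, size_t count)
{
	int num_baryc = 0;

	for (size_t i = 0; i < count; i++) {
		if (interp[i].enabled) {
			interp[i].ij_index = num_baryc;
			num_baryc++;
		}
	}
	return (num_baryc + 1) >> 1;
}
```

Three enabled interpolators therefore get indices 0, 1 and 2 and reserve two GPRs; disabled entries keep whatever index they had.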
[Mesa-dev] [PATCH 2/4] mesa: (trivial) remove unused ignore_sample_qualifier_parameter
From: Roland Scheidegger

This parameter for _mesa_get_min_invocations_per_fragment() was once used by
the intel driver, but it's long gone.
---
 src/mesa/program/program.c            | 11 ---
 src/mesa/program/program.h            |  3 +--
 src/mesa/state_tracker/st_atom_msaa.c |  2 +-
 3 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c
index 220efc3539..6aba3cb3f1 100644
--- a/src/mesa/program/program.c
+++ b/src/mesa/program/program.c
@@ -515,8 +515,7 @@ _mesa_find_free_register(const GLboolean used[],
  */
 GLint
 _mesa_get_min_invocations_per_fragment(struct gl_context *ctx,
-                                       const struct gl_program *prog,
-                                       bool ignore_sample_qualifier)
+                                       const struct gl_program *prog)
 {
    /* From ARB_sample_shading specification:
     * "Using gl_SampleID in a fragment shader causes the entire shader
@@ -534,11 +533,9 @@ _mesa_get_min_invocations_per_fragment(struct gl_context *ctx,
        *  "Use of the "sample" qualifier on a fragment shader input
        *   forces per-sample shading"
        */
-      if (prog->info.fs.uses_sample_qualifier && !ignore_sample_qualifier)
-         return MAX2(_mesa_geometric_samples(ctx->DrawBuffer), 1);
-
-      if (prog->info.system_values_read & (SYSTEM_BIT_SAMPLE_ID |
-                                           SYSTEM_BIT_SAMPLE_POS))
+      if (prog->info.fs.uses_sample_qualifier ||
+          (prog->info.system_values_read & (SYSTEM_BIT_SAMPLE_ID |
+                                            SYSTEM_BIT_SAMPLE_POS)))
          return MAX2(_mesa_geometric_samples(ctx->DrawBuffer), 1);
       else if (ctx->Multisample.SampleShading)
          return MAX2(ceil(ctx->Multisample.MinSampleShadingValue *
diff --git a/src/mesa/program/program.h b/src/mesa/program/program.h
index 376da7b2d4..659385f55b 100644
--- a/src/mesa/program/program.h
+++ b/src/mesa/program/program.h
@@ -108,8 +108,7 @@ _mesa_find_free_register(const GLboolean used[],
 
 extern GLint
 _mesa_get_min_invocations_per_fragment(struct gl_context *ctx,
-                                       const struct gl_program *prog,
-                                       bool ignore_sample_qualifier);
+                                       const struct gl_program *prog);
 
 static inline GLuint
 _mesa_program_enum_to_shader_stage(GLenum v)
diff --git a/src/mesa/state_tracker/st_atom_msaa.c b/src/mesa/state_tracker/st_atom_msaa.c
index 589e328ac5..556c7c5889 100644
--- a/src/mesa/state_tracker/st_atom_msaa.c
+++ b/src/mesa/state_tracker/st_atom_msaa.c
@@ -77,5 +77,5 @@ st_update_sample_shading(struct st_context *st)
       return;
 
    cso_set_min_samples(st->cso_context,
-         _mesa_get_min_invocations_per_fragment(st->ctx, &st->fp->Base, false));
+         _mesa_get_min_invocations_per_fragment(st->ctx, &st->fp->Base));
 }
-- 
2.12.3
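The decision logic that remains after this cleanup can be summarised in a standalone sketch. It is a simplification under stated assumptions, not the Mesa function: `min_invocations_per_fragment` and its parameters are hypothetical, the two per-sample triggers are folded into one boolean, and the final "otherwise 1" branch is assumed because the diff context is cut off before it.

```c
#include <assert.h>
#include <stdbool.h>

/* forces_per_sample:  shader uses the "sample" qualifier, gl_SampleID or
 *                     gl_SamplePosition (assumed to be pre-computed).
 * sample_shading:     GL_SAMPLE_SHADING is enabled.
 * min_sample_shading: the MinSampleShading fraction in [0, 1].
 * samples:            sample count of the draw buffer. */
static int min_invocations_per_fragment(bool forces_per_sample,
					bool sample_shading,
					float min_sample_shading,
					int samples)
{
	int n;

	assert(samples >= 0);
	if (forces_per_sample) {
		n = samples;
	} else if (sample_shading) {
		float v = min_sample_shading * (float)samples;

		n = (int)v;          /* truncate ... */
		if ((float)n < v)    /* ... then round up, i.e. ceil(v) */
			n++;
	} else {
		n = 1;               /* assumption: default is per-pixel */
	}
	return n > 1 ? n : 1;        /* MAX2(n, 1) */
}
```

With the unused parameter gone, the "sample" qualifier and the system-value check lead to the same result, so merging them into a single condition does not change behaviour.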
[Mesa-dev] [PATCH] i965: remove unused brw_nir_lower_cs_shared()
This has been unused since 8761a04d0d93.
---
 src/intel/compiler/brw_nir.c | 8 --------
 src/intel/compiler/brw_nir.h | 1 -
 2 files changed, 9 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index dbddef0d04..405985f8b6 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -503,14 +503,6 @@ brw_nir_lower_fs_outputs(nir_shader *nir)
    nir_lower_io(nir, nir_var_shader_out, type_size_dvec4, 0);
 }
 
-void
-brw_nir_lower_cs_shared(nir_shader *nir)
-{
-   nir_assign_var_locations(&nir->shared, &nir->num_shared,
-                            type_size_scalar_bytes);
-   nir_lower_io(nir, nir_var_shared, type_size_scalar_bytes, 0);
-}
-
 #define OPT(pass, ...) ({                                  \
    bool this_progress = false;                             \
    NIR_PASS(this_progress, nir, pass, ##__VA_ARGS__);      \
diff --git a/src/intel/compiler/brw_nir.h b/src/intel/compiler/brw_nir.h
index 3bef99417e..03f52da08e 100644
--- a/src/intel/compiler/brw_nir.h
+++ b/src/intel/compiler/brw_nir.h
@@ -113,7 +113,6 @@ void brw_nir_lower_vue_outputs(nir_shader *nir, bool is_scalar);
 void brw_nir_lower_tcs_outputs(nir_shader *nir, const struct brw_vue_map *vue,
                                GLenum tes_primitive_mode);
 void brw_nir_lower_fs_outputs(nir_shader *nir);
-void brw_nir_lower_cs_shared(nir_shader *nir);
 
 nir_shader *brw_postprocess_nir(nir_shader *nir,
                                 const struct brw_compiler *compiler,
-- 
2.14.3
[Mesa-dev] [PATCH 3/3] android: vulkan/util: add dependency on libnativewindow for O and later
Similar to 90dd6e5 ("Android: egl: add dependency on libnativewindow")

Fixes the following building error:

In file included from out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_vulkan_util_intermediates/util/vk_enum_to_str.c:26:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found
#include <system/window.h>
         ^
1 error generated.

Cc: "18.0"
---
 src/vulkan/Android.mk | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/vulkan/Android.mk b/src/vulkan/Android.mk
index 6d6df938ab..70b23eae08 100644
--- a/src/vulkan/Android.mk
+++ b/src/vulkan/Android.mk
@@ -59,5 +59,9 @@ $(LOCAL_GENERATED_SOURCES): $(MESA_TOP)/src/vulkan/util/gen_enum_to_str.py \
 LOCAL_EXPORT_C_INCLUDE_DIRS := \
 	$(intermediates)
 
+ifeq ($(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7),)
+LOCAL_SHARED_LIBRARIES += libnativewindow
+endif
+
 include $(MESA_COMMON_MK)
 include $(BUILD_STATIC_LIBRARY)
-- 
2.14.1
[Mesa-dev] [PATCH 1/3] android: anv/extensions: fix generated sources build
Building rules are aligned to automake ones The correct script to build anv_extensions.{c,h} is anv_extensions_gen.py Generation rules for anv_extensions.c requires --out-c option Generation rules for anv_extensions.h were missing Necessary include paths are added to avoid following build errors: cp: cannot stat '.../gen/STATIC_LIBRARIES/libmesa_vulkan_common_intermediates/vulkan/anv_extensions.c': No such file or directory failed to build some targets (01:24 (mm:ss)) In file included from external/mesa/src/intel/vulkan/anv_gem.c:32: external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found #include "anv_extensions.h" ^~ 1 error generated. In file included from external/mesa/src/intel/vulkan/anv_batch_chain.c:30: external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found #include "anv_extensions.h" ^~ 1 error generated. Fixes: ca6237e ("android: anv_extensions.c is generated to libmesa_vulkan_common") Cc: "18.0"--- src/intel/Android.vulkan.mk | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/src/intel/Android.vulkan.mk b/src/intel/Android.vulkan.mk index 32b4892e17..5c8c947136 100644 --- a/src/intel/Android.vulkan.mk +++ b/src/intel/Android.vulkan.mk @@ -25,7 +25,7 @@ include $(LOCAL_PATH)/Makefile.sources VK_ENTRYPOINTS_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/vulkan/anv_entrypoints_gen.py -VK_EXTENSIONS_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/vulkan/anv_extensions.py +VK_EXTENSIONS_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/vulkan/anv_extensions_gen.py VULKAN_COMMON_INCLUDES := \ $(MESA_TOP)/include \ @@ -82,6 +82,7 @@ ANV_INCLUDES := \ $(VULKAN_COMMON_INCLUDES) \ $(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_anv_entrypoints,,)/vulkan \ $(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_nir,,)/nir \ + $(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_vulkan_common,,)/vulkan \ $(call 
generated-sources-dir-for,STATIC_LIBRARIES,libmesa_vulkan_util,,)/util # @@ -212,6 +213,7 @@ LOCAL_WHOLE_STATIC_LIBRARIES := \ LOCAL_GENERATED_SOURCES += $(intermediates)/vulkan/anv_entrypoints.c LOCAL_GENERATED_SOURCES += $(intermediates)/vulkan/anv_extensions.c +LOCAL_GENERATED_SOURCES += $(intermediates)/vulkan/anv_extensions.h $(intermediates)/vulkan/anv_entrypoints.c: @mkdir -p $(dir $@) @@ -225,7 +227,14 @@ $(intermediates)/vulkan/anv_extensions.c: $(VK_EXTENSIONS_SCRIPT) \ --xml $(MESA_TOP)/src/vulkan/registry/vk.xml \ --xml $(MESA_TOP)/src/vulkan/registry/vk_android_native_buffer.xml \ - --out $@ + --out-c $@ + +$(intermediates)/vulkan/anv_extensions.h: + @mkdir -p $(dir $@) + $(VK_EXTENSIONS_SCRIPT) \ + --xml $(MESA_TOP)/src/vulkan/registry/vk.xml \ + --xml $(MESA_TOP)/src/vulkan/registry/vk_android_native_buffer.xml \ + --out-h $@ LOCAL_SHARED_LIBRARIES := libdrm @@ -252,7 +261,8 @@ LOCAL_SRC_FILES := \ LOCAL_C_INCLUDES := \ $(VULKAN_COMMON_INCLUDES) \ - $(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_anv_entrypoints,,)/vulkan + $(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_anv_entrypoints,,)/vulkan \ + $(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_vulkan_common,,)/vulkan LOCAL_EXPORT_C_INCLUDE_DIRS := $(MESA_TOP)/src/intel/vulkan -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] android: anv: add dependency on libnativewindow for O and later
Similar to 90dd6e5 ("Android: egl: add dependency on libnativewindow") Fixes the following building errors: In file included from external/mesa/src/intel/vulkan/gen7_cmd_buffer.c:30: In file included from external/mesa/src/intel/vulkan/anv_private.h:72: external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found ^ 1 error generated. ... In file included from external/mesa/src/intel/vulkan/anv_gem.c:32: In file included from external/mesa/src/intel/vulkan/anv_private.h:72: external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found ^ 1 error generated. Cc: "18.0"--- src/intel/Android.vulkan.mk | 28 1 file changed, 28 insertions(+) diff --git a/src/intel/Android.vulkan.mk b/src/intel/Android.vulkan.mk index 5c8c947136..3a6870097b 100644 --- a/src/intel/Android.vulkan.mk +++ b/src/intel/Android.vulkan.mk @@ -102,6 +102,10 @@ LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_anv_entrypoints libmesa_genxml LOCAL_SHARED_LIBRARIES := libdrm +ifeq ($(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7),) +LOCAL_SHARED_LIBRARIES += libnativewindow +endif + include $(MESA_COMMON_MK) include $(BUILD_STATIC_LIBRARY) @@ -122,6 +126,10 @@ LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_anv_entrypoints libmesa_genxml LOCAL_SHARED_LIBRARIES := libdrm +ifeq ($(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7),) +LOCAL_SHARED_LIBRARIES += libnativewindow +endif + include $(MESA_COMMON_MK) include $(BUILD_STATIC_LIBRARY) @@ -142,6 +150,10 @@ LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_anv_entrypoints libmesa_genxml LOCAL_SHARED_LIBRARIES := libdrm +ifeq ($(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7),) +LOCAL_SHARED_LIBRARIES += libnativewindow +endif + include $(MESA_COMMON_MK) include $(BUILD_STATIC_LIBRARY) @@ -162,6 +174,10 @@ LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_anv_entrypoints libmesa_genxml LOCAL_SHARED_LIBRARIES := libdrm +ifeq ($(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7),) +LOCAL_SHARED_LIBRARIES += 
libnativewindow +endif + include $(MESA_COMMON_MK) include $(BUILD_STATIC_LIBRARY) @@ -182,6 +198,10 @@ LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_anv_entrypoints libmesa_genxml LOCAL_SHARED_LIBRARIES := libdrm +ifeq ($(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7),) +LOCAL_SHARED_LIBRARIES += libnativewindow +endif + include $(MESA_COMMON_MK) include $(BUILD_STATIC_LIBRARY) @@ -238,6 +258,10 @@ $(intermediates)/vulkan/anv_extensions.h: LOCAL_SHARED_LIBRARIES := libdrm +ifeq ($(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7),) +LOCAL_SHARED_LIBRARIES += libnativewindow +endif + include $(MESA_COMMON_MK) include $(BUILD_STATIC_LIBRARY) @@ -285,5 +309,9 @@ LOCAL_WHOLE_STATIC_LIBRARIES := \ LOCAL_SHARED_LIBRARIES := libdrm libz libsync liblog +ifeq ($(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7),) +LOCAL_SHARED_LIBRARIES += libnativewindow +endif + include $(MESA_COMMON_MK) include $(BUILD_SHARED_LIBRARY) -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] android: fixes for anv build
Hi,

I'm sending a series of fixes for the anv build:

[PATCH 1/3] android: anv/extensions: fix generated sources build
[PATCH 2/3] android: anv: add dependency on libnativewindow for O and
[PATCH 3/3] android: vulkan/util: add dependency on libnativewindow

Since anv now implements the Vulkan HAL, the build was tested by installing
the anv vulkan.$(TARGET_BOARD_PLATFORM) module with oreo-x86.

The patches are also proposed as candidates for the 18.0 mesa-stable branch,
for the mesa 18.0.0 release.

Kind regards
Mauro
[Mesa-dev] [PATCH 1/2] mesa: Fix VAO buffer object tracking.
From: Mathias Fröhlich

When changing the attribute binding in the VAO, we also need to clear the
attribute's bit in VertexAttribBufferMask if the new binding has no buffer
object (non-VBO).

Signed-off-by: Mathias Fröhlich
---
 src/mesa/main/varray.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
index 81b8fbe8ca..2fd9de630f 100644
--- a/src/mesa/main/varray.c
+++ b/src/mesa/main/varray.c
@@ -142,6 +142,8 @@ vertex_attrib_binding(struct gl_context *ctx,
 
       if (_mesa_is_bufferobj(vao->BufferBinding[bindingIndex].BufferObj))
          vao->VertexAttribBufferMask |= array_bit;
+      else
+         vao->VertexAttribBufferMask &= ~array_bit;
 
       FLUSH_VERTICES(ctx, _NEW_ARRAY);
 
-- 
2.14.3
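To illustrate the bookkeeping this patch fixes, here is a minimal sketch in plain C. The names toy_vao and toy_attrib_binding are hypothetical and are not Mesa's; only the set-or-clear pattern on the mask mirrors the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the VAO state touched by the patch. */
struct toy_vao {
   uint32_t buffer_mask;   /* bit i set => attribute i sources from a buffer object */
};

/* Rebind attribute `attrib`; `is_vbo` says whether the new binding
 * has a buffer object attached. */
static void
toy_attrib_binding(struct toy_vao *vao, unsigned attrib, bool is_vbo)
{
   assert(attrib < 32);
   const uint32_t array_bit = UINT32_C(1) << attrib;

   if (is_vbo)
      vao->buffer_mask |= array_bit;
   else
      vao->buffer_mask &= ~array_bit;   /* the branch the patch adds */
}
```

Without the else branch, a bit set by an earlier VBO binding would stay set after the attribute is rebound to a binding without a buffer object, leaving the mask stale.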
[Mesa-dev] [PATCH 0/2] Fix and tweak to the VAO
From: Mathias Fröhlich

Hi,

These two patches fix a bug in tracking the VAO vertex buffer object state
and reduce the number of possibly unneeded updates of the dependent
gl_vertex_array array. Both changes also prepare for more internal use of
vertex array objects.

Please review!

best
Mathias

Mathias Fröhlich (2):
  mesa: Fix VAO buffer object tracking.
  mesa: Only update enabled VAO gl_vertex_array entries.

 src/mesa/main/varray.c | 10 ++++++----
 src/mesa/main/varray.h | 26 +++++++++++++++-----------
 2 files changed, 21 insertions(+), 15 deletions(-)

-- 
2.14.3
[Mesa-dev] [PATCH 2/2] mesa: Only update enabled VAO gl_vertex_array entries.
From: Mathias Fröhlich

Instead of updating all modified gl_vertex_array_object::_VertexArray
entries, update only those that are modified and enabled. Also release the
buffer objects from _VertexArray entries that belong to disabled attributes.

Signed-off-by: Mathias Fröhlich
---
 src/mesa/main/varray.c |  8 ++++----
 src/mesa/main/varray.h | 26 +++++++++++++++-----------
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
index 2fd9de630f..a2d1d74798 100644
--- a/src/mesa/main/varray.c
+++ b/src/mesa/main/varray.c
@@ -152,7 +152,7 @@ vertex_attrib_binding(struct gl_context *ctx,
 
       array->BufferBindingIndex = bindingIndex;
 
-      vao->NewArrays |= array_bit;
+      vao->NewArrays |= vao->_Enabled & array_bit;
    }
 }
 
@@ -187,7 +187,7 @@ _mesa_bind_vertex_buffer(struct gl_context *ctx,
       else
          vao->VertexAttribBufferMask |= binding->_BoundArrays;
 
-      vao->NewArrays |= binding->_BoundArrays;
+      vao->NewArrays |= vao->_Enabled & binding->_BoundArrays;
    }
 }
 
@@ -208,7 +208,7 @@ vertex_binding_divisor(struct gl_context *ctx,
    if (binding->InstanceDivisor != divisor) {
       FLUSH_VERTICES(ctx, _NEW_ARRAY);
       binding->InstanceDivisor = divisor;
-      vao->NewArrays |= binding->_BoundArrays;
+      vao->NewArrays |= vao->_Enabled & binding->_BoundArrays;
    }
 }
 
@@ -318,7 +318,7 @@ _mesa_update_array_format(struct gl_context *ctx,
    array->RelativeOffset = relativeOffset;
    array->_ElementSize = elementSize;
 
-   vao->NewArrays |= VERT_BIT(attrib);
+   vao->NewArrays |= vao->_Enabled & VERT_BIT(attrib);
    ctx->NewState |= _NEW_ARRAY;
 }
 
diff --git a/src/mesa/main/varray.h b/src/mesa/main/varray.h
index fe7eb81631..6e2649830f 100644
--- a/src/mesa/main/varray.h
+++ b/src/mesa/main/varray.h
@@ -58,17 +58,21 @@ _mesa_update_vertex_array(struct gl_context *ctx,
                           const struct gl_array_attributes *attribs,
                           const struct gl_vertex_buffer_binding *binding)
 {
-   dst->Size = attribs->Size;
-   dst->Type = attribs->Type;
-   dst->Format = attribs->Format;
-   dst->StrideB = binding->Stride;
-   dst->Ptr = _mesa_vertex_attrib_address(attribs, binding);
-   dst->Normalized = attribs->Normalized;
-   dst->Integer = attribs->Integer;
-   dst->Doubles = attribs->Doubles;
-   dst->InstanceDivisor = binding->InstanceDivisor;
-   dst->_ElementSize = attribs->_ElementSize;
-   _mesa_reference_buffer_object(ctx, &dst->BufferObj, binding->BufferObj);
+   if (attribs->Enabled) {
+      dst->Size = attribs->Size;
+      dst->Type = attribs->Type;
+      dst->Format = attribs->Format;
+      dst->StrideB = binding->Stride;
+      dst->Ptr = _mesa_vertex_attrib_address(attribs, binding);
+      dst->Normalized = attribs->Normalized;
+      dst->Integer = attribs->Integer;
+      dst->Doubles = attribs->Doubles;
+      dst->InstanceDivisor = binding->InstanceDivisor;
+      dst->_ElementSize = attribs->_ElementSize;
+      _mesa_reference_buffer_object(ctx, &dst->BufferObj, binding->BufferObj);
+   } else {
+      _mesa_reference_buffer_object(ctx, &dst->BufferObj, NULL);
+   }
 }
 
 
-- 
2.14.3
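The recurring change in varray.c is to mask the set of changed arrays with the set of enabled ones before recording them as dirty. A small sketch of that pattern, using the hypothetical names toy_dirty_vao and toy_mark_dirty rather than Mesa's own types:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per vertex attribute in each mask. */
struct toy_dirty_vao {
   uint32_t enabled;      /* attributes currently enabled */
   uint32_t new_arrays;   /* enabled attributes whose derived state must be rebuilt */
};

/* Record `changed` attributes as dirty, ignoring disabled ones. */
static void
toy_mark_dirty(struct toy_dirty_vao *vao, uint32_t changed)
{
   assert(vao != 0);
   vao->new_arrays |= vao->enabled & changed;
}
```

With attributes 0 and 2 enabled, a change to attributes 1 and 2 only marks attribute 2 as dirty, so the disabled attribute 1 triggers no update.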
[Mesa-dev] [PATCH] winsys/amdgpu: allow non page-aligned size bo creation from pointer
Fix INVALID_OPERATION caused by BufferData with target
EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD when the buffer size is not page aligned.
---
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 5d565ff..ba48cad 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -1388,19 +1388,22 @@ static struct pb_buffer *amdgpu_bo_from_ptr(struct radeon_winsys *rws,
     struct amdgpu_winsys_bo *bo;
     uint64_t va;
     amdgpu_va_handle va_handle;
+    /* Avoid failure when the size is not page aligned */
+    uint64_t aligned_size = align64(size, ws->info.gart_page_size);
 
     bo = CALLOC_STRUCT(amdgpu_winsys_bo);
     if (!bo)
         return NULL;
 
-    if (amdgpu_create_bo_from_user_mem(ws->dev, pointer, size, &buf_handle))
+    if (amdgpu_create_bo_from_user_mem(ws->dev, pointer,
+                                       aligned_size, &buf_handle))
         goto error;
 
     if (amdgpu_va_range_alloc(ws->dev, amdgpu_gpu_va_range_general,
-                              size, 1 << 12, 0, &va, &va_handle, 0))
+                              aligned_size, 1 << 12, 0, &va, &va_handle, 0))
         goto error_va_alloc;
 
-    if (amdgpu_bo_va_op(buf_handle, 0, size, va, 0, AMDGPU_VA_OP_MAP))
+    if (amdgpu_bo_va_op(buf_handle, 0, aligned_size, va, 0, AMDGPU_VA_OP_MAP))
         goto error_va_map;
 
     /* Initialize it. */
@@ -1416,7 +1419,7 @@ static struct pb_buffer *amdgpu_bo_from_ptr(struct radeon_winsys *rws,
     bo->initial_domain = RADEON_DOMAIN_GTT;
     bo->unique_id = __sync_fetch_and_add(&ws->next_bo_unique_id, 1);
 
-    ws->allocated_gtt += align64(bo->base.size, ws->info.gart_page_size);
+    ws->allocated_gtt += aligned_size;
 
     amdgpu_add_buffer_to_global_list(bo);
 
-- 
2.7.4
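For readers unfamiliar with the rounding involved, the following is a sketch of what aligning a size up to a page boundary looks like. toy_align64 is a hypothetical helper written for this note; it assumes a power-of-two alignment and is not claimed to be Mesa's align64 implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`.
 * Assumption: `alignment` is a non-zero power of two (e.g. a 4096-byte page). */
static uint64_t
toy_align64(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```

With a 4096-byte page, a 5000-byte request rounds up to 8192 bytes, while a size that is already a multiple of the page size is returned unchanged.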
Re: [Mesa-dev] [PATCH] i965/nir: do int64 lowering before optimization
On Mon, Dec 11, 2017 at 11:01 AM, Jason Ekstrand wrote:
> On Mon, Dec 11, 2017 at 12:55 AM, Iago Toral wrote:
>>
>> This didn't get any reviews yet. Any takers?
>>
>> On Fri, 2017-12-01 at 13:46 +0100, Iago Toral Quiroga wrote:
>> > Otherwise loop unrolling will fail to see the actual cost of
>> > the unrolling operations when the loop body contains 64-bit integer
>> > instructions, and very specially when the divmod64 lowering applies,
>> > since its lowering is quite expensive.
>> >
>> > Without this change, some in-development CTS tests for int64
>> > get stuck forever trying to register allocate a shader with
>> > over 50K SSA values. The large number of SSA values is the result
>> > of NIR first unrolling multiple seemingly simple loops that involve
>> > int64 instructions, only to then lower these instructions to produce
>> > a massive pile of code (due to the divmod64 lowering in the unrolled
>> > instructions).
>> >
>> > With this change, loop unrolling will see the loops with the int64
>> > code already lowered and will realize that it is too expensive to
>> > unroll.
>
>
> Hrm... I'm not quite sure what I think of this. I put it after nir_optimize
> because I wanted opt_algebraic to be able to work its magic and hopefully
> remove a bunch of int64 ops before we lower them. In particular, we have
> optimizations to remove integer division and replace it with shifts.
> However, loop unrolling does need to happen before lower_indirect_derefs so
> that lower_indirect_derefs will do as little work as possible.
>
> This is a bit of a pickle... I don't really want to add a third
> brw_nir_optimize call. It probably wouldn't be the end of the world but it
> does add compile time.
>
> One crazy idea which I don't think I like would be to have a quick pass that
> walks the IR and sees if there are any 64-bit SSA values. If it does, we
> run brw_nir_optimize without loop unrolling then 64-bit lowering and then we
> go into the normal brw_nir_optimize.
>
> --Jason

Why don't we just add some sort of backend-specific code-size metric
to the loop unrolling, rather than just counting NIR instructions?
i.e. something like a num_assembly_instructions(nir_instr *) function
pointer in nir_shader_compiler_options. The root of the problem is
that different NIR instructions can turn into vastly different
numbers of assembly instructions, but we really care about the latter,
so the cutoff isn't doing its job of avoiding code-size blowup. As far
as I'm aware, this is what most other compilers (e.g. LLVM) do to
solve this problem.

>
>>
>> > ---
>> >  src/intel/compiler/brw_nir.c | 8 ++++----
>> >  1 file changed, 4 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/src/intel/compiler/brw_nir.c
>> > b/src/intel/compiler/brw_nir.c
>> > index 8f3f77f89a..ef12cdfff8 100644
>> > --- a/src/intel/compiler/brw_nir.c
>> > +++ b/src/intel/compiler/brw_nir.c
>> > @@ -636,6 +636,10 @@ brw_preprocess_nir(const struct brw_compiler
>> > *compiler, nir_shader *nir)
>> >
>> >     OPT(nir_split_var_copies);
>> >
>> > +   nir_lower_int64(nir, nir_lower_imul64 |
>> > +                        nir_lower_isign64 |
>> > +                        nir_lower_divmod64);
>> > +
>> >     nir = brw_nir_optimize(nir, compiler, is_scalar);
>> >
>> >     if (is_scalar) {
>> > @@ -663,10 +667,6 @@ brw_preprocess_nir(const struct brw_compiler
>> > *compiler, nir_shader *nir)
>> >        brw_nir_no_indirect_mask(compiler, nir->info.stage);
>> >     nir_lower_indirect_derefs(nir, indirect_mask);
>> >
>> > -   nir_lower_int64(nir, nir_lower_imul64 |
>> > -                        nir_lower_isign64 |
>> > -                        nir_lower_divmod64);
>> > -
>> >     /* Get rid of split copies */
>> >     nir = brw_nir_optimize(nir, compiler, is_scalar);
>> >
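As a rough illustration of the backend cost hook suggested in the reply above, here is a toy model in C. Everything in it is hypothetical: the opcodes, the toy_cost_fn callback, the cost of 200 for a lowered 64-bit division and the budget are invented for the example, and nothing here is NIR's actual API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes; a real IR has far more. */
enum toy_op { TOY_OP_IADD32, TOY_OP_IDIV64 };

/* Hypothetical backend hook: estimated assembly instructions for one op. */
typedef unsigned (*toy_cost_fn)(enum toy_op op);

/* Made-up numbers: a lowered 64-bit division is assumed to be very expensive. */
static unsigned
toy_backend_cost(enum toy_op op)
{
   return op == TOY_OP_IDIV64 ? 200 : 1;
}

/* Decide whether fully unrolling a loop stays within `budget`.
 * With cost == NULL every op counts as 1, i.e. plain instruction counting. */
static bool
toy_should_unroll(const enum toy_op *body, size_t num_ops,
                  unsigned trip_count, uint64_t budget, toy_cost_fn cost)
{
   uint64_t body_cost = 0;

   assert(body != NULL || num_ops == 0);
   for (size_t i = 0; i < num_ops; i++)
      body_cost += cost ? cost(body[i]) : 1;

   return body_cost * trip_count <= budget;
}
```

In this model a two-instruction loop body with eight iterations fits a budget of 64 when instructions are merely counted, but is rejected once the hook reports the assumed cost of the 64-bit division. That is the kind of code-size blowup the cutoff is supposed to prevent.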
Re: [Mesa-dev] [PATCH] r600: fix VERTEX_ATTRIB_STRIDE to be 2048
On 2 February 2018 at 18:02, Roland Scheidegger wrote:
> Are you sure of that? You only get 11 stride bits to program, and they
> are in bytes. Therefore I can't see how you could program 2048 (unless
> the hw would interpret 0 as 2048 but I think stride 0 is valid there?).
>

Hmm so GL 4.4 defines the minimum for this to be 2048. Could be a bit
of a blocker :-)

Dave.
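The arithmetic behind the objection can be spelled out in a few lines of C. The 11-bit field width comes from the message above; the names TOY_STRIDE_BITS and toy_stride_fits are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Width of the stride field as stated in the thread; the name is hypothetical. */
#define TOY_STRIDE_BITS 11

/* True if a byte stride can be encoded directly in an 11-bit field. */
static bool
toy_stride_fits(uint32_t stride_bytes)
{
   const uint32_t max_stride = (UINT32_C(1) << TOY_STRIDE_BITS) - 1;   /* 2047 */

   assert(TOY_STRIDE_BITS < 32);
   return stride_bytes <= max_stride;
}
```

An 11-bit field tops out at 2047, so a stride of 2048 bytes is exactly one past what it can hold, unless the hardware gives the value 0 a special meaning.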
[Mesa-dev] [PATCH 3/3] nv50, nvc0: mark ABGR format as displayable instead of ARGB format
This matches the hardware's capabilities.

Signed-off-by: Ilia Mirkin
---
 src/gallium/drivers/nouveau/nv50/nv50_formats.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_formats.c b/src/gallium/drivers/nouveau/nv50/nv50_formats.c
index 706c34f0dbb..fc5deac2d58 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_formats.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_formats.c
@@ -152,8 +152,8 @@ const struct nv50_format nv50_format_table[PIPE_FORMAT_COUNT] =
    F3(A, B4G4R4X4_UNORM, NONE, B, G, R, xx, UNORM, A4B4G4R4, T),
    F3(A, R9G9B9E5_FLOAT, NONE, R, G, B, xx, FLOAT, E5B9G9R9_SHAREDEXP, T),
 
-   C4(A, R10G10B10A2_UNORM, RGB10_A2_UNORM, R, G, B, A, UNORM, A2B10G10R10, IB),
-   C4(A, B10G10R10A2_UNORM, BGR10_A2_UNORM, B, G, R, A, UNORM, A2B10G10R10, TD),
+   C4(A, R10G10B10A2_UNORM, RGB10_A2_UNORM, R, G, B, A, UNORM, A2B10G10R10, TD),
+   C4(A, B10G10R10A2_UNORM, BGR10_A2_UNORM, B, G, R, A, UNORM, A2B10G10R10, IB),
    C4(A, R10G10B10A2_SNORM, NONE, R, G, B, A, SNORM, A2B10G10R10, T),
    C4(A, B10G10R10A2_SNORM, NONE, B, G, R, A, SNORM, A2B10G10R10, T),
    C4(A, R10G10B10A2_UINT, RGB10_A2_UINT, R, G, B, A, UINT, A2B10G10R10, TR),
-- 
2.13.6
[Mesa-dev] [PATCH 1/3] mesa: add xbgr support adjacent to xrgb
Signed-off-by: Ilia Mirkin--- One might have split this up into multiple patches, but it's just very repetitive and similar code. include/GL/internal/dri_interface.h | 2 ++ src/gallium/state_trackers/dri/dri2.c | 36 +++ src/gallium/state_trackers/dri/dri_drawable.c | 3 +++ src/gallium/state_trackers/dri/dri_screen.c | 17 - src/gallium/state_trackers/vdpau/device.c | 2 +- src/loader/loader_dri3_helper.c | 4 +++ src/mesa/drivers/dri/common/utils.c | 10 src/mesa/main/framebuffer.c | 4 ++- src/mesa/state_tracker/st_cb_fbo.c| 2 ++ 9 files changed, 77 insertions(+), 3 deletions(-) diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h index 34a5c9fb01a..dfee806885b 100644 --- a/include/GL/internal/dri_interface.h +++ b/include/GL/internal/dri_interface.h @@ -1227,6 +1227,8 @@ struct __DRIdri2ExtensionRec { #define __DRI_IMAGE_FORMAT_R16 0x100d #define __DRI_IMAGE_FORMAT_GR1616 0x100e #define __DRI_IMAGE_FORMAT_YUYV 0x100f +#define __DRI_IMAGE_FORMAT_XBGR2101010 0x1010 +#define __DRI_IMAGE_FORMAT_ABGR2101010 0x1011 #define __DRI_IMAGE_USE_SHARE 0x0001 #define __DRI_IMAGE_USE_SCANOUT0x0002 diff --git a/src/gallium/state_trackers/dri/dri2.c b/src/gallium/state_trackers/dri/dri2.c index 415002d2cd0..2a3a2a805b4 100644 --- a/src/gallium/state_trackers/dri/dri2.c +++ b/src/gallium/state_trackers/dri/dri2.c @@ -57,6 +57,8 @@ static const int fourcc_formats[] = { __DRI_IMAGE_FOURCC_ARGB2101010, __DRI_IMAGE_FOURCC_XRGB2101010, + __DRI_IMAGE_FOURCC_ABGR2101010, + __DRI_IMAGE_FOURCC_XBGR2101010, __DRI_IMAGE_FOURCC_ARGB, __DRI_IMAGE_FOURCC_ABGR, __DRI_IMAGE_FOURCC_SARGB, @@ -115,6 +117,14 @@ static int convert_fourcc(int format, int *dri_components_p) format = __DRI_IMAGE_FORMAT_XRGB2101010; dri_components = __DRI_IMAGE_COMPONENTS_RGB; break; + case __DRI_IMAGE_FOURCC_ABGR2101010: + format = __DRI_IMAGE_FORMAT_ABGR2101010; + dri_components = __DRI_IMAGE_COMPONENTS_RGBA; + break; + case __DRI_IMAGE_FOURCC_XBGR2101010: + format = 
__DRI_IMAGE_FORMAT_XBGR2101010; + dri_components = __DRI_IMAGE_COMPONENTS_RGB; + break; case __DRI_IMAGE_FOURCC_R8: format = __DRI_IMAGE_FORMAT_R8; dri_components = __DRI_IMAGE_COMPONENTS_R; @@ -186,6 +196,12 @@ static int convert_to_fourcc(int format) case __DRI_IMAGE_FORMAT_XRGB2101010: format = __DRI_IMAGE_FOURCC_XRGB2101010; break; + case __DRI_IMAGE_FORMAT_ABGR2101010: + format = __DRI_IMAGE_FOURCC_ABGR2101010; + break; + case __DRI_IMAGE_FORMAT_XBGR2101010: + format = __DRI_IMAGE_FOURCC_XBGR2101010; + break; case __DRI_IMAGE_FORMAT_R8: format = __DRI_IMAGE_FOURCC_R8; break; @@ -224,6 +240,12 @@ static enum pipe_format dri2_format_to_pipe_format (int format) case __DRI_IMAGE_FORMAT_ARGB2101010: pf = PIPE_FORMAT_B10G10R10A2_UNORM; break; + case __DRI_IMAGE_FORMAT_XBGR2101010: + pf = PIPE_FORMAT_R10G10B10X2_UNORM; + break; + case __DRI_IMAGE_FORMAT_ABGR2101010: + pf = PIPE_FORMAT_R10G10B10A2_UNORM; + break; case __DRI_IMAGE_FORMAT_R8: pf = PIPE_FORMAT_R8_UNORM; break; @@ -288,6 +310,12 @@ static enum pipe_format fourcc_to_pipe_format(int fourcc) case __DRI_IMAGE_FOURCC_XRGB2101010: pf = PIPE_FORMAT_B10G10R10X2_UNORM; break; + case __DRI_IMAGE_FOURCC_ABGR2101010: + pf = PIPE_FORMAT_R10G10B10A2_UNORM; + break; + case __DRI_IMAGE_FOURCC_XBGR2101010: + pf = PIPE_FORMAT_R10G10B10X2_UNORM; + break; case __DRI_IMAGE_FOURCC_NV12: pf = PIPE_FORMAT_NV12; @@ -406,10 +434,12 @@ dri2_drawable_get_buffers(struct dri_drawable *drawable, */ switch(format) { case PIPE_FORMAT_B10G10R10A2_UNORM: + case PIPE_FORMAT_R10G10B10A2_UNORM: case PIPE_FORMAT_BGRA_UNORM: case PIPE_FORMAT_RGBA_UNORM: depth = 32; break; + case PIPE_FORMAT_R10G10B10X2_UNORM: case PIPE_FORMAT_B10G10R10X2_UNORM: depth = 30; break; @@ -502,6 +532,12 @@ dri_image_drawable_get_buffers(struct dri_drawable *drawable, case PIPE_FORMAT_B10G10R10A2_UNORM: image_format = __DRI_IMAGE_FORMAT_ARGB2101010; break; + case PIPE_FORMAT_R10G10B10X2_UNORM: + image_format = __DRI_IMAGE_FORMAT_XBGR2101010; + break; + case 
PIPE_FORMAT_R10G10B10A2_UNORM: + image_format = __DRI_IMAGE_FORMAT_ABGR2101010; + break; default: image_format = __DRI_IMAGE_FORMAT_NONE; break; diff --git a/src/gallium/state_trackers/dri/dri_drawable.c b/src/gallium/state_trackers/dri/dri_drawable.c index a5999be574a..e5a7537e473 100644 ---
[Mesa-dev] [PATCH 2/3] st/dri: only expose config formats that are display targets
In the case of NVIDIA hardware, ABGR is displayable but ARGB is not.
Only advertise the one set in the visuals list.

Signed-off-by: Ilia Mirkin
---

Not sure if this is the right thing, esp for a PRIME-type setup. However
for the common single-GPU case, it does seem right.

 src/gallium/state_trackers/dri/dri_screen.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/dri/dri_screen.c b/src/gallium/state_trackers/dri/dri_screen.c
index bd0925b9055..aaee9870776 100644
--- a/src/gallium/state_trackers/dri/dri_screen.c
+++ b/src/gallium/state_trackers/dri/dri_screen.c
@@ -249,7 +249,8 @@ dri_fill_in_modes(struct dri_screen *screen)
 
       if (!p_screen->is_format_supported(p_screen, pipe_formats[format],
                                          PIPE_TEXTURE_2D, 0,
-                                         PIPE_BIND_RENDER_TARGET))
+                                         PIPE_BIND_RENDER_TARGET |
+                                         PIPE_BIND_DISPLAY_TARGET))
          continue;
 
       for (i = 1; i <= msaa_samples_max; i++) {
-- 
2.13.6
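The patch relies on the format query succeeding only when every requested bind flag is supported. A toy model of that check, assuming those semantics; the TOY_BIND_* values and toy_is_format_supported are hypothetical and are not Gallium's definitions:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bind flags, standing in for render-target and display-target usage. */
#define TOY_BIND_RENDER_TARGET  (1u << 0)
#define TOY_BIND_DISPLAY_TARGET (1u << 1)

/* True only if every bit in `wanted` is also present in `supported`. */
static bool
toy_is_format_supported(unsigned supported, unsigned wanted)
{
   assert(wanted != 0);
   return (supported & wanted) == wanted;
}
```

Under that assumption, a format that can be rendered to but not scanned out passes the old single-flag query and fails the combined one, so it would no longer be advertised as a config.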
Re: [Mesa-dev] [PATCH] i965/nir: do int64 lowering before optimization
On Wed, Dec 13, 2017 at 11:21 PM, Iago Toral wrote:
> On Tue, 2017-12-12 at 08:20 +0100, Iago Toral wrote:
>
> On Mon, 2017-12-11 at 08:01 -0800, Jason Ekstrand wrote:
>
> On Mon, Dec 11, 2017 at 12:55 AM, Iago Toral wrote:
>
> This didn't get any reviews yet. Any takers?
>
> On Fri, 2017-12-01 at 13:46 +0100, Iago Toral Quiroga wrote:
>> Otherwise loop unrolling will fail to see the actual cost of
>> the unrolling operations when the loop body contains 64-bit integer
>> instructions, and very specially when the divmod64 lowering applies,
>> since its lowering is quite expensive.
>>
>> Without this change, some in-development CTS tests for int64
>> get stuck forever trying to register allocate a shader with
>> over 50K SSA values. The large number of SSA values is the result
>> of NIR first unrolling multiple seemingly simple loops that involve
>> int64 instructions, only to then lower these instructions to produce
>> a massive pile of code (due to the divmod64 lowering in the unrolled
>> instructions).
>>
>> With this change, loop unrolling will see the loops with the int64
>> code already lowered and will realize that it is too expensive to
>> unroll.
>
>
> Hrm... I'm not quite sure what I think of this. I put it after nir_optimize
> because I wanted opt_algebraic to be able to work its magic and hopefully
> remove a bunch of int64 ops before we lower them. In particular, we have
> optimizations to remove integer division and replace it with shifts.
> However, loop unrolling does need to happen before lower_indirect_derefs so
> that lower_indirect_derefs will do as little work as possible.
>
> This is a bit of a pickle... I don't really want to add a third
> brw_nir_optimize call. It probably wouldn't be the end of the world but it
> does add compile time.
>
> One crazy idea which I don't think I like would be to have a quick pass that
> walks the IR and sees if there are any 64-bit SSA values. If it does, we
> run brw_nir_optimize without loop unrolling then 64-bit lowering and then we
> go into the normal brw_nir_optimize.
>
>
> With the constraints you mention above, I am not sure that we have many more
> options... what if we always run opt_algebraic first followed by int64
> lowering before the first nir_optimize? That would only add an extra
> opt_algebraic instead of a full nir_optimize. Would that be better than
> adding that 64-bit SSA scan pre-pass?
>
>
> We still need to make a decision for this, does my proposal sound better
> than the other options on the table? If not I guess we should go with
> the 64-bit SSA scan pre-pass.

Realized I never responded to this -- sorry. Yes, I think your
proposal sounds good.
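For context, the "64-bit SSA scan pre-pass" discussed in this thread amounts to a cheap check for whether any value in the shader is 64 bits wide. A toy version over a flat array of bit sizes, with the hypothetical name toy_has_64bit_values and a data layout that is not NIR's:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scan a list of SSA value bit sizes and report whether any is 64-bit.
 * Assumption: the shader has been flattened into this array beforehand. */
static bool
toy_has_64bit_values(const uint8_t *bit_sizes, size_t count)
{
   assert(bit_sizes != NULL || count == 0);
   for (size_t i = 0; i < count; i++) {
      if (bit_sizes[i] == 64)
         return true;
   }
   return false;
}
```

A driver could use such a check to take the more expensive lowering path only for shaders that actually contain 64-bit values. The alternative accepted at the end of the thread avoids the scan by always running opt_algebraic and the int64 lowering before the first optimization loop.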
[Mesa-dev] [Bug 104928] libglvnd_1.0.0 disables amdgpu direct rendering
https://bugs.freedesktop.org/show_bug.cgi?id=104928

Christian König changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |INVALID

--- Comment #3 from Christian König ---
As Timo already explained, the oibaf ppa doesn't support libglvnd.

You can't blame anybody if you install packages which don't include the
necessary support.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
[Mesa-dev] [Bug 104928] libglvnd_1.0.0 disables amdgpu direct rendering
https://bugs.freedesktop.org/show_bug.cgi?id=104928

fin4...@hotmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |---
             Status|RESOLVED                    |REOPENED

--- Comment #2 from fin4...@hotmail.com ---
Mesa in Debian Buster and Sid uses libglvnd, in other words the libegl1
package. Thanks for that bomb in Debian.

I removed xorg and installed it again; same problem with Buster and Sid, no
hardware graphics acceleration. Siduction live uses Mesa 17.1.1 and does not
have this bug.

http://metadata.ftp-master.debian.org/changelogs/main/libg/libglvnd/libglvnd_1.0.0-2_changelog

 -- Timo Aaltonen  Fri, 26 Jan 2018 13:48:45 +0200
 -- Timo Aaltonen  Sat, 03 Dec 2016 02:09:57 +0200

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
Re: [Mesa-dev] [PATCH 4/8] mesa: Track position/generic0 aliasing in the VAO.
Hi,

> On 1 February 2018 at 07:32, wrote:
> Feel free to use C99 designated initializers. All supported compilers
> understand them.
> Even MSVC 2013 Update 4 ;-)

Next time :-)

best
Mathias
[Mesa-dev] [Bug 104928] libglvnd_1.0.0 disables amdgpu direct rendering
https://bugs.freedesktop.org/show_bug.cgi?id=104928

Timo Aaltonen changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Timo Aaltonen ---
mesa from oibaf ppa does not use libglvnd

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
[Mesa-dev] [Bug 104926] swrast: Mesa 17.3.3 produces: HW cursor for format 875713089 not supported
https://bugs.freedesktop.org/show_bug.cgi?id=104926

pipep changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |a...@gmx.ch

-- 
You are receiving this mail because:
You are the assignee for the bug.