[Mesa-dev] [PATCH 2/2] glsl: rework misleading block layout code
From the ARB_uniform_buffer_object spec: ""shared" uniform blocks, the default layout, ..." This doesn't fix anything as the default layout is already applied at this point but fixes the misleading code/comment. --- src/compiler/glsl/ast_to_hir.cpp | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp index 7de164c..4916b159 100644 --- a/src/compiler/glsl/ast_to_hir.cpp +++ b/src/compiler/glsl/ast_to_hir.cpp @@ -7678,16 +7678,16 @@ ast_interface_block::hir(exec_list *instructions, this->block_name); enum glsl_interface_packing packing; - if (this->layout.flags.q.shared) { - packing = GLSL_INTERFACE_PACKING_SHARED; + if (this->layout.flags.q.std140) { + packing = GLSL_INTERFACE_PACKING_STD140; } else if (this->layout.flags.q.packed) { packing = GLSL_INTERFACE_PACKING_PACKED; } else if (this->layout.flags.q.std430) { packing = GLSL_INTERFACE_PACKING_STD430; } else { - /* The default layout is std140. + /* The default layout is shared. */ - packing = GLSL_INTERFACE_PACKING_STD140; + packing = GLSL_INTERFACE_PACKING_SHARED; } ir_variable_mode var_mode; -- 2.9.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] glsl: remove placeholder comment
This was added in 2d03f48a65a666 and seems like it was intended as a TODO comment in a function stub rather than a useful code comment. --- src/compiler/glsl/ast_to_hir.cpp | 4 1 file changed, 4 deletions(-) diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp index c338ad7..7de164c 100644 --- a/src/compiler/glsl/ast_to_hir.cpp +++ b/src/compiler/glsl/ast_to_hir.cpp @@ -7677,10 +7677,6 @@ ast_interface_block::hir(exec_list *instructions, "invalid qualifier for block", this->block_name); - /* The ast_interface_block has a list of ast_declarator_lists. We -* need to turn those into ir_variables with an association -* with this uniform block. -*/ enum glsl_interface_packing packing; if (this->layout.flags.q.shared) { packing = GLSL_INTERFACE_PACKING_SHARED; -- 2.9.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Intel-gfx] [PATCH 3/3] intel: Make driver aware of MOCS table version
On 17-07-07 09:28:08, Jason Ekstrand wrote: On Thu, Jul 6, 2017 at 4:27 PM, Ben Widawskywrote: We don't yet have optimal MOCS settings, but we have enough to know how to at least determine when we might have non-optimal settings within our driver. Signed-off-by: Ben Widawsky --- src/intel/vulkan/anv_device.c | 12 src/intel/vulkan/anv_private.h| 2 ++ src/mesa/drivers/dri/i915/intel_context.c | 7 ++- src/mesa/drivers/dri/i965/intel_screen.c | 14 ++ src/mesa/drivers/dri/i965/intel_screen.h | 2 ++ 5 files changed, 36 insertions(+), 1 deletion(-) diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c index 3dc55dbb8d..8e180dbf18 100644 --- a/src/intel/vulkan/anv_device.c +++ b/src/intel/vulkan/anv_device.c @@ -368,6 +368,18 @@ anv_physical_device_init(struct anv_physical_device *device, device->info.max_cs_threads = max_cs_threads; } + if (device->info.gen >= 9) { + device->mocs_version = anv_gem_get_param(fd, + I915_PARAM_MOCS_TABLE_VERSION); + switch (device->mocs_version) { + default: + anv_perf_warn("Kernel exposes newer MOCS table\n"); A perf_warn here seems reasonable though it makes more sense to me to make it if (device->mocs_version > ANV_MAX_KNOWN_MOCS_VERSION) anv_perf_warn("..."); One thing to keep in mind: the max MOCS version can vary by platform (hopefully it doesn't). + case 1: + case 0: + device->mocs_version = MOCS_TABLE_VERSION; Why are we stomping device->mocs_version to MOCS_TABLE_VERSION? Are you just trying to avoid the version 0? If so, why not just have /* If the MOCS_TABLE_VERSION query fails, assume version 1 */ if (device->mocs_version == 0) device->mocs_version = 1; I think the switch looks better, especially as the versions increase. I don't think we want to have it dependent on a #define in an external header file. What if someone updates it for i965 and doesn't update anv or vice-versa? Yeah, I am removing that external define as mentioned in the other thread. I think it was a bad idea that I jammed in at the last minute. + } + } + brw_process_intel_debug_variable(); device->compiler = brw_compiler_create(NULL, >info); diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_ private.h index 573778dad5..b8241a9b22 100644 --- a/src/intel/vulkan/anv_private.h +++ b/src/intel/vulkan/anv_private.h @@ -684,6 +684,8 @@ struct anv_physical_device { uint32_teu_total; uint32_tsubslice_total; +uint8_t mocs_version; + struct { uint32_t type_count; struct anv_memory_type types[VK_MAX_MEMORY_TYPES]; diff --git a/src/mesa/drivers/dri/i915/intel_context.c b/src/mesa/drivers/dri/i915/intel_context.c index e0766a0e3f..9169ea650e 100644 --- a/src/mesa/drivers/dri/i915/intel_context.c +++ b/src/mesa/drivers/dri/i915/intel_context.c @@ -521,8 +521,13 @@ intelInitContext(struct intel_context *intel, INTEL_DEBUG = parse_debug_string(getenv("INTEL_DEBUG"), debug_control); if (INTEL_DEBUG & DEBUG_BUFMGR) dri_bufmgr_set_debug(intel->bufmgr, true); - if (INTEL_DEBUG & DEBUG_PERF) + if (INTEL_DEBUG & DEBUG_PERF) { intel->perf_debug = true; + if (screen->mocs_version > MOCS_TABLE_VERSION) { + fprintf(stderr, "Kernel exposes newer MOCS table\n"); + screen->mocs_version = MOCS_TABLE_VERSION; + } + } if (INTEL_DEBUG & DEBUG_AUB) drm_intel_bufmgr_gem_set_aub_dump(intel->bufmgr, true); diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index c75f2125d4..c53f133d49 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -2301,6 +2301,20 @@ __DRIconfig **intelInitScreen2(__DRIscreen *dri_screen) (ret != -1 || errno != EINVAL); } + if (devinfo->gen >= 9) { + screen->mocs_version = intel_get_integer(screen, + I915_PARAM_MOCS_TABLE_VERSION); + switch (screen->mocs_version) { + case 1: + case 0: + screen->mocs_version = MOCS_TABLE_VERSION; Same comments apply here. + break; + default: + /* We want to perf debug, but we can't yet */ + break; + } + } + dri_screen->extensions = !screen->has_context_reset_notification ? screenExtensions : intelRobustScreenExtensions; diff --git a/src/mesa/drivers/dri/i965/intel_screen.h b/src/mesa/drivers/dri/i965/intel_screen.h index f78b3e8f74..eb801f8155 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.h +++ b/src/mesa/drivers/dri/i965/intel_screen.h @@ -112,6 +112,8 @@ struct intel_screen bool mesa_format_supports_texture[MESA_FORMAT_COUNT]; bool mesa_format_supports_render[MESA_FORMAT_COUNT]; enum
Re: [Mesa-dev] [EGL android: accquire fence implementation 2/2] i965: Queue the buffer with a sync fence for Android OS v4.1
Hi Zhongmin, Thanks for the update. Please see my comments inline. On Fri, Jul 21, 2017 at 12:08 PM, Zhongmin Wuwrote: > Before we queued the buffer with a invalid fence (-1), it will > make some benchmarks failed to test such as flatland. > > Now we get the out fence during the flushing buffer and then pass > it to SurfaceFlinger in eglSwapbuffer function. > > v2: a) Also implement the fence in cancelBuffer. > b) The last sync fence is stored in drawable object >rather than brw context. > c) format clear. > > v3: a) Save the last fence fd in DRI Context object. > b) Return the last fence if the batch buffer is empty and >nothing to be flushed when _intel_batchbuffer_flush_fence > c) Add the new interface in vbtl to set the retrieve fence > > v3.1 a) close fd in the new vbtl interface on none Android platform > > v4: a) The last fence is saved in brw context. > b) The retrieve fd is for all the platform but not just Android > c) Add a uniform dri2 interface to initialize the surface. > > v4.1: a) make some changes of variable name. > b) the patch is breaked into two patches. > > Change-Id: Ided54d2e193cde73a6f0feb36ac1c0056e4958f2 > Signed-off-by: Zhongmin Wu > --- > src/egl/drivers/dri2/egl_dri2.c | 45 > +++ > src/egl/drivers/dri2/egl_dri2.h |5 +++ > src/egl/drivers/dri2/platform_android.c | 11 --- > src/egl/drivers/dri2/platform_drm.c |2 +- > src/egl/drivers/dri2/platform_surfaceless.c |2 +- > src/egl/drivers/dri2/platform_wayland.c |2 +- > src/egl/drivers/dri2/platform_x11.c |2 +- > src/egl/drivers/dri2/platform_x11_dri3.c|2 +- > 8 files changed, 62 insertions(+), 9 deletions(-) > > diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c > index 020a0bc..df4e934 100644 > --- a/src/egl/drivers/dri2/egl_dri2.c > +++ b/src/egl/drivers/dri2/egl_dri2.c > @@ -1307,6 +1307,25 @@ dri2_destroy_context(_EGLDriver *drv, _EGLDisplay > *disp, _EGLContext *ctx) > return EGL_TRUE; > } > > +EGLBoolean > +dri2_surf_init(_EGLSurface *surf, _EGLDisplay *dpy, EGLint type, > +_EGLConfig *conf, const EGLint *attrib_list) > +{ > + struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf); > + dri2_surf->out_fence_fd = -1; > + return _eglInitSurface(surf, dpy, type, conf, attrib_list); > +} > + > +static void > +dri2_surface_set_retrieve_fence( _EGLSurface *surf, int fence_fd) I think you forgot to rename this function too. (dri2_surface_set_out_fence). > +{ > + struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf); > + if (dri2_surf->out_fence_fd >=0) > + close(dri2_surf->out_fence_fd); > + > + dri2_surf->out_fence_fd = fence_fd; > +} > + > static EGLBoolean > dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf) > { > @@ -1315,9 +1334,26 @@ dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay > *dpy, _EGLSurface *surf) > if (!_eglPutSurface(surf)) >return EGL_TRUE; > > + dri2_surface_set_retrieve_fence(surf, -1); Hmm, if we set it here, we would end up with the ->destroy_surface() callback seeing -1 as the fence FD. For Android that would mean that cancel_buffer() is called without a fence. What I had in my mind was adding a dri2_surf_destroy() function that would be called by platform backends before freeing the surf struct (analogically to dri2_surf_init() after allocating the struct). > return dri2_dpy->vtbl->destroy_surface(drv, dpy, surf); > } > Other than the above, I think this looks reasonably. However, depending on how costly inserting a fence is (I think it might mean flushing a command buffer on some platforms) we might need a mechanism for the platform backend to opt-in for fences, i.e. tell the dri2 core code that it's interested in them, rather than requesting them by default. I'd like to hear more opinions on this, though. Best regards, Tomasz ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [EGL android: accquire fence implementation 2/2] i965: Queue the buffer with a sync fence for Android OS v4.1
Before we queued the buffer with a invalid fence (-1), it will make some benchmarks failed to test such as flatland. Now we get the out fence during the flushing buffer and then pass it to SurfaceFlinger in eglSwapbuffer function. v2: a) Also implement the fence in cancelBuffer. b) The last sync fence is stored in drawable object rather than brw context. c) format clear. v3: a) Save the last fence fd in DRI Context object. b) Return the last fence if the batch buffer is empty and nothing to be flushed when _intel_batchbuffer_flush_fence c) Add the new interface in vbtl to set the retrieve fence v3.1 a) close fd in the new vbtl interface on none Android platform v4: a) The last fence is saved in brw context. b) The retrieve fd is for all the platform but not just Android c) Add a uniform dri2 interface to initialize the surface. v4.1: a) make some changes of variable name. b) the patch is breaked into two patches. Change-Id: Ided54d2e193cde73a6f0feb36ac1c0056e4958f2 Signed-off-by: Zhongmin Wu--- src/egl/drivers/dri2/egl_dri2.c | 45 +++ src/egl/drivers/dri2/egl_dri2.h |5 +++ src/egl/drivers/dri2/platform_android.c | 11 --- src/egl/drivers/dri2/platform_drm.c |2 +- src/egl/drivers/dri2/platform_surfaceless.c |2 +- src/egl/drivers/dri2/platform_wayland.c |2 +- src/egl/drivers/dri2/platform_x11.c |2 +- src/egl/drivers/dri2/platform_x11_dri3.c|2 +- 8 files changed, 62 insertions(+), 9 deletions(-) diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index 020a0bc..df4e934 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -1307,6 +1307,25 @@ dri2_destroy_context(_EGLDriver *drv, _EGLDisplay *disp, _EGLContext *ctx) return EGL_TRUE; } +EGLBoolean +dri2_surf_init(_EGLSurface *surf, _EGLDisplay *dpy, EGLint type, +_EGLConfig *conf, const EGLint *attrib_list) +{ + struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf); + dri2_surf->out_fence_fd = -1; + return _eglInitSurface(surf, dpy, type, conf, attrib_list); +} + +static void +dri2_surface_set_retrieve_fence( _EGLSurface *surf, int fence_fd) +{ + struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf); + if (dri2_surf->out_fence_fd >=0) + close(dri2_surf->out_fence_fd); + + dri2_surf->out_fence_fd = fence_fd; +} + static EGLBoolean dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf) { @@ -1315,9 +1334,26 @@ dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf) if (!_eglPutSurface(surf)) return EGL_TRUE; + dri2_surface_set_retrieve_fence(surf, -1); return dri2_dpy->vtbl->destroy_surface(drv, dpy, surf); } +static void +dri2_surf_get_fence_fd(_EGLContext *ctx, + _EGLDisplay *dpy, _EGLSurface *surf) +{ + struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy); + int fence_fd = -1; + __DRIcontext *dri_ctx = dri2_egl_context(ctx)->dri_context; + void * fence = dri2_dpy->fence->create_fence_fd(dri_ctx, -1); + if (fence) { + fence_fd = dri2_dpy->fence->get_fence_fd(dri2_dpy->dri_screen, + fence); + dri2_dpy->fence->destroy_fence(dri2_dpy->dri_screen, fence); + } + dri2_surface_set_retrieve_fence(surf, fence_fd); +} + /** * Called via eglMakeCurrent(), drv->API.MakeCurrent(). */ @@ -1352,8 +1388,11 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *dsurf, rdraw = (rsurf) ? dri2_dpy->vtbl->get_dri_drawable(rsurf) : NULL; cctx = (dri2_ctx) ? dri2_ctx->dri_context : NULL; + int fence_fd = -1; if (old_ctx) { __DRIcontext *old_cctx = dri2_egl_context(old_ctx)->dri_context; + if (old_dsurf) + dri2_surf_get_fence_fd(old_ctx, disp, old_dsurf); dri2_dpy->core->unbindContext(old_cctx); } @@ -1490,6 +1529,9 @@ static EGLBoolean dri2_swap_buffers(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy); + _EGLContext *ctx = _eglGetCurrentContext(); + if (ctx && surf) + dri2_surf_get_fence_fd(ctx, dpy, surf); return dri2_dpy->vtbl->swap_buffers(drv, dpy, surf); } @@ -1499,6 +1541,9 @@ dri2_swap_buffers_with_damage(_EGLDriver *drv, _EGLDisplay *dpy, const EGLint *rects, EGLint n_rects) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy); + _EGLContext *ctx = _eglGetCurrentContext(); + if (ctx && surf) + dri2_surf_get_fence_fd(ctx, dpy, surf); return dri2_dpy->vtbl->swap_buffers_with_damage(drv, dpy, surf, rects, n_rects); } diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h index bbba7c0..ca36dd9 100644 --- a/src/egl/drivers/dri2/egl_dri2.h
[Mesa-dev] [EGL android: accquire fence implementation 1/2] i965: Return the last fence if the batch buffer is empty and nothing to be flushed when _intel_batchbuffer_flush_fence.
Always save the last fence in the brw context when flushing buffer. If the buffer is nothing to be flushed, then return the last fence when asked for. Change-Id: Ic47035bcd1a27e402609afd9e2d1e3972548b97d Signed-off-by: Zhongmin Wu--- src/mesa/drivers/dri/i965/brw_context.c |5 + src/mesa/drivers/dri/i965/brw_context.h |1 + src/mesa/drivers/dri/i965/intel_batchbuffer.c | 16 ++-- 3 files changed, 20 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 5433f90..ed0b056 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -1086,6 +1086,8 @@ brwCreateContext(gl_api api, ctx->VertexProgram._MaintainTnlProgram = true; ctx->FragmentProgram._MaintainTexEnvProgram = true; + brw->out_fence_fd = -1; + brw_draw_init( brw ); if ((flags & __DRI_CTX_FLAG_DEBUG) != 0) { @@ -1169,6 +1171,9 @@ intelDestroyContext(__DRIcontext * driContextPriv) brw->throttle_batch[1] = NULL; brw->throttle_batch[0] = NULL; + if (brw->out_fence_fd >= 0) + close(brw->out_fence_fd); + driDestroyOptionCache(>optionCache); /* free the Mesa context */ diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index dc4bc8f..692ea2c 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1217,6 +1217,7 @@ struct brw_context __DRIcontext *driContext; struct intel_screen *screen; + int out_fence_fd; }; /* brw_clear.c */ diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index 62d2fe8..d342e5d 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c @@ -648,9 +648,18 @@ do_flush_locked(struct brw_context *brw, int in_fence_fd, int *out_fence_fd) /* Add the batch itself to the end of the validation list */ add_exec_bo(batch, batch->bo); + if (brw->out_fence_fd >= 0) { +close(brw->out_fence_fd); +brw->out_fence_fd = -1; + } + + int fd = -1; ret = execbuffer(dri_screen->fd, batch, hw_ctx, 4 * USED_BATCH(*batch), - in_fence_fd, out_fence_fd, flags); + in_fence_fd, , flags); + brw->out_fence_fd = fd; + if (out_fence_fd) +*out_fence_fd = (fd >=0) ? dup(fd) : -1; } throttle(brw); @@ -684,8 +693,11 @@ _intel_batchbuffer_flush_fence(struct brw_context *brw, { int ret; - if (USED_BATCH(brw->batch) == 0) + if (USED_BATCH(brw->batch) == 0) { + if (out_fence_fd && brw->out_fence_fd >= 0) + *out_fence_fd = dup(brw->out_fence_fd); return 0; + } if (brw->throttle_batch[0] == NULL) { brw->throttle_batch[0] = brw->batch.bo; -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [EGL android: accquire fence implementation 1/2] i965: Return the last fence if the batch buffer is empty and nothing to be flushed when _intel_batchbuffer_flush_fence.
Always save the last fence in the brw context when flushing buffer. If the buffer is nothing to be flushed, then return the last fence when asked for. Change-Id: Ic47035bcd1a27e402609afd9e2d1e3972548b97d Signed-off-by: Zhongmin Wu--- src/mesa/drivers/dri/i965/brw_context.c |5 + src/mesa/drivers/dri/i965/brw_context.h |1 + src/mesa/drivers/dri/i965/intel_batchbuffer.c | 16 ++-- 3 files changed, 20 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 5433f90..ed0b056 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -1086,6 +1086,8 @@ brwCreateContext(gl_api api, ctx->VertexProgram._MaintainTnlProgram = true; ctx->FragmentProgram._MaintainTexEnvProgram = true; + brw->out_fence_fd = -1; + brw_draw_init( brw ); if ((flags & __DRI_CTX_FLAG_DEBUG) != 0) { @@ -1169,6 +1171,9 @@ intelDestroyContext(__DRIcontext * driContextPriv) brw->throttle_batch[1] = NULL; brw->throttle_batch[0] = NULL; + if (brw->out_fence_fd >= 0) + close(brw->out_fence_fd); + driDestroyOptionCache(>optionCache); /* free the Mesa context */ diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index dc4bc8f..692ea2c 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1217,6 +1217,7 @@ struct brw_context __DRIcontext *driContext; struct intel_screen *screen; + int out_fence_fd; }; /* brw_clear.c */ diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index 62d2fe8..d342e5d 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c @@ -648,9 +648,18 @@ do_flush_locked(struct brw_context *brw, int in_fence_fd, int *out_fence_fd) /* Add the batch itself to the end of the validation list */ add_exec_bo(batch, batch->bo); + if (brw->out_fence_fd >= 0) { +close(brw->out_fence_fd); +brw->out_fence_fd = -1; + } + + int fd = -1; ret = execbuffer(dri_screen->fd, batch, hw_ctx, 4 * USED_BATCH(*batch), - in_fence_fd, out_fence_fd, flags); + in_fence_fd, , flags); + brw->out_fence_fd = fd; + if (out_fence_fd) +*out_fence_fd = (fd >=0) ? dup(fd) : -1; } throttle(brw); @@ -684,8 +693,11 @@ _intel_batchbuffer_flush_fence(struct brw_context *brw, { int ret; - if (USED_BATCH(brw->batch) == 0) + if (USED_BATCH(brw->batch) == 0) { + if (out_fence_fd && brw->out_fence_fd >= 0) + *out_fence_fd = dup(brw->out_fence_fd); return 0; + } if (brw->throttle_batch[0] == NULL) { brw->throttle_batch[0] = brw->batch.bo; -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH] radeonsi: set a per-buffer flag that disables inter-process sharing (v2)
On 2017年07月20日 22:59, Marek Olšák wrote: On Jul 19, 2017 10:21 PM, "zhoucm1"> wrote: On 2017年07月19日 23:34, Marek Olšák wrote: On Jul 19, 2017 3:36 AM, "zhoucm1" > wrote: On 2017年07月19日 04:08, Marek Olšák wrote: From: Marek Olšák > For lower overhead in the CS ioctl. Winsys allocators are not used with interprocess-sharable resources. Hi Marek, Could I know from how your this way reduces overhead in CS ioctl? reusing BO to short bo list? The kernel part of the work hasn't been done yet. The idea is that nonsharable buffers don't have to be revalidated by TTM, OK, Maybe I only can see the whole picture of this idea when you complete kernel part. Out of curious, why/how can nonsharable buffers be revalidated by TTM without exposing like amdgpu_bo_make_resident api? I think the idea is that all nonsharable buffers will be backed by the same reservation object, so TTM can skip buffer validation if no buffer has been moved. It's just an optimization for the current design. With mentioned in another thread, if we can expose make_resident api, we can remove bo_list, even we can remove reservation operation in CS ioctl. And now, I think our bo list is a very bad design, first, umd must create bo list for every command submission, this is a extra cpu overhead compared with traditional way. second, kernel also have to iterate the list, when bo list is too long, like OpenCL program, they always throw several thousands BOs to bo list, reservation must keep these thousands ww_mutex safe, CPU overhead is too big. So I strongly suggest we should expose make_resident api to user space. if cannot, I want to know any specific reason to see if we can solve it. Yeah, I think the BO list idea is likely to die sooner or later. It made sense for GL before bindless was a thing. Nowadays I don't see much value in it. MesaGL will keep tracking the BO list because it's a requirement for good GL performance (it determines whether to flush IBs before BO synchronization, it allows tracking fences for each BO, which are used to determine dependencies between IBs, and that all allows async SDMA and async compute for GL, which doesn't have separate queues). However, we don't need any BO list at the libdrm level and lower. I think a BO_CREATE flag that causes that the buffer is added to a kernel-side per-fd BO list would be sufficient. How the kernel manages its BO list should be its own implementation detail. Initially we can just move the current BO list management into the kernel. I guess this idea will make bo list worse, which just decrease umd effort, but increase kernel driver complication. First, from your and Christian's comments, we can get this agreement that bo list design is not a good way. My proposal of exposing amdgpu_bo_make_resident is to replace bo list. If we can make all needed bo resident, then we don't need to validate it again in cs ioctl, then we don't need their reservation lock more. After job pushed to scheduler, then we can un-resident BOs. Even we can make it for VM bo, then we don't need to check vm update again while done in va map ioctl. If this is got done(eviction has been improved more), I cannot see any obvious gap for performance. What do you think of this proposal of exposing amdgpu_bo_make_resident api to user space? Or any other idea we can discuss. If you all agree with, I can volunteer to try with UMD guys. Regards, David Zhou Marek Regards, David Zhou so it should remove a lot of kernel overhead and the BO list remains the same. Marek Thanks, David Zhou v2: It shouldn't crash anymore, but the kernel will reject the new flag. --- src/gallium/drivers/radeon/r600_buffer_common.c | 7 + src/gallium/drivers/radeon/radeon_winsys.h | 20 +++--- src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 36 - src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 27 +++ 4 files changed, 62 insertions(+), 28 deletions(-) diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c index dd1c209..2747ac4 100644 --- a/src/gallium/drivers/radeon/r600_buffer_common.c +++ b/src/gallium/drivers/radeon/r600_buffer_common.c @@ -160,20 +160,27 @@ void r600_init_resource_fields(struct r600_common_screen *rscreen, } /*
[Mesa-dev] [PATCH 1/3] ac/gpu: add code to detect if kernel supports sync objects.
From: Dave AirlieSigned-off-by: Dave Airlie --- src/amd/common/ac_gpu_info.c | 9 + src/amd/common/ac_gpu_info.h | 1 + 2 files changed, 10 insertions(+) diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c index ced7183..929dfd2 100644 --- a/src/amd/common/ac_gpu_info.c +++ b/src/amd/common/ac_gpu_info.c @@ -84,6 +84,14 @@ static unsigned cik_get_num_tile_pipes(struct amdgpu_gpu_info *info) } } +static bool has_syncobj(int fd) +{ + uint64_t value; + if (drmGetCap(fd, DRM_CAP_SYNCOBJ, )) + return false; + return value ? true : false; +} + bool ac_query_gpu_info(int fd, amdgpu_device_handle dev, struct radeon_info *info, struct amdgpu_gpu_info *amdinfo) @@ -258,6 +266,7 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev, info->vce_fw_version = vce.available_rings ? vce_version : 0; info->has_userptr = true; + info->has_syncobj = has_syncobj(fd); info->num_render_backends = amdinfo->rb_pipes; info->clock_crystal_freq = amdinfo->gpu_counter_freq; if (!info->clock_crystal_freq) { diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h index 72a8506..20907c2 100644 --- a/src/amd/common/ac_gpu_info.h +++ b/src/amd/common/ac_gpu_info.h @@ -76,6 +76,7 @@ struct radeon_info { uint32_tdrm_minor; uint32_tdrm_patchlevel; boolhas_userptr; + boolhas_syncobj; /* Shader cores. */ uint32_tr600_max_quad_pipes; /* wave size / 16 */ -- 2.9.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] radv: initial support for shared semaphores
From: Dave AirlieThis adds support for sharing semaphores using kernel syncobjects. Syncobj backed semaphores are used for any semaphore which is created with external flags, and when a semaphore is imported, otherwise we use the current non-kernel semaphores. Temporary imports from syncobj fd are also available, these just override the current user until the next wait, when the temp syncobj is dropped. Signed-off-by: Dave Airlie --- src/amd/vulkan/radv_device.c | 248 +++--- src/amd/vulkan/radv_entrypoints_gen.py| 3 + src/amd/vulkan/radv_private.h | 16 +- src/amd/vulkan/radv_radeon_winsys.h | 21 ++- src/amd/vulkan/radv_wsi.c | 30 +++- src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 106 +++ 6 files changed, 354 insertions(+), 70 deletions(-) diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c index d87be66..44bee5c 100644 --- a/src/amd/vulkan/radv_device.c +++ b/src/amd/vulkan/radv_device.c @@ -102,6 +102,10 @@ static const VkExtensionProperties instance_extensions[] = { .extensionName = VK_KHR_EXTERNAL_MEMORY_CAPABILITIES_EXTENSION_NAME, .specVersion = 1, }, + { + .extensionName = VK_KHR_EXTERNAL_SEMAPHORE_CAPABILITIES_EXTENSION_NAME, + .specVersion = 1, + }, }; static const VkExtensionProperties common_device_extensions[] = { @@ -162,6 +166,16 @@ static const VkExtensionProperties common_device_extensions[] = { .specVersion = 1, }, }; +static const VkExtensionProperties ext_sema_device_extensions[] = { + { + .extensionName = VK_KHR_EXTERNAL_SEMAPHORE_EXTENSION_NAME, + .specVersion = 1, + }, + { + .extensionName = VK_KHR_EXTERNAL_SEMAPHORE_FD_EXTENSION_NAME, + .specVersion = 1, + }, +}; static VkResult radv_extensions_register(struct radv_instance *instance, @@ -312,6 +326,15 @@ radv_physical_device_init(struct radv_physical_device *device, if (result != VK_SUCCESS) goto fail; + if (device->rad_info.has_syncobj) { + result = radv_extensions_register(instance, + >extensions, + ext_sema_device_extensions, + ARRAY_SIZE(ext_sema_device_extensions)); + if (result != VK_SUCCESS) + goto fail; + } + fprintf(stderr, "WARNING: radv is not a conformant vulkan implementation, testing use only.\n"); device->name = get_chip_name(device->rad_info.family); @@ -1885,6 +1908,87 @@ fail: return VK_ERROR_OUT_OF_DEVICE_MEMORY; } +static VkResult radv_alloc_sem_counts(struct radv_winsys_sem_counts *counts, + int num_sems, + const VkSemaphore *sems, + bool reset_temp) +{ + int syncobj_idx = 0, sem_idx = 0; + + if (num_sems == 0) + return VK_SUCCESS; + for (uint32_t i = 0; i < num_sems; i++) { + RADV_FROM_HANDLE(radv_semaphore, sem, sems[i]); + + if (sem->temp_syncobj || sem->syncobj) + counts->syncobj_count++; + else + counts->sem_count++; + } + + if (counts->syncobj_count) { + counts->syncobj = (uint32_t *)malloc(sizeof(uint32_t) * counts->syncobj_count); + if (!counts->syncobj) + return VK_ERROR_OUT_OF_HOST_MEMORY; + } + + if (counts->sem_count) { + counts->sem = (struct radeon_winsys_sem **)malloc(sizeof(struct radeon_winsys_sem *) * counts->sem_count); + if (!counts->sem) + return VK_ERROR_OUT_OF_HOST_MEMORY; + } + + for (uint32_t i = 0; i < num_sems; i++) { + RADV_FROM_HANDLE(radv_semaphore, sem, sems[i]); + + if (sem->temp_syncobj) { + counts->syncobj[syncobj_idx++] = sem->temp_syncobj; + if (reset_temp) { + /* after we wait on a temp import - drop it */ + sem->temp_syncobj = 0; + } + } + else if (sem->syncobj) + counts->syncobj[syncobj_idx++] = sem->syncobj; + else { + assert(sem->sem); + counts->sem[sem_idx++] = sem->sem; + } + } + + return VK_SUCCESS; +} + +void radv_free_sem_info(struct radv_winsys_sem_info *sem_info) +{ + free(sem_info->wait.syncobj); + free(sem_info->wait.sem); + free(sem_info->signal.syncobj); +
[Mesa-dev] [PATCH 2/3] radv/winsys: add syncobj hooks
From: Dave AirlieThis just adds syncobj create/destroy/export/import paths into the winsys interface. Signed-off-by: Dave Airlie --- src/amd/vulkan/radv_radeon_winsys.h | 8 ++ src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 36 +++ 2 files changed, 44 insertions(+) diff --git a/src/amd/vulkan/radv_radeon_winsys.h b/src/amd/vulkan/radv_radeon_winsys.h index cdcaeca..2f3990c 100644 --- a/src/amd/vulkan/radv_radeon_winsys.h +++ b/src/amd/vulkan/radv_radeon_winsys.h @@ -221,9 +221,17 @@ struct radeon_winsys { bool absolute, uint64_t timeout); + /* old semaphores - non shareable */ struct radeon_winsys_sem *(*create_sem)(struct radeon_winsys *ws); void (*destroy_sem)(struct radeon_winsys_sem *sem); + /* new shareable sync objects */ + int (*create_syncobj)(struct radeon_winsys *ws, uint32_t *handle); + void (*destroy_syncobj)(struct radeon_winsys *ws, uint32_t handle); + + int (*export_syncobj)(struct radeon_winsys *ws, uint32_t syncobj, int *fd); + int (*import_syncobj)(struct radeon_winsys *ws, int fd, uint32_t *syncobj); + }; static inline void radeon_emit(struct radeon_winsys_cs *cs, uint32_t value) diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c index 93243df..6ed8f32 100644 --- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c +++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c @@ -1172,6 +1172,38 @@ error_out: return r; } +static int radv_amdgpu_create_syncobj(struct radeon_winsys *_ws, + uint32_t *handle) +{ + struct radv_amdgpu_winsys *ws = radv_amdgpu_winsys(_ws); + return amdgpu_cs_create_syncobj(ws->dev, handle); +} + +static void radv_amdgpu_destroy_syncobj(struct radeon_winsys *_ws, + uint32_t handle) +{ + struct radv_amdgpu_winsys *ws = radv_amdgpu_winsys(_ws); + amdgpu_cs_destroy_syncobj(ws->dev, handle); +} + +static int radv_amdgpu_export_syncobj(struct radeon_winsys *_ws, + uint32_t syncobj, + int *fd) +{ + struct radv_amdgpu_winsys *ws = radv_amdgpu_winsys(_ws); + + return amdgpu_cs_export_syncobj(ws->dev, syncobj, fd); +} + +static int radv_amdgpu_import_syncobj(struct radeon_winsys *_ws, + int fd, + uint32_t *syncobj) +{ + struct radv_amdgpu_winsys *ws = radv_amdgpu_winsys(_ws); + + return amdgpu_cs_import_syncobj(ws->dev, fd, syncobj); +} + void radv_amdgpu_cs_init_functions(struct radv_amdgpu_winsys *ws) { ws->base.ctx_create = radv_amdgpu_ctx_create; @@ -1190,5 +1222,9 @@ void radv_amdgpu_cs_init_functions(struct radv_amdgpu_winsys *ws) ws->base.destroy_fence = radv_amdgpu_destroy_fence; ws->base.create_sem = radv_amdgpu_create_sem; ws->base.destroy_sem = radv_amdgpu_destroy_sem; + ws->base.create_syncobj = radv_amdgpu_create_syncobj; + ws->base.destroy_syncobj = radv_amdgpu_destroy_syncobj; + ws->base.export_syncobj = radv_amdgpu_export_syncobj; + ws->base.import_syncobj = radv_amdgpu_import_syncobj; ws->base.fence_wait = radv_amdgpu_fence_wait; } -- 2.9.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] radv shared semaphores (v2)
This is a rework of the code to support the temporary import semantics and also doesn't always use syncobjs to avoid the problem with having to signal the WSI semaphores for now. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT
For whatever reason this patch is breaking DOOM. Gražvydas On Wed, Jul 12, 2017 at 12:29 PM, Alex Smithwrote: > If a cube image has VK_IMAGE_USAGE_STORAGE_BIT set, the type in an image > view's descriptor was set to a 2D array (and a few other fields adjusted > accordingly). This is correct when the image view is actually bound as a > storage image, but not when bound as a sampled image. In that case the > type should be set as a cube. > > Fix by generating 2 sets of descriptors at view creation time for both > storage and non-storage usage, and then choose between them based on > descriptor type when writing descriptor sets. > > v2: Generate storage descriptors for images with TRANSFER_DST, since > those may be used as storage images internally. > > Signed-off-by: Alex Smith > Reviewed-by: Bas Nieuwenhuizen > --- > src/amd/vulkan/radv_descriptor_set.c | 18 ++-- > src/amd/vulkan/radv_image.c | 79 > > src/amd/vulkan/radv_private.h| 6 +++ > 3 files changed, 74 insertions(+), 29 deletions(-) > > diff --git a/src/amd/vulkan/radv_descriptor_set.c > b/src/amd/vulkan/radv_descriptor_set.c > index ec7fd3d..b4a78aa 100644 > --- a/src/amd/vulkan/radv_descriptor_set.c > +++ b/src/amd/vulkan/radv_descriptor_set.c > @@ -603,11 +603,18 @@ write_image_descriptor(struct radv_device *device, >struct radv_cmd_buffer *cmd_buffer, >unsigned *dst, >struct radeon_winsys_bo **buffer_list, > + VkDescriptorType descriptor_type, >const VkDescriptorImageInfo *image_info) > { > RADV_FROM_HANDLE(radv_image_view, iview, image_info->imageView); > - memcpy(dst, iview->descriptor, 8 * 4); > - memcpy(dst + 8, iview->fmask_descriptor, 8 * 4); > + > + if (descriptor_type == VK_DESCRIPTOR_TYPE_STORAGE_IMAGE) { > + memcpy(dst, iview->storage_descriptor, 8 * 4); > + memcpy(dst + 8, iview->storage_fmask_descriptor, 8 * 4); > + } else { > + memcpy(dst, iview->descriptor, 8 * 4); > + memcpy(dst + 8, iview->fmask_descriptor, 8 * 4); > + } > > if (cmd_buffer) > device->ws->cs_add_buffer(cmd_buffer->cs, iview->bo, 7); > @@ -620,12 +627,13 @@ write_combined_image_sampler_descriptor(struct > radv_device *device, > struct radv_cmd_buffer *cmd_buffer, > unsigned *dst, > struct radeon_winsys_bo **buffer_list, > + VkDescriptorType descriptor_type, > const VkDescriptorImageInfo > *image_info, > bool has_sampler) > { > RADV_FROM_HANDLE(radv_sampler, sampler, image_info->sampler); > > - write_image_descriptor(device, cmd_buffer, dst, buffer_list, > image_info); > + write_image_descriptor(device, cmd_buffer, dst, buffer_list, > descriptor_type, image_info); > /* copy over sampler state */ > if (has_sampler) > memcpy(dst + 16, sampler->state, 16); > @@ -696,10 +704,12 @@ void radv_update_descriptor_sets( > case VK_DESCRIPTOR_TYPE_STORAGE_IMAGE: > case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT: > write_image_descriptor(device, cmd_buffer, > ptr, buffer_list, > + > writeset->descriptorType, >writeset->pImageInfo + > j); > break; > case VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER: > > write_combined_image_sampler_descriptor(device, cmd_buffer, ptr, buffer_list, > + > writeset->descriptorType, > > writeset->pImageInfo + j, > > !binding_layout->immutable_samplers_offset); > if (copy_immutable_samplers) { > @@ -866,10 +876,12 @@ void radv_update_descriptor_set_with_template(struct > radv_device *device, > case VK_DESCRIPTOR_TYPE_STORAGE_IMAGE: > case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT: > write_image_descriptor(device, cmd_buffer, > pDst, buffer_list, > + > templ->entry[i].descriptor_type, >(struct > VkDescriptorImageInfo *) pSrc); > break; >
Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
> Just some style comments, feel free to ignore them. > Both comments are relevant, will address them in V3. Thanks Lionel. > On 20/07/17 12:35, aravindan.muthuku...@intel.com wrote: > > From: Aravindan Muthukumar> > > > This patch improves CPI Rate(Cycles per Instruction) and branch > > mispredict for i965. The function check_state() was showing CPI > > retired rate. > > > > Performance stats with android: > > CPI retired lowered by 28% (lower is better) Branch missprediction > > lowered by 13% (lower is better) 3DMark improved by 2% > > > > The dissassembly doesn't show difference, although above results were > > observed with patch. > > > > Signed-off-by: Aravindan Muthukumar > > Signedd-off-by: Yogesh Marathe > > Tested-by: Asish > > --- > > > > Changes since V1: > > - Removed memset() change > > - Changed commit message as per review comments > > > > src/mesa/drivers/dri/i965/brw_defines.h | 4 > > src/mesa/drivers/dri/i965/brw_state_upload.c | 12 > > 2 files changed, 12 insertions(+), 4 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > > b/src/mesa/drivers/dri/i965/brw_defines.h > > index 2a8dbf8..8c9a510 100644 > > --- a/src/mesa/drivers/dri/i965/brw_defines.h > > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > > @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { > > # define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) > > > > #endif > > + > > +/* Checking the state of mesa and brw before emitting atoms */ > > +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) > > + > > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c > > b/src/mesa/drivers/dri/i965/brw_state_upload.c > > index acaa97e..1c8b969 100644 > > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c > > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c > > @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, > > struct brw_state_flags *state, > > const struct brw_tracked_state *atom) > > { > > - if (check_state(state, >dirty)) { > > atom->emit(brw); > > merge_ctx_state(brw, state); > > - } > > You might want to re-indent this. > Also maybe that function can be rename since it won't check anything anymore. > > > } > > > > static inline void > > @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw, > > const struct brw_tracked_state *atom = [i]; > > struct brw_state_flags generated; > > > > - check_and_emit_atom(brw, , atom); > > + /* Checking the state and emitting atoms */ > > + if (CHECK_BRW_STATE(state, atom->dirty)) { > > +check_and_emit_atom(brw, , atom); > > + } > > > > accumulate_state(, >dirty); > > > > @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw, > > for (i = 0; i < num_atoms; i++) { > > const struct brw_tracked_state *atom = [i]; > > > > - check_and_emit_atom(brw, , atom); > > + /* Checking the state and emitting atoms */ > > + if (CHECK_BRW_STATE(state, atom->dirty)) { > > +check_and_emit_atom(brw, , atom); > > + } > > } > > } > > > > > Why not replacing the last call to check_state() by CHECK_BRW_STATE() and get > rid of that function altogether? > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
> -Original Message- > From: Ian Romanick [mailto:i...@freedesktop.org] > Sent: Friday, July 21, 2017 2:24 AM > To: Marathe, Yogesh; Muthukumar, Aravindan > ; mesa-dev@lists.freedesktop.org > Subject: Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks > > On 07/20/2017 12:57 PM, Marathe, Yogesh wrote: > > Ian, > > > >> -Original Message- > >> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On > >> Behalf Of Ian Romanick > >> Sent: Friday, July 21, 2017 12:33 AM > >> To: Muthukumar, Aravindan ; mesa- > >> d...@lists.freedesktop.org > >> Subject: Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag > >> checks > >> > >> On 07/20/2017 04:35 AM, aravindan.muthuku...@intel.com wrote: > >>> From: Aravindan Muthukumar > >>> > >>> This patch improves CPI Rate(Cycles per Instruction) and branch > >>> mispredict for i965. The function check_state() was showing CPI > >>> retired rate. > >>> > >>> Performance stats with android: > >>> CPI retired lowered by 28% (lower is better) Branch missprediction > >>> lowered by 13% (lower is better) 3DMark improved by 2% > >>> > >>> The dissassembly doesn't show difference, although above results > >>> were observed with patch. > >>> > >>> Signed-off-by: Aravindan Muthukumar > >>> Signedd-off-by: Yogesh Marathe > >> > >> Signed-off-by > > > > Thanks. Will correct it. May I add you and all who commented as Reviewed- > by? > > I won't make a V3 for this since its a change in commit msg. > > No. You don't add someone's R-b unless they actually say "Reviewed-by". > Various people still have issues with the content of this change. > Ok. Got that. Thanks. > >>> Tested-by: Asish > >>> --- > >>> > >>> Changes since V1: > >>> - Removed memset() change > >>> - Changed commit message as per review comments > >> > >> This information should be in the main part of the commit message. > >> > > > > Sure. > > > >>> > >>> src/mesa/drivers/dri/i965/brw_defines.h | 4 > >>> src/mesa/drivers/dri/i965/brw_state_upload.c | 12 > >>> 2 files changed, 12 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > >>> b/src/mesa/drivers/dri/i965/brw_defines.h > >>> index 2a8dbf8..8c9a510 100644 > >>> --- a/src/mesa/drivers/dri/i965/brw_defines.h > >>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h > >>> @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { > # > >>> define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) > >>> > >>> #endif > >>> + > >>> +/* Checking the state of mesa and brw before emitting atoms */ > >>> +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) > >>> + > >>> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c > >>> b/src/mesa/drivers/dri/i965/brw_state_upload.c > >>> index acaa97e..1c8b969 100644 > >>> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c > >>> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c > >>> @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, > >>> struct brw_state_flags *state, > >>> const struct brw_tracked_state *atom) { > >>> - if (check_state(state, >dirty)) { > >>>atom->emit(brw); > >>>merge_ctx_state(brw, state); > >>> - } > >>> } > >>> > >>> static inline void > >>> @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context > *brw, > >>>const struct brw_tracked_state *atom = [i]; > >>>struct brw_state_flags generated; > >>> > >>> - check_and_emit_atom(brw, , atom); > >>> + /* Checking the state and emitting atoms */ > >>> + if (CHECK_BRW_STATE(state, atom->dirty)) { > >>> +check_and_emit_atom(brw, , atom); > >>> + } > >>> > >>>accumulate_state(, >dirty); > >>> > >>> @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context > *brw, > >>>for (i = 0; i < num_atoms; i++) { > >>>const struct brw_tracked_state *atom = [i]; > >>> > >>> - check_and_emit_atom(brw, , atom); > >>> + /* Checking the state and emitting atoms */ > >>> + if (CHECK_BRW_STATE(state, atom->dirty)) { > >>> +check_and_emit_atom(brw, , atom); > >>> + } > >>>} > >>> } > >>> > >>> > >> > >> ___ > >> mesa-dev mailing list > >> mesa-dev@lists.freedesktop.org > >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] compiler: move glsl_interface_packing enum to shader_enums.h
This allows us to drop the duplicate gl_uniform_block_packing enum. --- src/compiler/glsl/link_uniform_blocks.cpp | 11 +-- src/compiler/glsl_types.h | 9 ++--- src/compiler/shader_enums.h | 7 +++ src/mesa/main/mtypes.h| 11 +-- 4 files changed, 11 insertions(+), 27 deletions(-) diff --git a/src/compiler/glsl/link_uniform_blocks.cpp b/src/compiler/glsl/link_uniform_blocks.cpp index 249a767..ef2f29d 100644 --- a/src/compiler/glsl/link_uniform_blocks.cpp +++ b/src/compiler/glsl/link_uniform_blocks.cpp @@ -280,7 +280,7 @@ process_block_array_leaf(const char *name, blocks[i].Binding = (b->has_binding) ? b->binding + *binding_offset : 0; blocks[i].UniformBufferSize = 0; - blocks[i]._Packing = gl_uniform_block_packing(type->interface_packing); + blocks[i]._Packing = glsl_interface_packing(type->interface_packing); blocks[i]._RowMajor = type->get_interface_row_major(); blocks[i].linearized_array_index = linearized_index; @@ -354,15 +354,6 @@ create_buffer_blocks(void *mem_ctx, struct gl_context *ctx, */ ubo_visitor parcel(blocks, variables, num_variables, prog); - STATIC_ASSERT(unsigned(GLSL_INTERFACE_PACKING_STD140) - == unsigned(ubo_packing_std140)); - STATIC_ASSERT(unsigned(GLSL_INTERFACE_PACKING_SHARED) - == unsigned(ubo_packing_shared)); - STATIC_ASSERT(unsigned(GLSL_INTERFACE_PACKING_PACKED) - == unsigned(ubo_packing_packed)); - STATIC_ASSERT(unsigned(GLSL_INTERFACE_PACKING_STD430) - == unsigned(ubo_packing_std430)); - unsigned i = 0; struct hash_entry *entry; hash_table_foreach (block_hash, entry) { diff --git a/src/compiler/glsl_types.h b/src/compiler/glsl_types.h index 2857dc9..f67465e 100644 --- a/src/compiler/glsl_types.h +++ b/src/compiler/glsl_types.h @@ -28,6 +28,8 @@ #include #include +#include "shader_enums.h" + #ifdef __cplusplus extern "C" { #endif @@ -101,13 +103,6 @@ enum glsl_sampler_dim { GLSL_SAMPLER_DIM_SUBPASS_MS, /* for multisampled vulkan input attachments */ }; -enum glsl_interface_packing { - GLSL_INTERFACE_PACKING_STD140, - GLSL_INTERFACE_PACKING_SHARED, - GLSL_INTERFACE_PACKING_PACKED, - GLSL_INTERFACE_PACKING_STD430 -}; - enum glsl_matrix_layout { /** * The layout of the matrix is inherited from the object containing the diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h index 352f270..2f20e68 100644 --- a/src/compiler/shader_enums.h +++ b/src/compiler/shader_enums.h @@ -567,6 +567,13 @@ enum glsl_interp_mode INTERP_MODE_COUNT /**< Number of interpolation qualifiers */ }; +enum glsl_interface_packing { + GLSL_INTERFACE_PACKING_STD140, + GLSL_INTERFACE_PACKING_SHARED, + GLSL_INTERFACE_PACKING_PACKED, + GLSL_INTERFACE_PACKING_STD430 +}; + const char *glsl_interp_mode_name(enum glsl_interp_mode qual); /** diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 3ce2df7..4970329 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2645,15 +2645,6 @@ struct gl_uniform_buffer_variable }; -enum gl_uniform_block_packing -{ - ubo_packing_std140, - ubo_packing_shared, - ubo_packing_packed, - ubo_packing_std430 -}; - - struct gl_uniform_block { /** Declared name of the uniform block */ @@ -2699,7 +2690,7 @@ struct gl_uniform_block * This isn't accessible through the API, but it is used while * cross-validating uniform blocks. */ - enum gl_uniform_block_packing _Packing; + enum glsl_interface_packing _Packing; GLboolean _RowMajor; }; -- 2.9.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/20] nir: Support lowering vote intrinsics
On Tue, Jul 18, 2017 at 1:34 PM, Connor Abbottwrote: > On Mon, Jul 10, 2017 at 10:18 AM, Matt Turner wrote: >> On Thu, Jul 6, 2017 at 8:04 PM, Connor Abbott wrote: >>> On Thu, Jul 6, 2017 at 4:48 PM, Matt Turner wrote: ... trivially (as allowed by the spec!) by reusing the existing nir_opt_intrinsics code. --- src/compiler/nir/nir.h| 4 src/compiler/nir/nir_opt_intrinsics.c | 6 +++--- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index 44a1d0887e..401c41f155 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -1821,6 +1821,10 @@ typedef struct nir_shader_compiler_options { bool lower_extract_byte; bool lower_extract_word; + bool lower_vote_any; + bool lower_vote_all; + bool lower_vote_eq; >>> >>> Since there are potentially multiple ways to lower these (voteAny(x) >>> -> !voteAll(!x), using ballotARB(), etc.), and the way they're lowered >>> is a little... unexpected (although admittedly legal!), why don't we >>> use a more descriptive name, like lower_vote_*_trivial? While we're at >>> it, I highly doubt that an implementation would want this kind of >>> lowering for just one of the intrinsics, so we can merge this into a >>> single flag, say lower_vote_trivial. >> >> Thanks, both good ideas. I've replaced all three fields with a >> lower_vote_trivial field. > > I had a closer look at your branch with the updated patch, and the > logic here, repeated in two places, seems backwards: > > if (!val || b.shader->options->lower_vote_trivial) >continue; > > This will skip processing the instruction at all if you set > lower_vote_trivial, even if val is non-NULL, which seems like the > opposite of what you want. Also, even once you fix this: > > if (!val && !b.shader->options->lower_vote_trivial) >continue; Indeed. Thanks. > > You'll still segfault in the vote_any/vote_all case if the source > isn't constant, since you'll try to dereference val when it doesn't > exist. You can fix this by changing the line below to: > > replacement = nir_ssa_for_src(, instr->src[0], 1); Needs to be s/instr/intrin/, but yeah, good catch :) > in the previous patch. I'm kinda nervous that lower_vote_trivial seems > untested, since it never would've worked as-is, but I can't see any > other problems so patches 2 & 3 get my R-b with these fixes. But you > might want to write some really simple vertex shader piglit tests, > even if you only use dynamically uniform arguments, to make sure this > is working correctly. Done. Tests will be on the piglit list shortly. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] mesa: remove pointless assignments in init_teximage_fields_ms()
This patch is Reviewed-by: Ian RomanickOn 07/20/2017 10:07 AM, Brian Paul wrote: > The NumSamples and FixedSampleLocation fields are set again later at > the end of the function so these earlier assignments aren't needed. > --- > src/mesa/main/teximage.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c > index c30f8ac..d55d9b0 100644 > --- a/src/mesa/main/teximage.c > +++ b/src/mesa/main/teximage.c > @@ -813,9 +813,6 @@ init_teximage_fields_ms(struct gl_context *ctx, > img->Width2 = width - 2 * border; /* == 1 << img->WidthLog2; */ > img->WidthLog2 = _mesa_logbase2(img->Width2); > > - img->NumSamples = 0; > - img->FixedSampleLocations = GL_TRUE; > - > switch(target) { > case GL_TEXTURE_1D: > case GL_TEXTURE_BUFFER: > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 101843] Latest mesa git fails to compile in mesa/main/marshal.c
https://bugs.freedesktop.org/show_bug.cgi?id=101843 Timothy Arcerichanged: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |WORKSFORME -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: add destroy_drawable interface
Looks good to me, but you might want to wait a day to see if there's any additional review feedback. Reviewed-by: Brian PaulOn 07/20/2017 12:26 PM, Charmaine Lee wrote: With this patch, the st manager will maintain a hash table for the active framebuffer interface objects. A destroy_drawable interface is added to allow the state tracker to notify the st manager to remove the associated framebuffer interface object from the hash table, so the associated framebuffer and its resources can be deleted at framebuffers purge time. Fixes bug 101829 "read-after-free in st_framebuffer_validate" Tested-by: Brad King Tested-by: Gert Wollny --- src/gallium/include/state_tracker/st_api.h| 7 ++ src/gallium/state_trackers/dri/dri_drawable.c | 6 +- src/gallium/state_trackers/glx/xlib/xm_api.c | 5 ++ src/gallium/state_trackers/glx/xlib/xm_st.c | 2 + src/gallium/state_trackers/wgl/stw_st.c | 6 +- src/mesa/state_tracker/st_manager.c | 95 ++- src/mesa/state_tracker/st_manager.h | 5 ++ 7 files changed, 123 insertions(+), 3 deletions(-) diff --git a/src/gallium/include/state_tracker/st_api.h b/src/gallium/include/state_tracker/st_api.h index 30a4866..9b660f7 100644 --- a/src/gallium/include/state_tracker/st_api.h +++ b/src/gallium/include/state_tracker/st_api.h @@ -552,6 +552,13 @@ struct st_api * Get the currently bound context in the calling thread. */ struct st_context_iface *(*get_current)(struct st_api *stapi); + + /** +* Notify the st manager the framebuffer interface object +* is no longer valid. +*/ + void (*destroy_drawable)(struct st_api *stapi, +struct st_framebuffer_iface *stfbi); }; /** diff --git a/src/gallium/state_trackers/dri/dri_drawable.c b/src/gallium/state_trackers/dri/dri_drawable.c index 0cfdc30..c7df0f6 100644 --- a/src/gallium/state_trackers/dri/dri_drawable.c +++ b/src/gallium/state_trackers/dri/dri_drawable.c @@ -169,6 +169,8 @@ void dri_destroy_buffer(__DRIdrawable * dPriv) { struct dri_drawable *drawable = dri_drawable(dPriv); + struct dri_screen *screen = drawable->screen; + struct st_api *stapi = screen->st_api; int i; pipe_surface_reference(>drisw_surface, NULL); @@ -180,7 +182,9 @@ dri_destroy_buffer(__DRIdrawable * dPriv) swap_fences_unref(drawable); - drawable->base.ID = 0; + /* Notify the st manager that this drawable is no longer valid */ + stapi->destroy_drawable(stapi, >base); + FREE(drawable); } diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.c b/src/gallium/state_trackers/glx/xlib/xm_api.c index 881dd44..e4b1e9d 100644 --- a/src/gallium/state_trackers/glx/xlib/xm_api.c +++ b/src/gallium/state_trackers/glx/xlib/xm_api.c @@ -595,6 +595,11 @@ xmesa_free_buffer(XMesaBuffer buffer) */ b->ws.drawable = 0; + /* Notify the st manager that the associated framebuffer interface + * object is no longer valid. + */ + stapi->destroy_drawable(stapi, buffer->stfb); + /* XXX we should move the buffer to a delete-pending list and destroy * the buffer until it is no longer current. */ diff --git a/src/gallium/state_trackers/glx/xlib/xm_st.c b/src/gallium/state_trackers/glx/xlib/xm_st.c index 9e30efa..6a0f4aa 100644 --- a/src/gallium/state_trackers/glx/xlib/xm_st.c +++ b/src/gallium/state_trackers/glx/xlib/xm_st.c @@ -273,6 +273,7 @@ xmesa_st_framebuffer_flush_front(struct st_context_iface *stctx, return ret; } +static uint32_t xmesa_stfbi_ID = 0; struct st_framebuffer_iface * xmesa_create_st_framebuffer(XMesaDisplay xmdpy, XMesaBuffer b) @@ -302,6 +303,7 @@ xmesa_create_st_framebuffer(XMesaDisplay xmdpy, XMesaBuffer b) stfbi->visual = >stvis; stfbi->flush_front = xmesa_st_framebuffer_flush_front; stfbi->validate = xmesa_st_framebuffer_validate; + stfbi->ID = p_atomic_inc_return(_stfbi_ID); p_atomic_set(>stamp, 1); stfbi->st_manager_private = (void *) xstfb; diff --git a/src/gallium/state_trackers/wgl/stw_st.c b/src/gallium/state_trackers/wgl/stw_st.c index c2844b0..85a8b17 100644 --- a/src/gallium/state_trackers/wgl/stw_st.c +++ b/src/gallium/state_trackers/wgl/stw_st.c @@ -256,7 +256,11 @@ stw_st_destroy_framebuffer_locked(struct st_framebuffer_iface *stfb) for (i = 0; i < ST_ATTACHMENT_COUNT; i++) pipe_resource_reference(>textures[i], NULL); - stwfb->base.ID = 0; + /* Notify the st manager that the framebuffer interface is no +* longer valid. +*/ + stw_dev->stapi->destroy_drawable(stw_dev->stapi, >base); + FREE(stwfb); } diff --git a/src/mesa/state_tracker/st_manager.c b/src/mesa/state_tracker/st_manager.c index cb816de..ebc7ca8 100644 --- a/src/mesa/state_tracker/st_manager.c +++ b/src/mesa/state_tracker/st_manager.c @@ -38,6 +38,7 @@ #include
[Mesa-dev] [PATCH] travis: add build configuration for SWR AVX512
--- .travis.yml | 32 1 file changed, 32 insertions(+) diff --git a/.travis.yml b/.travis.yml index 246ad30eff..da491e0396 100644 --- a/.travis.yml +++ b/.travis.yml @@ -82,6 +82,38 @@ matrix: - libx11-xcb-dev - libelf-dev - env: +# NOTE: Building SWR is 2x (yes two) times slower than all the other +# gallium drivers combined. +# Start this early so that it doesn't hinder the run time. +- LABEL="make Gallium Drivers SWR AVX512" +- BUILD=make +- MAKEFLAGS="-j4" +- MAKE_CHECK_COMMAND="true" +- LLVM_VERSION=3.9 +- LLVM_CONFIG="llvm-config-${LLVM_VERSION}" +- OVERRIDE_CC="gcc-6.3" +- OVERRIDE_CXX="g++-6.3" +- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl" +- DRI_DRIVERS="" +- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa --with-swr-archs=knl,skx" +- GALLIUM_DRIVERS="swr" +- VULKAN_DRIVERS="" + addons: +apt: + sources: +- llvm-toolchain-trusty-3.9 + packages: +# LLVM packaging is broken and misses these dependencies +- libedit-dev +# From sources above +- llvm-3.9-dev +# Common +- xz-utils +- x11proto-xf86vidmode-dev +- libexpat1-dev +- libx11-xcb-dev +- libelf-dev +- env: - LABEL="make Gallium Drivers Other" - BUILD=make - MAKEFLAGS="-j4" -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] swr/rast: cache line align hottile buffers
Prevents unalignment crashes with avx512 code on gcc/clang. --- src/gallium/drivers/swr/rasterizer/core/tilemgr.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/tilemgr.cpp b/src/gallium/drivers/swr/rasterizer/core/tilemgr.cpp index eb60eb4081..a6c54ab86e 100644 --- a/src/gallium/drivers/swr/rasterizer/core/tilemgr.cpp +++ b/src/gallium/drivers/swr/rasterizer/core/tilemgr.cpp @@ -100,7 +100,7 @@ HOTTILE* HotTileMgr::GetHotTile(SWR_CONTEXT* pContext, DRAW_CONTEXT* pDC, uint32 { uint32_t size = numSamples * mHotTileSize[attachment]; uint32_t numaNode = ((x ^ y) & pContext->threadPool.numaMask); -hotTile.pBuffer = (uint8_t*)AllocHotTileMem(size, KNOB_SIMD_WIDTH * 4, numaNode); +hotTile.pBuffer = (uint8_t*)AllocHotTileMem(size, 64, numaNode); hotTile.state = HOTTILE_INVALID; hotTile.numSamples = numSamples; hotTile.renderTargetArrayIndex = renderTargetArrayIndex; @@ -124,7 +124,7 @@ HOTTILE* HotTileMgr::GetHotTile(SWR_CONTEXT* pContext, DRAW_CONTEXT* pDC, uint32 uint32_t size = numSamples * mHotTileSize[attachment]; uint32_t numaNode = ((x ^ y) & pContext->threadPool.numaMask); -hotTile.pBuffer = (uint8_t*)AllocHotTileMem(size, KNOB_SIMD_WIDTH * 4, numaNode); +hotTile.pBuffer = (uint8_t*)AllocHotTileMem(size, 64, numaNode); hotTile.state = HOTTILE_INVALID; hotTile.numSamples = numSamples; } @@ -194,7 +194,7 @@ HOTTILE* HotTileMgr::GetHotTileNoLoad( if (create) { uint32_t size = numSamples * mHotTileSize[attachment]; -hotTile.pBuffer = (uint8_t*)AlignedMalloc(size, KNOB_SIMD_WIDTH * 4); +hotTile.pBuffer = (uint8_t*)AlignedMalloc(size, 64); hotTile.state = HOTTILE_INVALID; hotTile.numSamples = numSamples; hotTile.renderTargetArrayIndex = 0; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] swr/rast: fix memory paths for avx512 optimized avx/sse
Source/destination will not be AVX512 aligned, use the unaligned load/store intrinsics. --- .../drivers/swr/rasterizer/common/simdlib_128_avx512.inl | 10 +- .../drivers/swr/rasterizer/common/simdlib_256_avx512.inl | 10 +- 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/common/simdlib_128_avx512.inl b/src/gallium/drivers/swr/rasterizer/common/simdlib_128_avx512.inl index aaa74146ad..012f3105e9 100644 --- a/src/gallium/drivers/swr/rasterizer/common/simdlib_128_avx512.inl +++ b/src/gallium/drivers/swr/rasterizer/common/simdlib_128_avx512.inl @@ -294,12 +294,12 @@ SIMD_IWRAPPER_2_8(unpacklo_epi8); //--- static SIMDINLINE Float SIMDCALL load_ps(float const *p) // return *p (loads SIMD width elements from memory) { -return __conv(_mm512_maskz_load_ps(__mmask16(0xf), p)); +return __conv(_mm512_maskz_loadu_ps(__mmask16(0xf), p)); } static SIMDINLINE Integer SIMDCALL load_si(Integer const *p) // return *p { -return __conv(_mm512_maskz_load_epi32(__mmask16(0xf), p)); +return __conv(_mm512_maskz_loadu_epi32(__mmask16(0xf), p)); } static SIMDINLINE Float SIMDCALL loadu_ps(float const *p) // return *p (same as load_ps but allows for unaligned mem) @@ -353,17 +353,17 @@ static SIMDINLINE void SIMDCALL maskstore_ps(float *p, Integer mask, Float src) { __mmask16 m = 0xf; m = _mm512_mask_test_epi32_mask(m, __conv(mask), _mm512_set1_epi32(0x8000)); -_mm512_mask_store_ps(p, m, __conv(src)); +_mm512_mask_storeu_ps(p, m, __conv(src)); } static SIMDINLINE void SIMDCALL store_ps(float *p, Float a)// *p = a (stores all elements contiguously in memory) { -_mm512_mask_store_ps(p, __mmask16(0xf), __conv(a)); +_mm512_mask_storeu_ps(p, __mmask16(0xf), __conv(a)); } static SIMDINLINE void SIMDCALL store_si(Integer *p, Integer a) // *p = a { -_mm512_mask_store_epi32(p, __mmask16(0xf), __conv(a)); +_mm512_mask_storeu_epi32(p, __mmask16(0xf), __conv(a)); } //=== diff --git a/src/gallium/drivers/swr/rasterizer/common/simdlib_256_avx512.inl b/src/gallium/drivers/swr/rasterizer/common/simdlib_256_avx512.inl index 5103bdafa2..a8d2a4b8bf 100644 --- a/src/gallium/drivers/swr/rasterizer/common/simdlib_256_avx512.inl +++ b/src/gallium/drivers/swr/rasterizer/common/simdlib_256_avx512.inl @@ -295,12 +295,12 @@ SIMD_IWRAPPER_2_8(unpacklo_epi8); //--- static SIMDINLINE Float SIMDCALL load_ps(float const *p) // return *p (loads SIMD width elements from memory) { -return __conv(_mm512_maskz_load_ps(__mmask16(0xff), p)); +return __conv(_mm512_maskz_loadu_ps(__mmask16(0xff), p)); } static SIMDINLINE Integer SIMDCALL load_si(Integer const *p) // return *p { -return __conv(_mm512_maskz_load_epi32(__mmask16(0xff), p)); +return __conv(_mm512_maskz_loadu_epi32(__mmask16(0xff), p)); } static SIMDINLINE Float SIMDCALL loadu_ps(float const *p) // return *p (same as load_ps but allows for unaligned mem) @@ -354,17 +354,17 @@ static SIMDINLINE void SIMDCALL maskstore_ps(float *p, Integer mask, Float src) { __mmask16 m = 0xff; m = _mm512_mask_test_epi32_mask(m, __conv(mask), _mm512_set1_epi32(0x8000)); -_mm512_mask_store_ps(p, m, __conv(src)); +_mm512_mask_storeu_ps(p, m, __conv(src)); } static SIMDINLINE void SIMDCALL store_ps(float *p, Float a)// *p = a (stores all elements contiguously in memory) { -_mm512_mask_store_ps(p, __mmask16(0xff), __conv(a)); +_mm512_mask_storeu_ps(p, __mmask16(0xff), __conv(a)); } static SIMDINLINE void SIMDCALL store_si(Integer *p, Integer a) // *p = a { -_mm512_mask_store_epi32(p, __mmask16(0xff), __conv(a)); +_mm512_mask_storeu_epi32(p, __mmask16(0xff), __conv(a)); } //=== -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] swr/rast: simdlib changes for clang/gcc
Tested with clang-4.0 and gcc-6.3. --- .../swr/rasterizer/common/simdlib_512_avx512.inl | 43 +- .../swr/rasterizer/common/simdlib_types.hpp| 2 +- 2 files changed, 35 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/common/simdlib_512_avx512.inl b/src/gallium/drivers/swr/rasterizer/common/simdlib_512_avx512.inl index 7d90b7d1b0..7447d35ee2 100644 --- a/src/gallium/drivers/swr/rasterizer/common/simdlib_512_avx512.inl +++ b/src/gallium/drivers/swr/rasterizer/common/simdlib_512_avx512.inl @@ -24,6 +24,21 @@ #error Do not include this file directly, use "simdlib.hpp" instead. #endif +#if defined(__GNUC__) && !defined( __clang__) && !defined(__INTEL_COMPILER) +// gcc missing these intrinsics +#ifndef _mm512_cmpneq_ps_mask +#define _mm512_cmpneq_ps_mask(a,b) _mm512_cmp_ps_mask((a),(b),_CMP_NEQ_UQ) +#endif + +#ifndef _mm512_cmplt_ps_mask +#define _mm512_cmplt_ps_mask(a,b) _mm512_cmp_ps_mask((a),(b),_CMP_LT_OS) +#endif + +#ifndef _mm512_cmplt_pd_mask +#define _mm512_cmplt_pd_mask(a,b) _mm512_cmp_pd_mask((a),(b),_CMP_LT_OS) +#endif +#endif + // // SIMD16 AVX512 (F) implementation // @@ -138,6 +153,17 @@ using SIMD256T = SIMD256Impl::AVX2Impl; } #define SIMD_IWRAPPER_2I(op) SIMD_IWRAPPER_2I_(op, op) +#define SIMD_EMU_IWRAPPER_2(op) \ +static SIMDINLINE \ +Integer SIMDCALL op(Integer a, Integer b)\ +{\ +return Integer\ +{\ +SIMD256T::op(a.v8[0], b.v8[0]),\ +SIMD256T::op(a.v8[1], b.v8[1]),\ +};\ +} + private: static SIMDINLINE Integer vmask(__mmask8 m) { @@ -234,14 +260,6 @@ SIMD_IWRAPPER_1I(slli_epi32); // return a << ImmT SIMD_IWRAPPER_2(sllv_epi32); SIMD_IWRAPPER_1I(srai_epi32); // return a >> ImmT (int32) SIMD_IWRAPPER_1I(srli_epi32); // return a >> ImmT (uint32) -SIMD_IWRAPPER_1I_(srli_si, srli_si512); // return a >> (ImmT*8) (uint) - -template // same as srli_si, but with Float cast to int -static SIMDINLINE Float SIMDCALL srlisi_ps(Float a) -{ -return castsi_ps(srli_si(castps_si(a))); -} - SIMD_IWRAPPER_2(srlv_epi32); //--- @@ -443,10 +461,17 @@ static SIMDINLINE Integer SIMDCALL insert_si(Integer a, SIMD256Impl::Integer b) return _mm512_inserti64x4(a, b, imm); } +#if !defined(AVX512F_STRICT) SIMD_IWRAPPER_2(packs_epi16); // See documentation for _mm512_packs_epi16 and _mm512_packs_epi16 SIMD_IWRAPPER_2(packs_epi32); // See documentation for _mm512_packs_epi32 and _mm512_packs_epi32 SIMD_IWRAPPER_2(packus_epi16); // See documentation for _mm512_packus_epi16 and _mm512_packus_epi16 SIMD_IWRAPPER_2(packus_epi32); // See documentation for _mm512_packus_epi32 and _mm512_packus_epi32 +#else +SIMD_EMU_IWRAPPER_2(packs_epi16) +SIMD_EMU_IWRAPPER_2(packs_epi32) +SIMD_EMU_IWRAPPER_2(packus_epi16) +SIMD_EMU_IWRAPPER_2(packus_epi32) +#endif static SIMDINLINE Integer SIMDCALL permute_epi32(Integer a, Integer swiz) // return a[swiz[i]] for each 32-bit lane i (float) { @@ -679,4 +704,4 @@ static SIMDINLINE Float SIMDCALL vmask_ps(int32_t mask) #undef SIMD_IWRAPPER_2 #undef SIMD_IWRAPPER_2_ #undef SIMD_IWRAPPER_2I - +#undef SIMD_EMU_IWRAPPER_2 diff --git a/src/gallium/drivers/swr/rasterizer/common/simdlib_types.hpp b/src/gallium/drivers/swr/rasterizer/common/simdlib_types.hpp index 07775e7b83..bc23867c7b 100644 --- a/src/gallium/drivers/swr/rasterizer/common/simdlib_types.hpp +++ b/src/gallium/drivers/swr/rasterizer/common/simdlib_types.hpp @@ -262,7 +262,7 @@ namespace SIMDImpl namespace SIMD512Impl { -#if !defined(_MM_K0_REG) +#if !defined(__AVX512F__) // Define AVX512 types if not included via immintrin.h. // All data members of these types are ONLY to viewed // in a debugger. Do NOT access them via code! -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/13] i965: Always create the batch with the batch object in the first execobject slot
Quoting Kenneth Graunke (2017-07-19 23:43:04) > On Wednesday, July 19, 2017 3:09:17 AM PDT Chris Wilson wrote: > > Even if we are using older kernels that do not accept the batch in the > > first slot, we can simplify our code by creating the batch with itself > > in the first slot and moving it to the end on execbuf submission. > > --- > > src/mesa/drivers/dri/i965/intel_batchbuffer.c | 70 > > --- > > 1 file changed, 31 insertions(+), 39 deletions(-) > > Alternatively, instead of swapping them out, we could simply add_exec_bo the > batch at the end, and in execbuffer() do: > > if (!use_batch_first) { >execbuf.buffers_ptr++; >execbuf.buffers_count--; > } > > to skip over the batchbuffer entry at the beginning. That seems easier... Ran into trouble with this because of the deduplication we do for batch->exec_bos[]. It kept insisting that I had added the batch first... Doing a swap of first/last validation entry looks more pleasant than the various hacks I have to skip the deduplication, or add a special case add_batch_bo(). -Chris ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: only support 4x, 8x, 16x msaa
Reviewed-by: Charmaine LeeFrom: Brian Paul Sent: Thursday, July 20, 2017 1:54 PM To: mesa-dev@lists.freedesktop.org Cc: Charmaine Lee; Neha Bhende Subject: [PATCH] svga: only support 4x, 8x, 16x msaa Skip 2x MSAA, for example, since it's seldom used and just bloats the list of pixel formats. --- src/gallium/drivers/svga/svga_screen.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/gallium/drivers/svga/svga_screen.c b/src/gallium/drivers/svga/svga_screen.c index 1ec91e5..77223c9 100644 --- a/src/gallium/drivers/svga/svga_screen.c +++ b/src/gallium/drivers/svga/svga_screen.c @@ -1116,6 +1116,11 @@ svga_screen_create(struct svga_winsys_screen *sws) get_uint_cap(sws, SVGA3D_DEVCAP_MULTISAMPLE_MASKABLESAMPLES, 0); } + /* We only support 4x, 8x, 16x MSAA */ + svgascreen->ms_samples &= ((1 << (4-1)) | + (1 << (8-1)) | + (1 << (16-1))); + /* Maximum number of constant buffers */ svgascreen->max_const_buffers = get_uint_cap(sws, SVGA3D_DEVCAP_DX_MAX_CONSTANT_BUFFERS, 1); -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
Just some style comments, feel free to ignore them. On 20/07/17 12:35, aravindan.muthuku...@intel.com wrote: From: Aravindan MuthukumarThis patch improves CPI Rate(Cycles per Instruction) and branch mispredict for i965. The function check_state() was showing CPI retired rate. Performance stats with android: CPI retired lowered by 28% (lower is better) Branch missprediction lowered by 13% (lower is better) 3DMark improved by 2% The dissassembly doesn't show difference, although above results were observed with patch. Signed-off-by: Aravindan Muthukumar Signedd-off-by: Yogesh Marathe Tested-by: Asish --- Changes since V1: - Removed memset() change - Changed commit message as per review comments src/mesa/drivers/dri/i965/brw_defines.h | 4 src/mesa/drivers/dri/i965/brw_state_upload.c | 12 2 files changed, 12 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 2a8dbf8..8c9a510 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { # define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) #endif + +/* Checking the state of mesa and brw before emitting atoms */ +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) + diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c index acaa97e..1c8b969 100644 --- a/src/mesa/drivers/dri/i965/brw_state_upload.c +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, struct brw_state_flags *state, const struct brw_tracked_state *atom) { - if (check_state(state, >dirty)) { atom->emit(brw); merge_ctx_state(brw, state); - } You might want to re-indent this. Also maybe that function can be rename since it won't check anything anymore. } static inline void @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw, const struct brw_tracked_state *atom = [i]; struct brw_state_flags generated; - check_and_emit_atom(brw, , atom); + /* Checking the state and emitting atoms */ + if (CHECK_BRW_STATE(state, atom->dirty)) { +check_and_emit_atom(brw, , atom); + } accumulate_state(, >dirty); @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw, for (i = 0; i < num_atoms; i++) { const struct brw_tracked_state *atom = [i]; - check_and_emit_atom(brw, , atom); + /* Checking the state and emitting atoms */ + if (CHECK_BRW_STATE(state, atom->dirty)) { +check_and_emit_atom(brw, , atom); + } } } Why not replacing the last call to check_state() by CHECK_BRW_STATE() and get rid of that function altogether? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] egl/wayland: Fix linking libEGL_common.la
On Thu, Jul 20, 2017 at 1:25 PM, Mike Lothianwrote: > Because libmesautil.la includes string_to_uint_map.o, -lstdc++ is > required for linking to succeed > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101851 > > Signed-off-by: Mike Lothian > --- > src/egl/Makefile.am | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am > index 7c1a4929b8..830ed52b86 100644 > --- a/src/egl/Makefile.am > +++ b/src/egl/Makefile.am > @@ -83,7 +83,7 @@ AM_CFLAGS += $(WAYLAND_CFLAGS) > libEGL_common_la_LIBADD += $(WAYLAND_LIBS) > libEGL_common_la_LIBADD += $(LIBDRM_LIBS) > libEGL_common_la_LIBADD += > $(top_builddir)/src/egl/wayland/wayland-drm/libwayland-drm.la > -libEGL_common_la_LIBADD += $(top_builddir)/src/util/libmesautil.la > +libEGL_common_la_LIBADD += $(top_builddir)/src/util/libmesautil.la -lstdc++ Instead of adding -lstdc++ to LIBADD, you want to use the C++ linker by specifying a dummy cpp file in EXTRA_*_SOURCES. grep for dummy.cpp and you'll fine examples. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] svga: only support 4x, 8x, 16x msaa
Skip 2x MSAA, for example, since it's seldom used and just bloats the list of pixel formats. --- src/gallium/drivers/svga/svga_screen.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/gallium/drivers/svga/svga_screen.c b/src/gallium/drivers/svga/svga_screen.c index 1ec91e5..77223c9 100644 --- a/src/gallium/drivers/svga/svga_screen.c +++ b/src/gallium/drivers/svga/svga_screen.c @@ -1116,6 +1116,11 @@ svga_screen_create(struct svga_winsys_screen *sws) get_uint_cap(sws, SVGA3D_DEVCAP_MULTISAMPLE_MASKABLESAMPLES, 0); } + /* We only support 4x, 8x, 16x MSAA */ + svgascreen->ms_samples &= ((1 << (4-1)) | + (1 << (8-1)) | + (1 << (16-1))); + /* Maximum number of constant buffers */ svgascreen->max_const_buffers = get_uint_cap(sws, SVGA3D_DEVCAP_DX_MAX_CONSTANT_BUFFERS, 1); -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
On 07/20/2017 12:57 PM, Marathe, Yogesh wrote: > Ian, > >> -Original Message- >> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf >> Of Ian Romanick >> Sent: Friday, July 21, 2017 12:33 AM >> To: Muthukumar, Aravindan; mesa- >> d...@lists.freedesktop.org >> Subject: Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks >> >> On 07/20/2017 04:35 AM, aravindan.muthuku...@intel.com wrote: >>> From: Aravindan Muthukumar >>> >>> This patch improves CPI Rate(Cycles per Instruction) and branch >>> mispredict for i965. The function check_state() was showing CPI >>> retired rate. >>> >>> Performance stats with android: >>> CPI retired lowered by 28% (lower is better) Branch missprediction >>> lowered by 13% (lower is better) 3DMark improved by 2% >>> >>> The dissassembly doesn't show difference, although above results were >>> observed with patch. >>> >>> Signed-off-by: Aravindan Muthukumar >>> Signedd-off-by: Yogesh Marathe >> >> Signed-off-by > > Thanks. Will correct it. May I add you and all who commented as Reviewed-by? > I won't make a V3 for this since its a change in commit msg. No. You don't add someone's R-b unless they actually say "Reviewed-by". Various people still have issues with the content of this change. >>> Tested-by: Asish >>> --- >>> >>> Changes since V1: >>> - Removed memset() change >>> - Changed commit message as per review comments >> >> This information should be in the main part of the commit message. >> > > Sure. > >>> >>> src/mesa/drivers/dri/i965/brw_defines.h | 4 >>> src/mesa/drivers/dri/i965/brw_state_upload.c | 12 >>> 2 files changed, 12 insertions(+), 4 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h >>> b/src/mesa/drivers/dri/i965/brw_defines.h >>> index 2a8dbf8..8c9a510 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_defines.h >>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h >>> @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { # >>> define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) >>> >>> #endif >>> + >>> +/* Checking the state of mesa and brw before emitting atoms */ >>> +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) >>> + >>> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c >>> b/src/mesa/drivers/dri/i965/brw_state_upload.c >>> index acaa97e..1c8b969 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c >>> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c >>> @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, >>> struct brw_state_flags *state, >>> const struct brw_tracked_state *atom) { >>> - if (check_state(state, >dirty)) { >>>atom->emit(brw); >>>merge_ctx_state(brw, state); >>> - } >>> } >>> >>> static inline void >>> @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw, >>> const struct brw_tracked_state *atom = [i]; >>> struct brw_state_flags generated; >>> >>> - check_and_emit_atom(brw, , atom); >>> + /* Checking the state and emitting atoms */ >>> + if (CHECK_BRW_STATE(state, atom->dirty)) { >>> +check_and_emit_atom(brw, , atom); >>> + } >>> >>> accumulate_state(, >dirty); >>> >>> @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw, >>>for (i = 0; i < num_atoms; i++) { >>> const struct brw_tracked_state *atom = [i]; >>> >>> - check_and_emit_atom(brw, , atom); >>> + /* Checking the state and emitting atoms */ >>> + if (CHECK_BRW_STATE(state, atom->dirty)) { >>> +check_and_emit_atom(brw, , atom); >>> + } >>>} >>> } >>> >>> >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V3] i965 : Optimize atom state flag checks
From: Aravindan MuthukumarThis patch improves CPI Rate(Cycles per Instruction) and branch miss predict for i965. The function check_state() was showing CPI retired rate. Performance stats with android: - CPI retired lowered by 28% (lower is better) - Branch missprediction lowered by 13% (lower is better) - 3DMark improved by 2% The dissassembly doesn't show difference, although above results were observed with patch. V2: - Removed memset() change - Changed commit message as per review comments V3: - Indentation in check_and_emit_atom and commit msg corrected Signed-off-by: Aravindan Muthukumar Signed-off-by: Yogesh Marathe Tested-by: Asish --- src/mesa/drivers/dri/i965/brw_defines.h | 4 src/mesa/drivers/dri/i965/brw_state_upload.c | 16 ++-- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 2a8dbf8..8c9a510 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { # define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) #endif + +/* Checking the state of mesa and brw before emitting atoms */ +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) + diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c index acaa97e..57ac394 100644 --- a/src/mesa/drivers/dri/i965/brw_state_upload.c +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, struct brw_state_flags *state, const struct brw_tracked_state *atom) { - if (check_state(state, >dirty)) { - atom->emit(brw); - merge_ctx_state(brw, state); - } + atom->emit(brw); + merge_ctx_state(brw, state); } static inline void @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw, const struct brw_tracked_state *atom = [i]; struct brw_state_flags generated; - check_and_emit_atom(brw, , atom); + /* Checking the state and emitting atoms */ + if (CHECK_BRW_STATE(state, atom->dirty)) { +check_and_emit_atom(brw, , atom); + } accumulate_state(, >dirty); @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw, for (i = 0; i < num_atoms; i++) { const struct brw_tracked_state *atom = [i]; - check_and_emit_atom(brw, , atom); + /* Checking the state and emitting atoms */ + if (CHECK_BRW_STATE(state, atom->dirty)) { +check_and_emit_atom(brw, , atom); + } } } -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] egl/wayland: Fix linking libEGL_common.la
Because libmesautil.la includes string_to_uint_map.o, -lstdc++ is required for linking to succeed Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101851 Signed-off-by: Mike Lothian--- src/egl/Makefile.am | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am index 7c1a4929b8..830ed52b86 100644 --- a/src/egl/Makefile.am +++ b/src/egl/Makefile.am @@ -83,7 +83,7 @@ AM_CFLAGS += $(WAYLAND_CFLAGS) libEGL_common_la_LIBADD += $(WAYLAND_LIBS) libEGL_common_la_LIBADD += $(LIBDRM_LIBS) libEGL_common_la_LIBADD += $(top_builddir)/src/egl/wayland/wayland-drm/libwayland-drm.la -libEGL_common_la_LIBADD += $(top_builddir)/src/util/libmesautil.la +libEGL_common_la_LIBADD += $(top_builddir)/src/util/libmesautil.la -lstdc++ dri2_backend_FILES += drivers/dri2/platform_wayland.c \ drivers/dri2/linux-dmabuf-unstable-v1-protocol.c endif -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
Ian, > -Original Message- > From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf > Of Ian Romanick > Sent: Friday, July 21, 2017 12:33 AM > To: Muthukumar, Aravindan; mesa- > d...@lists.freedesktop.org > Subject: Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks > > On 07/20/2017 04:35 AM, aravindan.muthuku...@intel.com wrote: > > From: Aravindan Muthukumar > > > > This patch improves CPI Rate(Cycles per Instruction) and branch > > mispredict for i965. The function check_state() was showing CPI > > retired rate. > > > > Performance stats with android: > > CPI retired lowered by 28% (lower is better) Branch missprediction > > lowered by 13% (lower is better) 3DMark improved by 2% > > > > The dissassembly doesn't show difference, although above results were > > observed with patch. > > > > Signed-off-by: Aravindan Muthukumar > > Signedd-off-by: Yogesh Marathe > > Signed-off-by Thanks. Will correct it. May I add you and all who commented as Reviewed-by? I won't make a V3 for this since its a change in commit msg. > > > Tested-by: Asish > > --- > > > > Changes since V1: > > - Removed memset() change > > - Changed commit message as per review comments > > This information should be in the main part of the commit message. > Sure. > > > > src/mesa/drivers/dri/i965/brw_defines.h | 4 > > src/mesa/drivers/dri/i965/brw_state_upload.c | 12 > > 2 files changed, 12 insertions(+), 4 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > > b/src/mesa/drivers/dri/i965/brw_defines.h > > index 2a8dbf8..8c9a510 100644 > > --- a/src/mesa/drivers/dri/i965/brw_defines.h > > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > > @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { # > > define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) > > > > #endif > > + > > +/* Checking the state of mesa and brw before emitting atoms */ > > +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) > > + > > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c > > b/src/mesa/drivers/dri/i965/brw_state_upload.c > > index acaa97e..1c8b969 100644 > > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c > > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c > > @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, > > struct brw_state_flags *state, > > const struct brw_tracked_state *atom) { > > - if (check_state(state, >dirty)) { > >atom->emit(brw); > >merge_ctx_state(brw, state); > > - } > > } > > > > static inline void > > @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw, > > const struct brw_tracked_state *atom = [i]; > > struct brw_state_flags generated; > > > > - check_and_emit_atom(brw, , atom); > > + /* Checking the state and emitting atoms */ > > + if (CHECK_BRW_STATE(state, atom->dirty)) { > > +check_and_emit_atom(brw, , atom); > > + } > > > > accumulate_state(, >dirty); > > > > @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw, > >for (i = 0; i < num_atoms; i++) { > > const struct brw_tracked_state *atom = [i]; > > > > - check_and_emit_atom(brw, , atom); > > + /* Checking the state and emitting atoms */ > > + if (CHECK_BRW_STATE(state, atom->dirty)) { > > +check_and_emit_atom(brw, , atom); > > + } > >} > > } > > > > > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
Francisco, > -Original Message- > From: Francisco Jerez [mailto:curroje...@riseup.net] > Sent: Friday, July 21, 2017 12:21 AM > To: Marathe, Yogesh; Muthukumar, Aravindan > ; mesa-dev@lists.freedesktop.org > Cc: Muthukumar, Aravindan > Subject: RE: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks > > "Marathe, Yogesh" writes: > > > Francisco, > > > >> -Original Message- > >> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On > >> Behalf Of Francisco Jerez > >> Sent: Thursday, July 20, 2017 10:51 PM > >> To: Muthukumar, Aravindan ; mesa- > >> d...@lists.freedesktop.org > >> Cc: Muthukumar, Aravindan > >> Subject: Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag > >> checks > >> > >> aravindan.muthuku...@intel.com writes: > >> > >> > From: Aravindan Muthukumar > >> > > >> > This patch improves CPI Rate(Cycles per Instruction) and branch > >> > mispredict for i965. The function check_state() was showing CPI > >> > retired rate. > >> > > >> > Performance stats with android: > >> > CPI retired lowered by 28% (lower is better) Branch missprediction > >> > lowered by 13% (lower is better) 3DMark improved by 2% > >> > > >> > The dissassembly doesn't show difference, although above results > >> > were observed with patch. > >> > > >> > >> How did you determine that your results are not just statistical noise? > > > > No its not statistical noise. As commit msg mentions, we used metrics > > CPI retired rate, utilization, branch miss predict as metrics, these can be > measured using SEP on IA. > > It essentially enables event based sampling and we can measure these through > counters. > > > > How much variance do these metrics have? (particularly the overall score of > the > benchmark) How many times did you run the benchmark? > 2% to be exact, other stats are also present in commit message, the benchmark was run at least 5 times before concluding and more than that during experimenting. > > When we did the analysis of tests we were running, we found > > brw_upload_pipeline_state->check_state functions having bad CPI rates > > and hence we made changed there. The intention was always to reduce > > driver overhead, although this is miniscule effort. > > > >> Did you do some sort of significance testing? Which test, > >> significance level and sample size did you use? > > > > Sorry this is not something we have done, we tested on android > > functionality and perf only. Performance benchmark 3dmark and overall > > stability of the android system were used as tests. Kindly let us know > > if you have specific tests to be run and we would be happy to run that. > > > > What CPU did you get the numbers on? > > > > Broxton. > > > >> > >> Thank you. > >> > >> > Signed-off-by: Aravindan Muthukumar > >> > > >> > Signedd-off-by: Yogesh Marathe > >> > Tested-by: Asish > >> > --- > >> > > >> > Changes since V1: > >> > - Removed memset() change > >> > - Changed commit message as per review comments > >> > > >> > src/mesa/drivers/dri/i965/brw_defines.h | 4 > >> > src/mesa/drivers/dri/i965/brw_state_upload.c | 12 > >> > 2 files changed, 12 insertions(+), 4 deletions(-) > >> > > >> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > >> > b/src/mesa/drivers/dri/i965/brw_defines.h > >> > index 2a8dbf8..8c9a510 100644 > >> > --- a/src/mesa/drivers/dri/i965/brw_defines.h > >> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > >> > @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode > { # > >> > define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) > >> > > >> > #endif > >> > + > >> > +/* Checking the state of mesa and brw before emitting atoms */ > >> > +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) > >> > + > >> > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c > >> > b/src/mesa/drivers/dri/i965/brw_state_upload.c > >> > index acaa97e..1c8b969 100644 > >> > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c > >> > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c > >> > @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, > >> > struct brw_state_flags *state, > >> > const struct brw_tracked_state *atom) { > >> > - if (check_state(state, >dirty)) { > >> >atom->emit(brw); > >> >merge_ctx_state(brw, state); > >> > - } > >> > } > >> > > >> > static inline void > >> > @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context > *brw, > >> > const struct brw_tracked_state *atom = [i]; > >> > struct brw_state_flags generated; > >> > > >> > - check_and_emit_atom(brw, , atom); > >> > + /*
Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
On 07/20/2017 11:30 AM, Marathe, Yogesh wrote: > Francisco, > >> -Original Message- >> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf >> Of Francisco Jerez >> Sent: Thursday, July 20, 2017 10:51 PM >> To: Muthukumar, Aravindan; mesa- >> d...@lists.freedesktop.org >> Cc: Muthukumar, Aravindan >> Subject: Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks >> >> aravindan.muthuku...@intel.com writes: >> >>> From: Aravindan Muthukumar >>> >>> This patch improves CPI Rate(Cycles per Instruction) and branch >>> mispredict for i965. The function check_state() was showing CPI >>> retired rate. >>> >>> Performance stats with android: >>> CPI retired lowered by 28% (lower is better) Branch missprediction >>> lowered by 13% (lower is better) 3DMark improved by 2% >>> >>> The dissassembly doesn't show difference, although above results were >>> observed with patch. >>> >> >> How did you determine that your results are not just statistical noise? > > No its not statistical noise. As commit msg mentions, we used metrics CPI > retired rate, > utilization, branch miss predict as metrics, these can be measured using SEP > on IA. > It essentially enables event based sampling and we can measure these through > counters. > > When we did the analysis of tests we were running, we found > brw_upload_pipeline_state->check_state functions having bad CPI rates and > hence > we made changed there. The intention was always to reduce driver overhead, > although > this is miniscule effort. > >> Did you do some sort of significance testing? Which test, significance >> level and >> sample size did you use? > > Sorry this is not something we have done, we tested on android functionality > and > perf only. Performance benchmark 3dmark and overall stability of the android > system > were used as tests. Kindly let us know if you have specific tests to be run > and we would > be happy to run that. All of the benchmarks have variation in framerate. In order to get trustworthy data, you have to run the benchmark multiple times, alternating "before" and "after," and perform statistical analysis on the results. Generally Student's t is used. See http://anholt.net/compare-perf/ for more details. You should perform similar analysis on the CPU metric. Other activity in the system can affect these files. > What CPU did you get the numbers on? > > Broxton. > >> >> Thank you. >> >>> Signed-off-by: Aravindan Muthukumar >>> Signedd-off-by: Yogesh Marathe >>> Tested-by: Asish >>> --- >>> >>> Changes since V1: >>> - Removed memset() change >>> - Changed commit message as per review comments >>> >>> src/mesa/drivers/dri/i965/brw_defines.h | 4 >>> src/mesa/drivers/dri/i965/brw_state_upload.c | 12 >>> 2 files changed, 12 insertions(+), 4 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h >>> b/src/mesa/drivers/dri/i965/brw_defines.h >>> index 2a8dbf8..8c9a510 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_defines.h >>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h >>> @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { # >>> define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) >>> >>> #endif >>> + >>> +/* Checking the state of mesa and brw before emitting atoms */ >>> +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) >>> + >>> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c >>> b/src/mesa/drivers/dri/i965/brw_state_upload.c >>> index acaa97e..1c8b969 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c >>> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c >>> @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, >>> struct brw_state_flags *state, >>> const struct brw_tracked_state *atom) { >>> - if (check_state(state, >dirty)) { >>>atom->emit(brw); >>>merge_ctx_state(brw, state); >>> - } >>> } >>> >>> static inline void >>> @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw, >>> const struct brw_tracked_state *atom = [i]; >>> struct brw_state_flags generated; >>> >>> - check_and_emit_atom(brw, , atom); >>> + /* Checking the state and emitting atoms */ >>> + if (CHECK_BRW_STATE(state, atom->dirty)) { >>> +check_and_emit_atom(brw, , atom); >>> + } >>> >>> accumulate_state(, >dirty); >>> >>> @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw, >>>for (i = 0; i < num_atoms; i++) { >>> const struct brw_tracked_state *atom = [i]; >>> >>> - check_and_emit_atom(brw, , atom); >>> + /* Checking the state and emitting atoms */ >>> + if (CHECK_BRW_STATE(state, atom->dirty)) { >>> +
[Mesa-dev] [PATCH] i965: Push no_hw down to the execbuf call
For the common path where we want to execute the batch, if we push the no_hw detection down to the execbuf we can eliminate one loop over all the execobjects. For the less common path where we don't want to execute the batch, no_hw was leaving out_fence uninitialised. To simplify later changes, the execbuf routine was then inlined into its only caller. Cc: Kenneth GraunkeCc: Matt Turner --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 134 +++--- 1 file changed, 56 insertions(+), 78 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index 4461a59b80..59d95c4e66 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c @@ -383,13 +383,6 @@ static void do_batch_dump(struct brw_context *brw) { } static void brw_new_batch(struct brw_context *brw) { - /* Unreference any BOs held by the previous batch, and reset counts. */ - for (int i = 0; i < brw->batch.exec_count; i++) { - if (brw->batch.exec_bos[i] != brw->batch.bo) { - brw_bo_unreference(brw->batch.exec_bos[i]); - } - brw->batch.exec_bos[i] = NULL; - } brw->batch.reloc_count = 0; brw->batch.exec_count = 0; brw->batch.aperture_space = BATCH_SZ; @@ -559,63 +552,8 @@ add_exec_bo(struct intel_batchbuffer *batch, struct brw_bo *bo) } static int -execbuffer(int fd, - struct intel_batchbuffer *batch, - uint32_t ctx_id, - int used, - int in_fence, - int *out_fence, - int flags) -{ - struct drm_i915_gem_execbuffer2 execbuf = { - .buffers_ptr = (uintptr_t) batch->validation_list, - .buffer_count = batch->exec_count, - .batch_start_offset = 0, - .batch_len = used, - .flags = flags, - .rsvd1 = ctx_id, /* rsvd1 is actually the context ID */ - }; - - unsigned long cmd = DRM_IOCTL_I915_GEM_EXECBUFFER2; - - if (in_fence != -1) { - execbuf.rsvd2 = in_fence; - execbuf.flags |= I915_EXEC_FENCE_IN; - } - - if (out_fence != NULL) { - cmd = DRM_IOCTL_I915_GEM_EXECBUFFER2_WR; - *out_fence = -1; - execbuf.flags |= I915_EXEC_FENCE_OUT; - } - - int ret = drmIoctl(fd, cmd, ); - if (ret != 0) - ret = -errno; - - for (int i = 0; i < batch->exec_count; i++) { - struct brw_bo *bo = batch->exec_bos[i]; - - bo->idle = false; - - /* Update brw_bo::offset64 */ - if (batch->validation_list[i].offset != bo->offset64) { - DBG("BO %d migrated: 0x%" PRIx64 " -> 0x%llx\n", - bo->gem_handle, bo->offset64, batch->validation_list[i].offset); - bo->offset64 = batch->validation_list[i].offset; - } - } - - if (ret == 0 && out_fence != NULL) - *out_fence = execbuf.rsvd2 >> 32; - - return ret; -} - -static int do_flush_locked(struct brw_context *brw, int in_fence_fd, int *out_fence_fd) { - __DRIscreen *dri_screen = brw->screen->driScrnPriv; struct intel_batchbuffer *batch = >batch; int ret = 0; @@ -624,35 +562,75 @@ do_flush_locked(struct brw_context *brw, int in_fence_fd, int *out_fence_fd) } else { ret = brw_bo_subdata(batch->bo, 0, 4 * USED_BATCH(*batch), batch->map); if (ret == 0 && batch->state_batch_offset != batch->bo->size) { -ret = brw_bo_subdata(batch->bo, - batch->state_batch_offset, - batch->bo->size - batch->state_batch_offset, - (char *)batch->map + batch->state_batch_offset); + ret = brw_bo_subdata(batch->bo, + batch->state_batch_offset, + batch->bo->size - batch->state_batch_offset, + (char *)batch->map + batch->state_batch_offset); } } - if (!brw->screen->no_hw) { - int flags; + if (ret == 0) { + /* Add the batch itself to the end of the validation list */ + add_exec_bo(batch, batch->bo); + + struct drm_i915_gem_execbuffer2 execbuf = { + .buffers_ptr = (uintptr_t) batch->validation_list, + .buffer_count = batch->exec_count, + .batch_len = 4 * USED_BATCH(*batch), + /* rsvd1 is actually the context ID */ + .rsvd1 = batch->ring == RENDER_RING ? brw->hw_ctx : 0, + }; if (brw->gen >= 6 && batch->ring == BLT_RING) { - flags = I915_EXEC_BLT; + execbuf.flags = I915_EXEC_BLT; } else { - flags = I915_EXEC_RENDER; + execbuf.flags = I915_EXEC_RENDER; } if (batch->needs_sol_reset) -flags |= I915_EXEC_GEN7_SOL_RESET; + execbuf.flags |= I915_EXEC_GEN7_SOL_RESET; - if (ret == 0) { - uint32_t hw_ctx = batch->ring == RENDER_RING ? brw->hw_ctx : 0; + unsigned long cmd = DRM_IOCTL_I915_GEM_EXECBUFFER2; - /* Add the batch itself to the end of the
Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
On 07/20/2017 04:35 AM, aravindan.muthuku...@intel.com wrote: > From: Aravindan Muthukumar> > This patch improves CPI Rate(Cycles per Instruction) > and branch mispredict for i965. The function check_state() > was showing CPI retired rate. > > Performance stats with android: > CPI retired lowered by 28% (lower is better) > Branch missprediction lowered by 13% (lower is better) > 3DMark improved by 2% > > The dissassembly doesn't show difference, although above > results were observed with patch. > > Signed-off-by: Aravindan Muthukumar > Signedd-off-by: Yogesh Marathe Signed-off-by > Tested-by: Asish > --- > > Changes since V1: > - Removed memset() change > - Changed commit message as per review comments This information should be in the main part of the commit message. > > src/mesa/drivers/dri/i965/brw_defines.h | 4 > src/mesa/drivers/dri/i965/brw_state_upload.c | 12 > 2 files changed, 12 insertions(+), 4 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h > index 2a8dbf8..8c9a510 100644 > --- a/src/mesa/drivers/dri/i965/brw_defines.h > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { > # define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) > > #endif > + > +/* Checking the state of mesa and brw before emitting atoms */ > +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) > + > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c > b/src/mesa/drivers/dri/i965/brw_state_upload.c > index acaa97e..1c8b969 100644 > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c > @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, > struct brw_state_flags *state, > const struct brw_tracked_state *atom) > { > - if (check_state(state, >dirty)) { >atom->emit(brw); >merge_ctx_state(brw, state); > - } > } > > static inline void > @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw, >const struct brw_tracked_state *atom = [i]; >struct brw_state_flags generated; > > - check_and_emit_atom(brw, , atom); > + /* Checking the state and emitting atoms */ > + if (CHECK_BRW_STATE(state, atom->dirty)) { > +check_and_emit_atom(brw, , atom); > + } > >accumulate_state(, >dirty); > > @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw, >for (i = 0; i < num_atoms; i++) { >const struct brw_tracked_state *atom = [i]; > > - check_and_emit_atom(brw, , atom); > + /* Checking the state and emitting atoms */ > + if (CHECK_BRW_STATE(state, atom->dirty)) { > +check_and_emit_atom(brw, , atom); > + } >} > } > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
"Marathe, Yogesh"writes: > Francisco, > >> -Original Message- >> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf >> Of Francisco Jerez >> Sent: Thursday, July 20, 2017 10:51 PM >> To: Muthukumar, Aravindan ; mesa- >> d...@lists.freedesktop.org >> Cc: Muthukumar, Aravindan >> Subject: Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks >> >> aravindan.muthuku...@intel.com writes: >> >> > From: Aravindan Muthukumar >> > >> > This patch improves CPI Rate(Cycles per Instruction) and branch >> > mispredict for i965. The function check_state() was showing CPI >> > retired rate. >> > >> > Performance stats with android: >> > CPI retired lowered by 28% (lower is better) Branch missprediction >> > lowered by 13% (lower is better) 3DMark improved by 2% >> > >> > The dissassembly doesn't show difference, although above results were >> > observed with patch. >> > >> >> How did you determine that your results are not just statistical noise? > > No its not statistical noise. As commit msg mentions, we used metrics CPI > retired rate, > utilization, branch miss predict as metrics, these can be measured using SEP > on IA. > It essentially enables event based sampling and we can measure these through > counters. > How much variance do these metrics have? (particularly the overall score of the benchmark) How many times did you run the benchmark? > When we did the analysis of tests we were running, we found > brw_upload_pipeline_state->check_state functions having bad CPI rates and > hence > we made changed there. The intention was always to reduce driver overhead, > although > this is miniscule effort. > >> Did you do some sort of significance testing? Which test, significance >> level and >> sample size did you use? > > Sorry this is not something we have done, we tested on android functionality > and > perf only. Performance benchmark 3dmark and overall stability of the android > system > were used as tests. Kindly let us know if you have specific tests to be run > and we would > be happy to run that. > > What CPU did you get the numbers on? > > Broxton. > >> >> Thank you. >> >> > Signed-off-by: Aravindan Muthukumar >> > Signedd-off-by: Yogesh Marathe >> > Tested-by: Asish >> > --- >> > >> > Changes since V1: >> > - Removed memset() change >> > - Changed commit message as per review comments >> > >> > src/mesa/drivers/dri/i965/brw_defines.h | 4 >> > src/mesa/drivers/dri/i965/brw_state_upload.c | 12 >> > 2 files changed, 12 insertions(+), 4 deletions(-) >> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h >> > b/src/mesa/drivers/dri/i965/brw_defines.h >> > index 2a8dbf8..8c9a510 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_defines.h >> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h >> > @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { # >> > define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) >> > >> > #endif >> > + >> > +/* Checking the state of mesa and brw before emitting atoms */ >> > +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) >> > + >> > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c >> > b/src/mesa/drivers/dri/i965/brw_state_upload.c >> > index acaa97e..1c8b969 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c >> > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c >> > @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, >> > struct brw_state_flags *state, >> > const struct brw_tracked_state *atom) { >> > - if (check_state(state, >dirty)) { >> >atom->emit(brw); >> >merge_ctx_state(brw, state); >> > - } >> > } >> > >> > static inline void >> > @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw, >> > const struct brw_tracked_state *atom = [i]; >> > struct brw_state_flags generated; >> > >> > - check_and_emit_atom(brw, , atom); >> > + /* Checking the state and emitting atoms */ >> > + if (CHECK_BRW_STATE(state, atom->dirty)) { >> > +check_and_emit_atom(brw, , atom); >> > + } >> > >> > accumulate_state(, >dirty); >> > >> > @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw, >> >for (i = 0; i < num_atoms; i++) { >> > const struct brw_tracked_state *atom = [i]; >> > >> > - check_and_emit_atom(brw, , atom); >> > + /* Checking the state and emitting atoms */ >> > + if (CHECK_BRW_STATE(state, atom->dirty)) { >> > +check_and_emit_atom(brw, , atom); >> > + } >> >} >> > } >> > >> > -- >> > 2.7.4 >> > >> > ___ >> > mesa-dev mailing list >> >
Re: [Mesa-dev] [PATCH mesa 4/5] wayland-egl: Make wl_egl_window a versioned struct
On 20 July 2017 at 01:38, Miguel Angel Vicowrote: > > > On Wed, 19 Jul 2017 12:06:06 +0100 > Emil Velikov wrote: > >> On 18 July 2017 at 21:49, Miguel A. Vico wrote: >> > We need wl_egl_window to be a versioned struct in order to keep track of >> > ABI changes. >> > >> > This change makes the first member of wl_egl_window the version number. >> > >> > An heuristic in the wayland driver is added so that we don't break >> > backwards compatibility: >> > >> > - If the first field (version) is an actual pointer, it is an old >> >implementation of wl_egl_window, and version points to the wl_surface >> >proxy. >> > >> > - Else, the first field is the version number, and we have >> >wl_egl_window::surface pointing to the wl_surface proxy. >> > >> > Signed-off-by: Miguel A. Vico >> > Reviewed-by: James Jones >> >> This commit will cause a break in the ABI checker. Yet again, I'm >> short on ideas how to avoid that :-( > > Yeah... The only think I can think of is pushing both this and 5/5 as a > single commit. > > I usually like to keep things separate. Is it much of a deal given that > they'll go in at the same time? > I'm inclined to keep them separate as well. >> >> > --- >> > src/egl/drivers/dri2/platform_wayland.c| 13 - >> > src/egl/wayland/wayland-egl/wayland-egl-priv.h | 6 +- >> > src/egl/wayland/wayland-egl/wayland-egl.c | 6 +- >> > 3 files changed, 22 insertions(+), 3 deletions(-) >> > >> > diff --git a/src/egl/drivers/dri2/platform_wayland.c >> > b/src/egl/drivers/dri2/platform_wayland.c >> > index ee68284217..0f0a12fd80 100644 >> > --- a/src/egl/drivers/dri2/platform_wayland.c >> > +++ b/src/egl/drivers/dri2/platform_wayland.c >> > @@ -41,6 +41,7 @@ >> > #include "egl_dri2.h" >> > #include "egl_dri2_fallbacks.h" >> > #include "loader.h" >> > +#include "eglglobals.h" >> > >> > #include >> > #include "wayland-drm-client-protocol.h" >> > @@ -100,6 +101,16 @@ destroy_window_callback(void *data) >> > dri2_surf->wl_win = NULL; >> > } >> > >> > +static struct wl_surface * >> > +get_wl_surface_proxy(struct wl_egl_window *window) >> > +{ >> > + if (_eglPointerIsDereferencable((void *)(window->version))) { >> > + /* window->version points to actual wl_surface data */ >> > + return wl_proxy_create_wrapper((void *)(window->version)); >> > + } >> Please add a comment in there. I'm thinking about the following >> although use whatever you prefer. >> >> Version 3 of wl_egl_window introduced a version field, at the same >> location where a pointer to wl_surface was stored. > > Done. > >> >> > + return wl_proxy_create_wrapper(window->surface); >> > +} >> > + >> > /** >> > * Called via eglCreateWindowSurface(), drv->API.CreateWindowSurface(). >> > */ >> > @@ -171,7 +182,7 @@ dri2_wl_create_window_surface(_EGLDriver *drv, >> > _EGLDisplay *disp, >> > wl_proxy_set_queue((struct wl_proxy *)dri2_surf->wl_dpy_wrapper, >> >dri2_surf->wl_queue); >> > >> > - dri2_surf->wl_surface_wrapper = >> > wl_proxy_create_wrapper(window->surface); >> > + dri2_surf->wl_surface_wrapper = get_wl_surface_proxy(window); >> > if (!dri2_surf->wl_surface_wrapper) { >> >_eglError(EGL_BAD_ALLOC, "dri2_create_surface"); >> >goto cleanup_drm; >> > diff --git a/src/egl/wayland/wayland-egl/wayland-egl-priv.h >> > b/src/egl/wayland/wayland-egl/wayland-egl-priv.h >> > index 92c31d9454..3b59908cc1 100644 >> > --- a/src/egl/wayland/wayland-egl/wayland-egl-priv.h >> > +++ b/src/egl/wayland/wayland-egl/wayland-egl-priv.h >> > @@ -41,8 +41,10 @@ >> > extern "C" { >> > #endif >> > >> > +#define WL_EGL_WINDOW_VERSION 3 >> > + >> > struct wl_egl_window { >> > - struct wl_surface *surface; >> > + const intptr_t version; >> > >> > int width; >> > int height; >> > @@ -55,6 +57,8 @@ struct wl_egl_window { >> > void *private; >> > void (*resize_callback)(struct wl_egl_window *, void *); >> > void (*destroy_window_callback)(void *); >> > + >> > + struct wl_surface *surface; >> > }; >> > >> > #ifdef __cplusplus >> > diff --git a/src/egl/wayland/wayland-egl/wayland-egl.c >> > b/src/egl/wayland/wayland-egl/wayland-egl.c >> > index 4a4701a2de..02645549e0 100644 >> > --- a/src/egl/wayland/wayland-egl/wayland-egl.c >> > +++ b/src/egl/wayland/wayland-egl/wayland-egl.c >> > @@ -28,6 +28,7 @@ >> > */ >> > >> > #include >> > +#include >> > >> > #include >> > #include "wayland-egl.h" >> > @@ -54,6 +55,7 @@ WL_EGL_EXPORT struct wl_egl_window * >> > wl_egl_window_create(struct wl_surface *surface, >> > int width, int height) >> > { >> > + struct wl_egl_window _INIT_ = { .version = WL_EGL_WINDOW_VERSION }; >> > struct wl_egl_window *egl_window; >> > >> > if (width <= 0 || height <= 0) >> > @@ -63,6 +65,8 @@
Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
Francisco, > -Original Message- > From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf > Of Francisco Jerez > Sent: Thursday, July 20, 2017 10:51 PM > To: Muthukumar, Aravindan; mesa- > d...@lists.freedesktop.org > Cc: Muthukumar, Aravindan > Subject: Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks > > aravindan.muthuku...@intel.com writes: > > > From: Aravindan Muthukumar > > > > This patch improves CPI Rate(Cycles per Instruction) and branch > > mispredict for i965. The function check_state() was showing CPI > > retired rate. > > > > Performance stats with android: > > CPI retired lowered by 28% (lower is better) Branch missprediction > > lowered by 13% (lower is better) 3DMark improved by 2% > > > > The dissassembly doesn't show difference, although above results were > > observed with patch. > > > > How did you determine that your results are not just statistical noise? No its not statistical noise. As commit msg mentions, we used metrics CPI retired rate, utilization, branch miss predict as metrics, these can be measured using SEP on IA. It essentially enables event based sampling and we can measure these through counters. When we did the analysis of tests we were running, we found brw_upload_pipeline_state->check_state functions having bad CPI rates and hence we made changed there. The intention was always to reduce driver overhead, although this is miniscule effort. > Did you do some sort of significance testing? Which test, significance level > and > sample size did you use? Sorry this is not something we have done, we tested on android functionality and perf only. Performance benchmark 3dmark and overall stability of the android system were used as tests. Kindly let us know if you have specific tests to be run and we would be happy to run that. What CPU did you get the numbers on? Broxton. > > Thank you. > > > Signed-off-by: Aravindan Muthukumar > > Signedd-off-by: Yogesh Marathe > > Tested-by: Asish > > --- > > > > Changes since V1: > > - Removed memset() change > > - Changed commit message as per review comments > > > > src/mesa/drivers/dri/i965/brw_defines.h | 4 > > src/mesa/drivers/dri/i965/brw_state_upload.c | 12 > > 2 files changed, 12 insertions(+), 4 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > > b/src/mesa/drivers/dri/i965/brw_defines.h > > index 2a8dbf8..8c9a510 100644 > > --- a/src/mesa/drivers/dri/i965/brw_defines.h > > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > > @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { # > > define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) > > > > #endif > > + > > +/* Checking the state of mesa and brw before emitting atoms */ > > +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) > > + > > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c > > b/src/mesa/drivers/dri/i965/brw_state_upload.c > > index acaa97e..1c8b969 100644 > > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c > > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c > > @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, > > struct brw_state_flags *state, > > const struct brw_tracked_state *atom) { > > - if (check_state(state, >dirty)) { > >atom->emit(brw); > >merge_ctx_state(brw, state); > > - } > > } > > > > static inline void > > @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw, > > const struct brw_tracked_state *atom = [i]; > > struct brw_state_flags generated; > > > > - check_and_emit_atom(brw, , atom); > > + /* Checking the state and emitting atoms */ > > + if (CHECK_BRW_STATE(state, atom->dirty)) { > > +check_and_emit_atom(brw, , atom); > > + } > > > > accumulate_state(, >dirty); > > > > @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw, > >for (i = 0; i < num_atoms; i++) { > > const struct brw_tracked_state *atom = [i]; > > > > - check_and_emit_atom(brw, , atom); > > + /* Checking the state and emitting atoms */ > > + if (CHECK_BRW_STATE(state, atom->dirty)) { > > +check_and_emit_atom(brw, , atom); > > + } > >} > > } > > > > -- > > 2.7.4 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] st/mesa: add destroy_drawable interface
With this patch, the st manager will maintain a hash table for the active framebuffer interface objects. A destroy_drawable interface is added to allow the state tracker to notify the st manager to remove the associated framebuffer interface object from the hash table, so the associated framebuffer and its resources can be deleted at framebuffers purge time. Fixes bug 101829 "read-after-free in st_framebuffer_validate" Tested-by: Brad KingTested-by: Gert Wollny --- src/gallium/include/state_tracker/st_api.h| 7 ++ src/gallium/state_trackers/dri/dri_drawable.c | 6 +- src/gallium/state_trackers/glx/xlib/xm_api.c | 5 ++ src/gallium/state_trackers/glx/xlib/xm_st.c | 2 + src/gallium/state_trackers/wgl/stw_st.c | 6 +- src/mesa/state_tracker/st_manager.c | 95 ++- src/mesa/state_tracker/st_manager.h | 5 ++ 7 files changed, 123 insertions(+), 3 deletions(-) diff --git a/src/gallium/include/state_tracker/st_api.h b/src/gallium/include/state_tracker/st_api.h index 30a4866..9b660f7 100644 --- a/src/gallium/include/state_tracker/st_api.h +++ b/src/gallium/include/state_tracker/st_api.h @@ -552,6 +552,13 @@ struct st_api * Get the currently bound context in the calling thread. */ struct st_context_iface *(*get_current)(struct st_api *stapi); + + /** +* Notify the st manager the framebuffer interface object +* is no longer valid. +*/ + void (*destroy_drawable)(struct st_api *stapi, +struct st_framebuffer_iface *stfbi); }; /** diff --git a/src/gallium/state_trackers/dri/dri_drawable.c b/src/gallium/state_trackers/dri/dri_drawable.c index 0cfdc30..c7df0f6 100644 --- a/src/gallium/state_trackers/dri/dri_drawable.c +++ b/src/gallium/state_trackers/dri/dri_drawable.c @@ -169,6 +169,8 @@ void dri_destroy_buffer(__DRIdrawable * dPriv) { struct dri_drawable *drawable = dri_drawable(dPriv); + struct dri_screen *screen = drawable->screen; + struct st_api *stapi = screen->st_api; int i; pipe_surface_reference(>drisw_surface, NULL); @@ -180,7 +182,9 @@ dri_destroy_buffer(__DRIdrawable * dPriv) swap_fences_unref(drawable); - drawable->base.ID = 0; + /* Notify the st manager that this drawable is no longer valid */ + stapi->destroy_drawable(stapi, >base); + FREE(drawable); } diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.c b/src/gallium/state_trackers/glx/xlib/xm_api.c index 881dd44..e4b1e9d 100644 --- a/src/gallium/state_trackers/glx/xlib/xm_api.c +++ b/src/gallium/state_trackers/glx/xlib/xm_api.c @@ -595,6 +595,11 @@ xmesa_free_buffer(XMesaBuffer buffer) */ b->ws.drawable = 0; + /* Notify the st manager that the associated framebuffer interface + * object is no longer valid. + */ + stapi->destroy_drawable(stapi, buffer->stfb); + /* XXX we should move the buffer to a delete-pending list and destroy * the buffer until it is no longer current. */ diff --git a/src/gallium/state_trackers/glx/xlib/xm_st.c b/src/gallium/state_trackers/glx/xlib/xm_st.c index 9e30efa..6a0f4aa 100644 --- a/src/gallium/state_trackers/glx/xlib/xm_st.c +++ b/src/gallium/state_trackers/glx/xlib/xm_st.c @@ -273,6 +273,7 @@ xmesa_st_framebuffer_flush_front(struct st_context_iface *stctx, return ret; } +static uint32_t xmesa_stfbi_ID = 0; struct st_framebuffer_iface * xmesa_create_st_framebuffer(XMesaDisplay xmdpy, XMesaBuffer b) @@ -302,6 +303,7 @@ xmesa_create_st_framebuffer(XMesaDisplay xmdpy, XMesaBuffer b) stfbi->visual = >stvis; stfbi->flush_front = xmesa_st_framebuffer_flush_front; stfbi->validate = xmesa_st_framebuffer_validate; + stfbi->ID = p_atomic_inc_return(_stfbi_ID); p_atomic_set(>stamp, 1); stfbi->st_manager_private = (void *) xstfb; diff --git a/src/gallium/state_trackers/wgl/stw_st.c b/src/gallium/state_trackers/wgl/stw_st.c index c2844b0..85a8b17 100644 --- a/src/gallium/state_trackers/wgl/stw_st.c +++ b/src/gallium/state_trackers/wgl/stw_st.c @@ -256,7 +256,11 @@ stw_st_destroy_framebuffer_locked(struct st_framebuffer_iface *stfb) for (i = 0; i < ST_ATTACHMENT_COUNT; i++) pipe_resource_reference(>textures[i], NULL); - stwfb->base.ID = 0; + /* Notify the st manager that the framebuffer interface is no +* longer valid. +*/ + stw_dev->stapi->destroy_drawable(stw_dev->stapi, >base); + FREE(stwfb); } diff --git a/src/mesa/state_tracker/st_manager.c b/src/mesa/state_tracker/st_manager.c index cb816de..ebc7ca8 100644 --- a/src/mesa/state_tracker/st_manager.c +++ b/src/mesa/state_tracker/st_manager.c @@ -38,6 +38,7 @@ #include "main/fbobject.h" #include "main/renderbuffer.h" #include "main/version.h" +#include "util/hash_table.h" #include "st_texture.h" #include "st_context.h" @@ -59,6 +60,10 @@ #include "util/u_surface.h" #include "util/list.h"
[Mesa-dev] [Bug 101844] Artifacts in form of wrong pixels appearing on some surfaces
https://bugs.freedesktop.org/show_bug.cgi?id=101844 --- Comment #7 from Fabian Maurer--- Sorry, look like I was wrong. There seem to always have been artifacts, but since a newer Minecraft version they're colorful instead of black, making them way more noticeable. "LIBGL_ALWAYS_SOFTWARE=1" reduces the artifacts massively, and "GALLIUM_DRIVER=swr LIBGL_ALWAYS_SOFTWARE=1" removes them. Now, since I can't do a regression test, is there another way I could help? -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 08/13] i965: Convert reloc.target_handle into an index for I915_EXEC_HANDLE_LUT
Quoting Kenneth Graunke (2017-07-20 17:57:22) > On Thursday, July 20, 2017 8:05:19 AM PDT Chris Wilson wrote: > > Quoting Kenneth Graunke (2017-07-19 23:36:58) > > > On Wednesday, July 19, 2017 3:09:16 AM PDT Chris Wilson wrote: > > > > #define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x)) > > > > @@ -117,21 +125,12 @@ add_exec_bo(struct intel_batchbuffer *batch, > > > > struct brw_bo *bo) > > > > batch->exec_array_size * > > > > sizeof(batch->exec_objects[0])); > > > > } > > > > > > > > - struct drm_i915_gem_exec_object2 *validation_entry = > > > > - >exec_objects[batch->exec_count]; > > > > - validation_entry->handle = bo->gem_handle; > > > > - if (bo == batch->bo) { > > > > - validation_entry->relocation_count = batch->reloc_count; > > > > - validation_entry->relocs_ptr = (uintptr_t) batch->relocs; > > > > - } else { > > > > - validation_entry->relocation_count = 0; > > > > - validation_entry->relocs_ptr = 0; > > > > - } > > > > - validation_entry->alignment = bo->align; > > > > - validation_entry->offset = bo->offset64; > > > > - validation_entry->flags = bo->kflags; > > > > - validation_entry->rsvd1 = 0; > > > > - validation_entry->rsvd2 = 0; > > > > + struct drm_i915_gem_exec_object2 *exec = > > > > + memset(>exec_objects[batch->exec_count], 0, > > > > sizeof(*exec)); > > > > + exec->handle = bo->gem_handle; > > > > + exec->alignment = bo->align; > > > > + exec->offset = bo->offset64; > > > > + exec->flags = bo->kflags; > > > > > > I liked the name "validation_entry" given that we call this the > > > "validation > > > list"...exec matches the struct name better, but I think validation_entry > > > helps distinguish the two lists... > > > > Hmm, how about > > > > - struct drm_i915_gem_exec_object2 *exec = > > - memset(>exec_objects[batch->exec_count], 0, sizeof(*exec)); > > - exec->handle = bo->gem_handle; > > - exec->alignment = bo->align; > > - exec->offset = bo->offset64; > > - exec->flags = bo->kflags; > > + batch->exec_objects[batch->exec_count] = (struct > > drm_i915_gem_exec_object2){ > > + .handle = bo->gem_handle, > > + .alignment = bo->align, > > + .offset = bo->offset64, > > + .flags = bo->kflags, > > + }; > > > > and skip the impossible problem of naming? > > > > But we still end up with a couple of > > struct drm_i915_gem_exec_object2 * > > validation_entry = >exec_objects[index]; > > Could I just call those exec_object? > > -Chris > > I'm not objecting too strongly, call it exec or exec_object if you like. > The initializer use is pretty nice. > > "validation list" is a bit of a weird name anyway... As you've seen, I think there's some merit to a distinct name so we don't get confused with exec_bos, I've settled for struct drm_i915_gem_exec_object2 *entry = >validation_list[index]; as that fits into 80cols :) -Chris ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH] radeonsi: set a per-buffer flag that disables inter-process sharing (v2)
Am 20.07.2017 um 16:59 schrieb Marek Olšák: On Jul 19, 2017 10:21 PM, "zhoucm1"> wrote: On 2017年07月19日 23:34, Marek Olšák wrote: On Jul 19, 2017 3:36 AM, "zhoucm1" > wrote: On 2017年07月19日 04:08, Marek Olšák wrote: From: Marek Olšák > For lower overhead in the CS ioctl. Winsys allocators are not used with interprocess-sharable resources. Hi Marek, Could I know from how your this way reduces overhead in CS ioctl? reusing BO to short bo list? The kernel part of the work hasn't been done yet. The idea is that nonsharable buffers don't have to be revalidated by TTM, OK, Maybe I only can see the whole picture of this idea when you complete kernel part. Out of curious, why/how can nonsharable buffers be revalidated by TTM without exposing like amdgpu_bo_make_resident api? I think the idea is that all nonsharable buffers will be backed by the same reservation object, so TTM can skip buffer validation if no buffer has been moved. It's just an optimization for the current design. With mentioned in another thread, if we can expose make_resident api, we can remove bo_list, even we can remove reservation operation in CS ioctl. Actually that is NOT a resident api. E.g. in other words BOs marked as this are just swapped out like other BOs. It's just that on command submission additionally to the BOs which come from the BO list we have the BOs associated to the process as well. And now, I think our bo list is a very bad design, first, umd must create bo list for every command submission, this is a extra cpu overhead compared with traditional way. second, kernel also have to iterate the list, when bo list is too long, like OpenCL program, they always throw several thousands BOs to bo list, reservation must keep these thousands ww_mutex safe, CPU overhead is too big. So I strongly suggest we should expose make_resident api to user space. if cannot, I want to know any specific reason to see if we can solve it. Yeah, I think the BO list idea is likely to die sooner or later. It made sense for GL before bindless was a thing. Nowadays I don't see much value in it. Completely agree. The BO list API was not a good idea in the first place. Regards, Christian. MesaGL will keep tracking the BO list because it's a requirement for good GL performance (it determines whether to flush IBs before BO synchronization, it allows tracking fences for each BO, which are used to determine dependencies between IBs, and that all allows async SDMA and async compute for GL, which doesn't have separate queues). However, we don't need any BO list at the libdrm level and lower. I think a BO_CREATE flag that causes that the buffer is added to a kernel-side per-fd BO list would be sufficient. How the kernel manages its BO list should be its own implementation detail. Initially we can just move the current BO list management into the kernel. Marek Regards, David Zhou so it should remove a lot of kernel overhead and the BO list remains the same. Marek Thanks, David Zhou v2: It shouldn't crash anymore, but the kernel will reject the new flag. --- src/gallium/drivers/radeon/r600_buffer_common.c | 7 + src/gallium/drivers/radeon/radeon_winsys.h | 20 +++--- src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 36 - src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 27 +++ 4 files changed, 62 insertions(+), 28 deletions(-) diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c index dd1c209..2747ac4 100644 --- a/src/gallium/drivers/radeon/r600_buffer_common.c +++ b/src/gallium/drivers/radeon/r600_buffer_common.c @@ -160,20 +160,27 @@ void r600_init_resource_fields(struct r600_common_screen *rscreen, } /* Tiled textures are unmappable. Always put them in VRAM. */ if ((res->b.b.target != PIPE_BUFFER && !rtex->surface.is_linear) || res->flags & R600_RESOURCE_FLAG_UNMAPPABLE) { res->domains = RADEON_DOMAIN_VRAM; res->flags |= RADEON_FLAG_NO_CPU_ACCESS | RADEON_FLAG_GTT_WC; } + /* Only displayable single-sample textures can be shared between +*
[Mesa-dev] [Bug 101843] Latest mesa git fails to compile in mesa/main/marshal.c
https://bugs.freedesktop.org/show_bug.cgi?id=101843 --- Comment #4 from Fabian Maurer--- Well, now that I set it up again, it suddenly works. Maybe a configuration problem, but I really can't see what I changed. Thank you for testing, is there an option to resolve this as WORKSFORME? -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] configure.ac: rework wayland-protocols handling
On 20 July 2017 at 18:27, Emil Velikovwrote: > From: Emil Velikov > > At dist/distcheck time we need to ensure that all the files and their > respective dependencies are handled. > > At the moment we'll bail out as the linux-dmabuf rules are guarded in a > conditional. Move them outside of it and drop the sources from > BUILT_SOURCES. > > Thus the files will be generated only as needed, which will happen only > after the wayland-protocols dependency is enforced in configure.ac. > Scratch this patch - make distcheck showed green light, it's not complete yet. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
aravindan.muthuku...@intel.com writes: > From: Aravindan Muthukumar> > This patch improves CPI Rate(Cycles per Instruction) > and branch mispredict for i965. The function check_state() > was showing CPI retired rate. > > Performance stats with android: > CPI retired lowered by 28% (lower is better) > Branch missprediction lowered by 13% (lower is better) > 3DMark improved by 2% > > The dissassembly doesn't show difference, although above > results were observed with patch. > How did you determine that your results are not just statistical noise? Did you do some sort of significance testing? Which test, significance level and sample size did you use? What CPU did you get the numbers on? Thank you. > Signed-off-by: Aravindan Muthukumar > Signedd-off-by: Yogesh Marathe > Tested-by: Asish > --- > > Changes since V1: > - Removed memset() change > - Changed commit message as per review comments > > src/mesa/drivers/dri/i965/brw_defines.h | 4 > src/mesa/drivers/dri/i965/brw_state_upload.c | 12 > 2 files changed, 12 insertions(+), 4 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h > index 2a8dbf8..8c9a510 100644 > --- a/src/mesa/drivers/dri/i965/brw_defines.h > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { > # define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) > > #endif > + > +/* Checking the state of mesa and brw before emitting atoms */ > +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) > + > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c > b/src/mesa/drivers/dri/i965/brw_state_upload.c > index acaa97e..1c8b969 100644 > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c > @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, > struct brw_state_flags *state, > const struct brw_tracked_state *atom) > { > - if (check_state(state, >dirty)) { >atom->emit(brw); >merge_ctx_state(brw, state); > - } > } > > static inline void > @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw, >const struct brw_tracked_state *atom = [i]; >struct brw_state_flags generated; > > - check_and_emit_atom(brw, , atom); > + /* Checking the state and emitting atoms */ > + if (CHECK_BRW_STATE(state, atom->dirty)) { > +check_and_emit_atom(brw, , atom); > + } > >accumulate_state(, >dirty); > > @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw, >for (i = 0; i < num_atoms; i++) { >const struct brw_tracked_state *atom = [i]; > > - check_and_emit_atom(brw, , atom); > + /* Checking the state and emitting atoms */ > + if (CHECK_BRW_STATE(state, atom->dirty)) { > +check_and_emit_atom(brw, , atom); > + } >} > } > > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/32] i965/miptree: Add support for partially resolving MCS
On Thu, Jul 20, 2017 at 3:00 AM, Pohjolainen, Topi < topi.pohjolai...@gmail.com> wrote: > On Wed, Jul 19, 2017 at 02:01:35PM -0700, Jason Ekstrand wrote: > > --- > > src/mesa/drivers/dri/i965/brw_blorp.c | 24 > > src/mesa/drivers/dri/i965/brw_blorp.h | 5 > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 40 > +-- > > 3 files changed, 67 insertions(+), 2 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c > b/src/mesa/drivers/dri/i965/brw_blorp.c > > index efa3b39..ac47f31 100644 > > --- a/src/mesa/drivers/dri/i965/brw_blorp.c > > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c > > @@ -1042,6 +1042,30 @@ brw_blorp_resolve_color(struct brw_context *brw, > struct intel_mipmap_tree *mt, > > brw_emit_end_of_pipe_sync(brw, PIPE_CONTROL_RENDER_TARGET_FLUSH); > > } > > > > +void > > +brw_blorp_mcs_partial_resolve(struct brw_context *brw, > > + struct intel_mipmap_tree *mt, > > + uint32_t start_layer, uint32_t num_layers) > > +{ > > + DBG("%s to mt %p layers %u-%u\n", __FUNCTION__, mt, > > + start_layer, start_layer + num_layers - 1); > > + > > + const mesa_format format = _mesa_get_srgb_format_linear(mt->format); > > + enum isl_format isl_format = brw_blorp_to_isl_format(brw, format, > true); > > + > > + struct isl_surf isl_tmp[1]; > > + struct blorp_surf surf; > > + uint32_t level = 0; > > + blorp_surf_for_miptree(brw, , mt, true, false, 0, > > + , start_layer, num_layers, isl_tmp); > > + > > + struct blorp_batch batch; > > + blorp_batch_init(>blorp, , brw, 0); > > + blorp_mcs_partial_resolve(, , isl_format, > > + start_layer, num_layers); > > + blorp_batch_finish(); > > +} > > + > > /** > > * Perform a HiZ or depth resolve operation. > > * > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h > b/src/mesa/drivers/dri/i965/brw_blorp.h > > index 29d5788..c65a68a 100644 > > --- a/src/mesa/drivers/dri/i965/brw_blorp.h > > +++ b/src/mesa/drivers/dri/i965/brw_blorp.h > > @@ -74,6 +74,11 @@ brw_blorp_resolve_color(struct brw_context *brw, > > enum blorp_fast_clear_op resolve_op); > > > > void > > +brw_blorp_mcs_partial_resolve(struct brw_context *brw, > > + struct intel_mipmap_tree *mt, > > + uint32_t start_layer, uint32_t > num_layers); > > + > > +void > > intel_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt, > > unsigned int level, unsigned int start_layer, > > unsigned int num_layers, enum blorp_hiz_op op); > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > index 2521190..1fd39a1 100644 > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > @@ -2323,6 +2323,35 @@ intel_miptree_finish_ccs_write(struct > brw_context *brw, > > } > > > > static void > > +intel_miptree_prepare_mcs_access(struct brw_context *brw, > > + struct intel_mipmap_tree *mt, > > + uint32_t layer, > > + bool mcs_supported, > > + bool fast_clear_supported) > > +{ > > + switch (intel_miptree_get_aux_state(mt, 0, layer)) { > > + case ISL_AUX_STATE_CLEAR: > > + case ISL_AUX_STATE_COMPRESSED_CLEAR: > > + assert(mcs_supported); > > + if (!fast_clear_supported) { > > + brw_blorp_mcs_partial_resolve(brw, mt, layer, 1); > > + intel_miptree_set_aux_state(brw, mt, 0, layer, 1, > > + ISL_AUX_STATE_COMPRESSED_NO_ > CLEAR); > > + } > > + break; > > + > > + case ISL_AUX_STATE_COMPRESSED_NO_CLEAR: > > + assert(mcs_supported); > > + break; /* Nothing to do */ > > + > > + case ISL_AUX_STATE_RESOLVED: > > + case ISL_AUX_STATE_PASS_THROUGH: > > + case ISL_AUX_STATE_AUX_INVALID: > > + unreachable("Invalid aux state for MCS"); > > + } > > +} > > + > > +static void > > intel_miptree_finish_mcs_write(struct brw_context *brw, > > struct intel_mipmap_tree *mt, > > uint32_t layer, > > @@ -2336,10 +2365,10 @@ intel_miptree_finish_mcs_write(struct > brw_context *brw, > >break; > > > > case ISL_AUX_STATE_COMPRESSED_CLEAR: > > + case ISL_AUX_STATE_COMPRESSED_NO_CLEAR: > >assert(written_with_mcs); > >break; /* Nothing to do */ > > > > - case ISL_AUX_STATE_COMPRESSED_NO_CLEAR: > > case ISL_AUX_STATE_RESOLVED: > > case ISL_AUX_STATE_PASS_THROUGH: > > case ISL_AUX_STATE_AUX_INVALID: > > @@ -2499,7 +2528,14 @@ intel_miptree_prepare_access(struct brw_context > *brw, > > > >if (mt->num_samples > 1) { > > /* Nothing to do for MSAA */ > > We should
[Mesa-dev] [PATCH] configure.ac: rework wayland-protocols handling
From: Emil VelikovAt dist/distcheck time we need to ensure that all the files and their respective dependencies are handled. At the moment we'll bail out as the linux-dmabuf rules are guarded in a conditional. Move them outside of it and drop the sources from BUILT_SOURCES. Thus the files will be generated only as needed, which will happen only after the wayland-protocols dependency is enforced in configure.ac. Cc: Andres Gomez Signed-off-by: Emil Velikov --- configure.ac| 13 ++--- src/egl/Makefile.am | 12 +--- 2 files changed, 15 insertions(+), 10 deletions(-) diff --git a/configure.ac b/configure.ac index 2689fc55e85..2736fbf201b 100644 --- a/configure.ac +++ b/configure.ac @@ -1681,19 +1681,26 @@ if test "x$WAYLAND_SCANNER" = x; then AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner], [:]) fi +PKG_CHECK_EXISTS([wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED], [have_wayland_protocols=yes], [have_wayland_protocols=no]) +if test "x$have_wayland_protocols" = xyes; then +ac_wayland_protocols_pkgdatadir=`$PKG_CONFIG --variable=pkgdatadir wayland-protocols` +fi +AC_SUBST(WAYLAND_PROTOCOLS_DATADIR, $ac_wayland_protocols_pkgdatadir) + # Do per platform setups and checks platforms=`IFS=', '; echo $with_platforms` for plat in $platforms; do case "$plat" in wayland) -PKG_CHECK_MODULES([WAYLAND], [wayland-client >= $WAYLAND_REQUIRED wayland-server >= $WAYLAND_REQUIRED wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED]) -ac_wayland_protocols_pkgdatadir=`$PKG_CONFIG --variable=pkgdatadir wayland-protocols` -AC_SUBST(WAYLAND_PROTOCOLS_DATADIR, $ac_wayland_protocols_pkgdatadir) +PKG_CHECK_MODULES([WAYLAND], [wayland-client >= $WAYLAND_REQUIRED wayland-server >= $WAYLAND_REQUIRED]) if test "x$WAYLAND_SCANNER" = "x:"; then AC_MSG_ERROR([wayland-scanner is needed to compile the wayland platform]) fi +if test "x$have_wayland_protocols" = xno; then +AC_MSG_ERROR([wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED is needed to compile the wayland platform]) +fi DEFINES="$DEFINES -DHAVE_WAYLAND_PLATFORM" ;; diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am index 7c1a4929b81..4094639284e 100644 --- a/src/egl/Makefile.am +++ b/src/egl/Makefile.am @@ -64,7 +64,6 @@ libEGL_common_la_LIBADD += $(top_builddir)/src/loader/libloader_dri3_helper.la endif endif -if HAVE_PLATFORM_WAYLAND WL_DMABUF_XML = $(WAYLAND_PROTOCOLS_DATADIR)/unstable/linux-dmabuf/linux-dmabuf-unstable-v1.xml drivers/dri2/linux-dmabuf-unstable-v1-protocol.c: $(WL_DMABUF_XML) @@ -75,17 +74,16 @@ drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h: $(WL_DMABUF_XML) $(MKDIR_GEN) $(AM_V_GEN)$(WAYLAND_SCANNER) client-header < $< > $@ -BUILT_SOURCES += \ - drivers/dri2/linux-dmabuf-unstable-v1-protocol.c \ - drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h - +if HAVE_PLATFORM_WAYLAND AM_CFLAGS += $(WAYLAND_CFLAGS) libEGL_common_la_LIBADD += $(WAYLAND_LIBS) libEGL_common_la_LIBADD += $(LIBDRM_LIBS) libEGL_common_la_LIBADD += $(top_builddir)/src/egl/wayland/wayland-drm/libwayland-drm.la libEGL_common_la_LIBADD += $(top_builddir)/src/util/libmesautil.la -dri2_backend_FILES += drivers/dri2/platform_wayland.c \ - drivers/dri2/linux-dmabuf-unstable-v1-protocol.c +dri2_backend_FILES += \ + drivers/dri2/platform_wayland.c \ + drivers/dri2/linux-dmabuf-unstable-v1-protocol.c \ + drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h endif if HAVE_PLATFORM_DRM -- 2.12.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/32] i965/blorp: Do flushes around depth resolves
On Thu, Jul 20, 2017 at 2:38 AM, Pohjolainen, Topi < topi.pohjolai...@gmail.com> wrote: > On Wed, Jul 19, 2017 at 02:01:29PM -0700, Jason Ekstrand wrote: > > It turns out that if you have rendering in-flight with CCS_E enabled and > > you go to do a depth resolve without flushing, the CCS data may never > > hit the memory. > > --- > > src/mesa/drivers/dri/i965/brw_blorp.c | 150 > -- > > 1 file changed, 72 insertions(+), 78 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c > b/src/mesa/drivers/dri/i965/brw_blorp.c > > index 5335fae..efa3b39 100644 > > --- a/src/mesa/drivers/dri/i965/brw_blorp.c > > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c > > @@ -1079,51 +1079,48 @@ intel_hiz_exec(struct brw_context *brw, struct > intel_mipmap_tree *mt, > > __func__, opname, mt, level, start_layer, start_layer + > num_layers - 1); > > > > /* The following stalls and flushes are only documented to be > required for > > -* HiZ clear operations. However, they also seem to be required for > the > > -* HiZ resolve operation which is basically the same as a fast clear > only a > > -* different value is written into the HiZ surface. > > +* HiZ clear operations. However, they also seem to be required for > > +* resolve operations. > > How would feel putting some of the rational in the commit message here? > Sounds > valuable. > Hrm... I can, but I think the problem is most likely more general than CCS_E. The fact that CCS_E happens to show it off doesn't mean that's why we need to do it. > > */ > > - if (op == BLORP_HIZ_OP_DEPTH_CLEAR || op == > BLORP_HIZ_OP_HIZ_RESOLVE) { > > - if (brw->gen == 6) { > > - /* From the Sandy Bridge PRM, volume 2 part 1, page 313: > > - * > > - * "If other rendering operations have preceded this clear, a > > - * PIPE_CONTROL with write cache flush enabled and Z-inhibit > > - * disabled must be issued before the rectangle primitive > used for > > - * the depth buffer clear operation. > > - */ > > - brw_emit_pipe_control_flush(brw, > > - PIPE_CONTROL_RENDER_TARGET_FLUSH > | > > - PIPE_CONTROL_DEPTH_CACHE_FLUSH | > > - PIPE_CONTROL_CS_STALL); > > - } else if (brw->gen >= 7) { > > - /* > > - * From the Ivybridge PRM, volume 2, "Depth Buffer Clear": > > - * > > - * If other rendering operations have preceded this clear, a > > - * PIPE_CONTROL with depth cache flush enabled, Depth Stall > bit > > - * enabled must be issued before the rectangle primitive > used for > > - * the depth buffer clear operation. > > - * > > - * Same applies for Gen8 and Gen9. > > - * > > - * In addition, from the Ivybridge PRM, volume 2, 1.10.4.1 > > - * PIPE_CONTROL, Depth Cache Flush Enable: > > - * > > - * This bit must not be set when Depth Stall Enable bit is > set in > > - * this packet. > > - * > > - * This is confirmed to hold for real, HSW gets immediate gpu > hangs. > > - * > > - * Therefore issue two pipe control flushes, one for cache > flush and > > - * another for depth stall. > > - */ > > - brw_emit_pipe_control_flush(brw, > > - PIPE_CONTROL_DEPTH_CACHE_FLUSH | > > - PIPE_CONTROL_CS_STALL); > > + if (brw->gen == 6) { > > + /* From the Sandy Bridge PRM, volume 2 part 1, page 313: > > + * > > + * "If other rendering operations have preceded this clear, a > > + * PIPE_CONTROL with write cache flush enabled and Z-inhibit > > + * disabled must be issued before the rectangle primitive used > for > > + * the depth buffer clear operation. > > + */ > > + brw_emit_pipe_control_flush(brw, > > + PIPE_CONTROL_RENDER_TARGET_FLUSH | > > + PIPE_CONTROL_DEPTH_CACHE_FLUSH | > > + PIPE_CONTROL_CS_STALL); > > + } else if (brw->gen >= 7) { > > + /* > > + * From the Ivybridge PRM, volume 2, "Depth Buffer Clear": > > + * > > + * If other rendering operations have preceded this clear, a > > + * PIPE_CONTROL with depth cache flush enabled, Depth Stall bit > > + * enabled must be issued before the rectangle primitive used > for > > + * the depth buffer clear operation. > > + * > > + * Same applies for Gen8 and Gen9. > > + * > > + * In addition, from the Ivybridge PRM, volume 2, 1.10.4.1 > > + * PIPE_CONTROL, Depth Cache Flush Enable: > > + * > > + * This bit must not be set when Depth Stall Enable bit is set > in >
Re: [Mesa-dev] [PATCH] dri: Make classic drivers allow __DRI_CTX_FLAG_NO_ERROR.
On 2017-07-18 20:25, Ian Romanick wrote: On 07/14/2017 04:10 PM, Kenneth Graunke wrote: Grigori recently added EGL_KHR_create_context_no_error support, which causes EGL to pass a new __DRI_CTX_FLAG_NO_ERROR flag to drivers when requesting an appropriate context mode. driContextSetFlags() will already handle it properly for us, but the classic drivers all have code to explicitly balk at unknown flags. We need to let it through or they'll fail to create a no_error context. I'm almost afraid to ask... are there tests that try to create a no_error context? I have now posted a test to the piglit ML, which might be useful for testing this patch. Grigori --- src/mesa/drivers/dri/i915/intel_screen.c | 2 +- src/mesa/drivers/dri/i965/brw_context.c| 5 +++-- src/mesa/drivers/dri/nouveau/nouveau_context.c | 2 +- src/mesa/drivers/dri/r200/r200_context.c | 2 +- src/mesa/drivers/dri/radeon/radeon_context.c | 2 +- 5 files changed, 7 insertions(+), 6 deletions(-) Drivers other than i965 have not been tested. diff --git a/src/mesa/drivers/dri/i915/intel_screen.c b/src/mesa/drivers/dri/i915/intel_screen.c index 9e23552b998..1ac72e14a15 100644 --- a/src/mesa/drivers/dri/i915/intel_screen.c +++ b/src/mesa/drivers/dri/i915/intel_screen.c @@ -972,7 +972,7 @@ intelCreateContext(gl_api api, __DRIscreen *sPriv = driContextPriv->driScreenPriv; struct intel_screen *intelScreen = sPriv->driverPrivate; - if (flags & ~__DRI_CTX_FLAG_DEBUG) { + if (flags & ~(__DRI_CTX_FLAG_DEBUG | __DRI_CTX_FLAG_NO_ERROR)) { *error = __DRI_CTX_ERROR_UNKNOWN_FLAG; return false; } diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index b23e811f305..bd26e2332c7 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -813,8 +813,9 @@ brwCreateContext(gl_api api, /* Only allow the __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS flag if the kernel * provides us with context reset notifications. */ - uint32_t allowed_flags = __DRI_CTX_FLAG_DEBUG - | __DRI_CTX_FLAG_FORWARD_COMPATIBLE; + uint32_t allowed_flags = __DRI_CTX_FLAG_DEBUG | +__DRI_CTX_FLAG_FORWARD_COMPATIBLE | +__DRI_CTX_FLAG_NO_ERROR; if (screen->has_context_reset_notification) allowed_flags |= __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS; diff --git a/src/mesa/drivers/dri/nouveau/nouveau_context.c b/src/mesa/drivers/dri/nouveau/nouveau_context.c index 6ddcadce1f0..d6f9e533848 100644 --- a/src/mesa/drivers/dri/nouveau/nouveau_context.c +++ b/src/mesa/drivers/dri/nouveau/nouveau_context.c @@ -63,7 +63,7 @@ nouveau_context_create(gl_api api, struct nouveau_context *nctx; struct gl_context *ctx; - if (flags & ~__DRI_CTX_FLAG_DEBUG) { + if (flags & ~(__DRI_CTX_FLAG_DEBUG | __DRI_CTX_FLAG_NO_ERROR)) { *error = __DRI_CTX_ERROR_UNKNOWN_FLAG; return false; } diff --git a/src/mesa/drivers/dri/r200/r200_context.c b/src/mesa/drivers/dri/r200/r200_context.c index aaa9b9317df..5a7f33499b1 100644 --- a/src/mesa/drivers/dri/r200/r200_context.c +++ b/src/mesa/drivers/dri/r200/r200_context.c @@ -189,7 +189,7 @@ GLboolean r200CreateContext( gl_api api, int i; int tcl_mode; - if (flags & ~__DRI_CTX_FLAG_DEBUG) { + if (flags & ~(__DRI_CTX_FLAG_DEBUG | __DRI_CTX_FLAG_NO_ERROR)) { *error = __DRI_CTX_ERROR_UNKNOWN_FLAG; return false; } diff --git a/src/mesa/drivers/dri/radeon/radeon_context.c b/src/mesa/drivers/dri/radeon/radeon_context.c index 11afe20c6a0..5ef3467ac17 100644 --- a/src/mesa/drivers/dri/radeon/radeon_context.c +++ b/src/mesa/drivers/dri/radeon/radeon_context.c @@ -155,7 +155,7 @@ r100CreateContext( gl_api api, int i; int tcl_mode, fthrottle_mode; - if (flags & ~__DRI_CTX_FLAG_DEBUG) { + if (flags & ~(__DRI_CTX_FLAG_DEBUG | __DRI_CTX_FLAG_NO_ERROR)) { *error = __DRI_CTX_ERROR_UNKNOWN_FLAG; return false; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] mesa: remove pointless assignments in init_teximage_fields_ms()
The NumSamples and FixedSampleLocation fields are set again later at the end of the function so these earlier assignments aren't needed. --- src/mesa/main/teximage.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index c30f8ac..d55d9b0 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -813,9 +813,6 @@ init_teximage_fields_ms(struct gl_context *ctx, img->Width2 = width - 2 * border; /* == 1 << img->WidthLog2; */ img->WidthLog2 = _mesa_logbase2(img->Width2); - img->NumSamples = 0; - img->FixedSampleLocations = GL_TRUE; - switch(target) { case GL_TEXTURE_1D: case GL_TEXTURE_BUFFER: -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] st/mesa: use proper resource target type in st_AllocTextureStorage()
When we validate the texture sample count, pass the correct pipe_texture_target for the texture, rather than PIPE_TEXTURE_2D. Also add more comments about MSAA. --- src/mesa/state_tracker/st_cb_texture.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c index c6a5e63..f66e1bd 100644 --- a/src/mesa/state_tracker/st_cb_texture.c +++ b/src/mesa/state_tracker/st_cb_texture.c @@ -2651,6 +2651,8 @@ st_finalize_texture(struct gl_context *ctx, /** * Called via ctx->Driver.AllocTextureStorage() to allocate texture memory * for a whole mipmap stack. + * Note: for multisample textures if the requested sample count is not + * supported, we search for the next higher supported sample count. */ static GLboolean st_AllocTextureStorage(struct gl_context *ctx, @@ -2679,10 +2681,11 @@ st_AllocTextureStorage(struct gl_context *ctx, /* Raise the sample count if the requested one is unsupported. */ if (num_samples > 1) { + enum pipe_texture_target ptarget = gl_target_to_pipe(texObj->Target); boolean found = FALSE; for (; num_samples <= ctx->Const.MaxSamples; num_samples++) { - if (screen->is_format_supported(screen, fmt, PIPE_TEXTURE_2D, + if (screen->is_format_supported(screen, fmt, ptarget, num_samples, PIPE_BIND_SAMPLER_VIEW)) { /* Update the sample count in gl_texture_image as well. */ -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] svga: add more checking of sample_count in svga_is_format_supported()
We're not supporting 2x MSAA, for example. --- src/gallium/drivers/svga/svga_screen.c | 8 1 file changed, 8 insertions(+) diff --git a/src/gallium/drivers/svga/svga_screen.c b/src/gallium/drivers/svga/svga_screen.c index 1ec91e5..0d8e59d 100644 --- a/src/gallium/drivers/svga/svga_screen.c +++ b/src/gallium/drivers/svga/svga_screen.c @@ -737,6 +737,14 @@ svga_is_format_supported( struct pipe_screen *screen, if ((ss->ms_samples & (1 << (sample_count - 1))) == 0) { return FALSE; } + if (sample_count != 4 && + sample_count != 8 && + sample_count != 16) { + /* Despite what the device supports, we don't support 2 samples, + * for example. See the WGL state tracker code. + */ + return FALSE; + } } svga_format = svga_translate_format(ss, format, bindings); -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Rename batch->exec_objects to validation_list
On Thursday, July 20, 2017 9:29:19 AM PDT Chris Wilson wrote: > Within i965, we have many different objects and confusingly when > submitting an execbuf we have lists of both our internal objects and a > list of the kernel's drm_i915_gem_exec_object with very similar names. > Rename the kernel's validation list to avoid the collison as it is only > used for interfacing with the kernel and so a peripheral use of > "object". > > Cc: Kenneth GraunkeOh :( Now I realize my comment about "we call the list validation_list" must have made no sense. Sorry about that. I thought I called it that. Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 08/13] i965: Convert reloc.target_handle into an index for I915_EXEC_HANDLE_LUT
On Thursday, July 20, 2017 8:05:19 AM PDT Chris Wilson wrote: > Quoting Kenneth Graunke (2017-07-19 23:36:58) > > On Wednesday, July 19, 2017 3:09:16 AM PDT Chris Wilson wrote: > > > #define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x)) > > > @@ -117,21 +125,12 @@ add_exec_bo(struct intel_batchbuffer *batch, struct > > > brw_bo *bo) > > > batch->exec_array_size * > > > sizeof(batch->exec_objects[0])); > > > } > > > > > > - struct drm_i915_gem_exec_object2 *validation_entry = > > > - >exec_objects[batch->exec_count]; > > > - validation_entry->handle = bo->gem_handle; > > > - if (bo == batch->bo) { > > > - validation_entry->relocation_count = batch->reloc_count; > > > - validation_entry->relocs_ptr = (uintptr_t) batch->relocs; > > > - } else { > > > - validation_entry->relocation_count = 0; > > > - validation_entry->relocs_ptr = 0; > > > - } > > > - validation_entry->alignment = bo->align; > > > - validation_entry->offset = bo->offset64; > > > - validation_entry->flags = bo->kflags; > > > - validation_entry->rsvd1 = 0; > > > - validation_entry->rsvd2 = 0; > > > + struct drm_i915_gem_exec_object2 *exec = > > > + memset(>exec_objects[batch->exec_count], 0, sizeof(*exec)); > > > + exec->handle = bo->gem_handle; > > > + exec->alignment = bo->align; > > > + exec->offset = bo->offset64; > > > + exec->flags = bo->kflags; > > > > I liked the name "validation_entry" given that we call this the "validation > > list"...exec matches the struct name better, but I think validation_entry > > helps distinguish the two lists... > > Hmm, how about > > - struct drm_i915_gem_exec_object2 *exec = > - memset(>exec_objects[batch->exec_count], 0, sizeof(*exec)); > - exec->handle = bo->gem_handle; > - exec->alignment = bo->align; > - exec->offset = bo->offset64; > - exec->flags = bo->kflags; > + batch->exec_objects[batch->exec_count] = (struct > drm_i915_gem_exec_object2){ > + .handle = bo->gem_handle, > + .alignment = bo->align, > + .offset = bo->offset64, > + .flags = bo->kflags, > + }; > > and skip the impossible problem of naming? > > But we still end up with a couple of > struct drm_i915_gem_exec_object2 * > validation_entry = >exec_objects[index]; > Could I just call those exec_object? > -Chris I'm not objecting too strongly, call it exec or exec_object if you like. The initializer use is pretty nice. "validation list" is a bit of a weird name anyway... --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] egl: add linux-dmabuf-unstable-v1-protocol.c to "nodist"
On 20 July 2017 at 15:42, Andres Gomezwrote: > On Thu, 2017-07-20 at 14:37 +0100, Emil Velikov wrote: >> On 20 July 2017 at 13:54, Daniel Stone wrote: >> > Hi Emil, >> > >> > On 20 July 2017 at 13:51, Emil Velikov wrote: >> > > On 19 July 2017 at 23:44, Andres Gomez wrote: >> > > > -dri2_backend_FILES += drivers/dri2/platform_wayland.c \ >> > > > - drivers/dri2/linux-dmabuf-unstable-v1-protocol.c >> > > > +dri2_backend_FILES += drivers/dri2/platform_wayland.c >> > > > +nodist_dri2_backend_FILES += >> > > > drivers/dri2/linux-dmabuf-unstable-v1-protocol.c >> > > > endif >> > > > >> > > > if HAVE_PLATFORM_DRM >> > > > @@ -119,6 +122,9 @@ libEGL_common_la_SOURCES += \ >> > > > $(dri2_backend_FILES) \ >> > > > $(dri3_backend_FILES) >> > > > >> > > > +nodist_libEGL_common_la_SOURCES += \ >> > > > + $(nodist_dri2_backend_FILES) >> > > > + >> > > >> > > Just add the files two generated file to BUILT_SOURCES. That's the way >> > > we handle it through the tree. >> > >> > They're already in BUILT_SOURCES (line 78 in master), so I guess >> > something else is wrong. >> > >> >> Thanks for the correction Dan. >> >> Upon a second look - BUILT_SOURCES line is in a conditional which is >> causing the issue. Moving it outside solves the problem on my system. > > Mmmm ... the BUILT_SOURCES is in the same conditional in which the > linux-dmabuf-unstable-v1-protocol.c target is define so I don't think > that would be solving the problem and it doesn't seem to be doing so in > my travis: > https://travis-ci.org/Igalia/release-mesa/jobs/255687844 > Moved the HAVE_PLATFORM_WAYLAND guard after the BUILT_SOURCES and checked the generated file. Which seemingly is not enough... I have an idea - let's seen how well it pans out. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/13] i965: Allow passing target_bo=NULL to brw_emit_reloc()
On Thursday, July 20, 2017 7:29:52 AM PDT Chris Wilson wrote: > Quoting Chris Wilson (2017-07-20 15:15:02) > > Quoting Kenneth Graunke (2017-07-19 21:08:23) > > > On Wednesday, July 19, 2017 3:09:10 AM PDT Chris Wilson wrote: > > > > Sometimes we want to emit a relocation to a NULL surface when the > > > > constructing the batch. If we push the NULL handling into the common > > > > brw_emit_reloc() we can make the batch construction itself more > > > > readable. > > > > > > I don't like this... > > > > > > There is no such thing as a "relocation to a NULL surface". No relocation > > > is emittted in this case. It either means the field is relative to a base > > > address, and is simply an offset, or the address is unused and we're > > > setting > > > a NULL pointer combined with other bits packed into the same DWord. > > > > There's actually no such thing as a relocation ;) I only put it out > > because there were so many duplicated checks. > > The difference is that in the always use brw_emit_reloc patch end up > with > > @@ -661,18 +662,17 @@ brw_emit_buffer_surface_state(struct brw_context *brw, >out_offset); > > isl_buffer_fill_state(>isl_dev, dw, > - .address = (bo ? bo->offset64 : 0) + buffer_offset, > + .address = (bo ? > + brw_emit_reloc(>batch, > +*out_offset + > brw->isl_dev.ss.addr_offset, > +bo, buffer_offset, > +I915_GEM_DOMAIN_SAMPLER, > +(rw ? > I915_GEM_DOMAIN_SAMPLER : 0)) : > + buffer_offset), Not that much of an eyesore if you do... .address = !bo ? buffer_offset : brw_emit_reloc(>batch, *out_offset + brw->isl_dev.ss.addr_offset, bo, buffer_offset, I915_GEM_DOMAIN_SAMPLER, (rw ? I915_GEM_DOMAIN_SAMPLER : 0), one extra line at the top, no extra parens or else case hanging off the end... > .size = buffer_size, > .format = surface_format, > .stride = pitch, > .mocs = tex_mocs[brw->gen]); > > which was a bit too much of an eyesore for me. > -Chris signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Rename batch->exec_objects to validation_list
Within i965, we have many different objects and confusingly when submitting an execbuf we have lists of both our internal objects and a list of the kernel's drm_i915_gem_exec_object with very similar names. Rename the kernel's validation list to avoid the collison as it is only used for interfacing with the kernel and so a peripheral use of "object". Cc: Kenneth Graunke--- src/mesa/drivers/dri/i965/brw_context.h | 4 +++- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 22 +++--- 2 files changed, 14 insertions(+), 12 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index ffe4792b73..2acebaa820 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -458,11 +458,13 @@ struct intel_batchbuffer { struct drm_i915_gem_relocation_entry *relocs; int reloc_count; int reloc_array_size; + /** The validation list */ - struct drm_i915_gem_exec_object2 *exec_objects; + struct drm_i915_gem_exec_object2 *validation_list; struct brw_bo **exec_bos; int exec_count; int exec_array_size; + /** The amount of aperture space (in bytes) used by all exec_bos */ int aperture_space; diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index 28c2f474c0..4461a59b80 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c @@ -78,8 +78,8 @@ intel_batchbuffer_init(struct intel_batchbuffer *batch, batch->exec_array_size = 100; batch->exec_bos = malloc(batch->exec_array_size * sizeof(batch->exec_bos[0])); - batch->exec_objects = - malloc(batch->exec_array_size * sizeof(batch->exec_objects[0])); + batch->validation_list = + malloc(batch->exec_array_size * sizeof(batch->validation_list[0])); if (INTEL_DEBUG & DEBUG_BATCH) { batch->state_batch_sizes = @@ -162,7 +162,7 @@ intel_batchbuffer_free(struct intel_batchbuffer *batch) } free(batch->relocs); free(batch->exec_bos); - free(batch->exec_objects); + free(batch->validation_list); brw_bo_unreference(batch->last_bo); brw_bo_unreference(batch->bo); @@ -532,13 +532,13 @@ add_exec_bo(struct intel_batchbuffer *batch, struct brw_bo *bo) batch->exec_bos = realloc(batch->exec_bos, batch->exec_array_size * sizeof(batch->exec_bos[0])); - batch->exec_objects = - realloc(batch->exec_objects, - batch->exec_array_size * sizeof(batch->exec_objects[0])); + batch->validation_list = + realloc(batch->validation_list, + batch->exec_array_size * sizeof(batch->validation_list[0])); } struct drm_i915_gem_exec_object2 *validation_entry = - >exec_objects[batch->exec_count]; + >validation_list[batch->exec_count]; validation_entry->handle = bo->gem_handle; if (bo == batch->bo) { validation_entry->relocation_count = batch->reloc_count; @@ -568,7 +568,7 @@ execbuffer(int fd, int flags) { struct drm_i915_gem_execbuffer2 execbuf = { - .buffers_ptr = (uintptr_t) batch->exec_objects, + .buffers_ptr = (uintptr_t) batch->validation_list, .buffer_count = batch->exec_count, .batch_start_offset = 0, .batch_len = used, @@ -599,10 +599,10 @@ execbuffer(int fd, bo->idle = false; /* Update brw_bo::offset64 */ - if (batch->exec_objects[i].offset != bo->offset64) { + if (batch->validation_list[i].offset != bo->offset64) { DBG("BO %d migrated: 0x%" PRIx64 " -> 0x%llx\n", - bo->gem_handle, bo->offset64, batch->exec_objects[i].offset); - bo->offset64 = batch->exec_objects[i].offset; + bo->gem_handle, bo->offset64, batch->validation_list[i].offset); + bo->offset64 = batch->validation_list[i].offset; } } -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/32] i965/miptree: Make layer_range_length return locical layers
On Thu, Jul 20, 2017 at 8:08 AM, Jason Ekstrandwrote: > On July 20, 2017 2:27:50 AM "Pohjolainen, Topi" < > topi.pohjolai...@gmail.com> wrote: > > >> This still leaves create_aux_state_map() using physical number of layers >> for >> the actual allocation. I toyed a little with this hoping to put it in >> front my >> i965-to-isl work. In the end it looks to me that moving away from physical >> is easier/cleaner once the conversion to isl is done. This prevents us >> from >> addressing both isl-based and native slice table-based leaving just the >> isl-based. Moreover in case of slice table-based we would need to >> calculate >> a layer-number divider as the slice table itself stores depth as physical. >> >> There are altogether 6 places calling get_num_phys_layers() in the end. >> I think these document nicely all the locations we need to consider. How >> does >> this sound? >> > > That's fine with me. > To put a finer point on it, I'm fine with your patch landing as-is (go ahead and add my R-B). However, I am going to have to switch it over to logical as part of my series since the MCS partial resolve code relies on it being logical. > On Wed, Jul 19, 2017 at 02:01:33PM -0700, Jason Ekstrand wrote: >> >>> --- >>> src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 14 ++ >>> 1 file changed, 10 insertions(+), 4 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c >>> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c >>> index 3eac077..8d8ea43 100644 >>> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c >>> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c >>> @@ -2461,12 +2461,18 @@ miptree_layer_range_length(const struct >>> intel_mipmap_tree *mt, uint32_t level, >>> assert(level <= mt->last_level); >>> uint32_t total_num_layers; >>> >>> - if (mt->surf.size > 0) >>> + if (mt->surf.size > 0) { >>>total_num_layers = mt->surf.dim == ISL_SURF_DIM_3D ? >>> - minify(mt->surf.phys_level0_sa.depth, level) : >>> - mt->surf.phys_level0_sa.array_len; >>> - else >>> + minify(mt->surf.logical_level0_px.depth, level) : >>> + mt->surf.logical_level0_px.array_len; >>> + } else { >>>total_num_layers = mt->level[level].depth; >>> + if (mt->msaa_layout == INTEL_MSAA_LAYOUT_UMS || >>> + mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) { >>> + assert(total_num_layers % mt->num_samples == 0); >>> + total_num_layers /= mt->num_samples; >>> + } >>> + } >>> >>> assert(start_layer < total_num_layers); >>> if (num_layers == INTEL_REMAINING_LAYERS) >>> -- >>> 2.5.0.400.gff86faf >>> >>> ___ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev >>> >> > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/13] i965: Always create the batch with the batch object in the first execobject slot
Quoting Kenneth Graunke (2017-07-19 23:43:04) > On Wednesday, July 19, 2017 3:09:17 AM PDT Chris Wilson wrote: > > Even if we are using older kernels that do not accept the batch in the > > first slot, we can simplify our code by creating the batch with itself > > in the first slot and moving it to the end on execbuf submission. > > --- > > src/mesa/drivers/dri/i965/intel_batchbuffer.c | 70 > > --- > > 1 file changed, 31 insertions(+), 39 deletions(-) > > Alternatively, instead of swapping them out, we could simply add_exec_bo the > batch at the end, and in execbuffer() do: > > if (!use_batch_first) { >execbuf.buffers_ptr++; >execbuf.buffers_count--; > } > > to skip over the batchbuffer entry at the beginning. That seems easier... To do that neatly I thought I would need to break apart execbuffer(). Something like, diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index f88e000b71..729f411be2 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c @@ -645,8 +645,14 @@ do_flush_locked(struct brw_context *brw, int in_fence_fd, int *out_fence_fd) } } - if (!brw->screen->no_hw) { - unsigned int flags; + if (ret == 0 && !brw->screen->no_hw) { + struct drm_i915_gem_execbuffer2 execbuf = { + .buffers_ptr = (uintptr_t) batch->exec_objects, + .buffer_count = batch->exec_count, + .batch_len = used, + /* rsvd1 is actually the context ID */ + .rsvd1 = batch->ring == RENDER_RING ? brw->hw_ctx : 0; + }; /* The requirement for using I915_EXEC_NO_RELOC are: * @@ -660,37 +666,63 @@ do_flush_locked(struct brw_context *brw, int in_fence_fd, int *out_fence_fd) * To avoid stalling, execobject.offset should match the current * address of that object within the active context. */ - flags = I915_EXEC_NO_RELOC; + execbuf.flags = I915_EXEC_NO_RELOC; if (brw->gen >= 6 && batch->ring == BLT_RING) { - flags |= I915_EXEC_BLT; + execbuf.flags |= I915_EXEC_BLT; } else { - flags |= I915_EXEC_RENDER; + execbuf.flags |= I915_EXEC_RENDER; } if (batch->needs_sol_reset) -flags |= I915_EXEC_GEN7_SOL_RESET; +execbuf.flags |= I915_EXEC_GEN7_SOL_RESET; - struct drm_i915_gem_exec_object2 *exec_object = >exec_objects[0]; + unsigned int index; + if (batch->use_exec_lut) { + execbuf.flags |= I915_EXEC_BATCH_FIRST | I915_EXEC_HANDLE_LUT; + index = 0; + } else { + index = add_exec_bo(batch, target); + execbuf.buffers_ptr = (uintptr_t) (batch->exec_objects + 1); + } + struct drm_i915_gem_exec_object2 *exec_object = + >exec_objects[index]; assert(exec_object->handle == batch->bo->gem_handle); exec_object->relocation_count = batch->reloc_count; exec_object->relocs_ptr = (uintptr_t) batch->relocs; - if (batch->use_exec_lut) { - flags |= I915_EXEC_BATCH_FIRST | I915_EXEC_HANDLE_LUT; - } else { - struct drm_i915_gem_exec_object2 tmp = *exec_object; - unsigned int index = batch->exec_count - 1; - *exec_object = batch->exec_objects[index]; - batch->exec_objects[index] = tmp; + + unsigned long cmd = DRM_IOCTL_I915_GEM_EXECBUFFER2; + + if (in_fence_fd != -1) { + execbuf.rsvd2 = in_fence; + execbuf.flags |= I915_EXEC_FENCE_IN; } - if (ret == 0) { - uint32_t hw_ctx = batch->ring == RENDER_RING ? brw->hw_ctx : 0; + if (out_fence_fd != NULL) { + cmd = DRM_IOCTL_I915_GEM_EXECBUFFER2_WR; + *out_fence_fd = -1; + execbuf.flags |= I915_EXEC_FENCE_OUT; + } + + if (drmIoctl(fd, cmd, )) + ret = -errno; + + for (int i = 0; i < batch->exec_count; i++) { + struct brw_bo *bo = batch->exec_bos[i]; - ret = execbuffer(dri_screen->fd, batch, hw_ctx, - 4 * USED_BATCH(*batch), - in_fence_fd, out_fence_fd, flags); + bo->idle = false; + bo->index = -1; + + /* Update brw_bo::offset64 */ + if (batch->exec_objects[i].offset != bo->offset64) { +DBG("BO %d migrated: 0x%" PRIx64 " -> 0x%llx\n", +bo->gem_handle, bo->offset64, batch->exec_objects[i].offset); +bo->offset64 = batch->exec_objects[i].offset; + } } + if (ret == 0 && out_fence != NULL) + *out_fence = execbuf.rsvd2 >> 32; + throttle(brw); } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] svga: fix default case in svga_get_sample_position()
If called for an unsupported number of samples, always return (.5, .5). Fixes the Piglit arb_texture_multisample-fb-completeness test for unsupported sample counts, such as 2. Ideally, this function should not get called for unsupported sample counts, but that'll be additional work... --- src/gallium/drivers/svga/svga_surface.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/gallium/drivers/svga/svga_surface.c b/src/gallium/drivers/svga/svga_surface.c index d7c9850..64a85cf 100644 --- a/src/gallium/drivers/svga/svga_surface.c +++ b/src/gallium/drivers/svga/svga_surface.c @@ -899,6 +899,7 @@ svga_get_sample_position(struct pipe_context *context, break; default: positions = pos1; + sample_index = 0; } pos_out[0] = positions[sample_index][0]; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/32] i965/miptree: Make layer_range_length return locical layers
On July 20, 2017 2:27:50 AM "Pohjolainen, Topi"wrote: This still leaves create_aux_state_map() using physical number of layers for the actual allocation. I toyed a little with this hoping to put it in front my i965-to-isl work. In the end it looks to me that moving away from physical is easier/cleaner once the conversion to isl is done. This prevents us from addressing both isl-based and native slice table-based leaving just the isl-based. Moreover in case of slice table-based we would need to calculate a layer-number divider as the slice table itself stores depth as physical. There are altogether 6 places calling get_num_phys_layers() in the end. I think these document nicely all the locations we need to consider. How does this sound? That's fine with me. On Wed, Jul 19, 2017 at 02:01:33PM -0700, Jason Ekstrand wrote: --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 3eac077..8d8ea43 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -2461,12 +2461,18 @@ miptree_layer_range_length(const struct intel_mipmap_tree *mt, uint32_t level, assert(level <= mt->last_level); uint32_t total_num_layers; - if (mt->surf.size > 0) + if (mt->surf.size > 0) { total_num_layers = mt->surf.dim == ISL_SURF_DIM_3D ? - minify(mt->surf.phys_level0_sa.depth, level) : - mt->surf.phys_level0_sa.array_len; - else + minify(mt->surf.logical_level0_px.depth, level) : + mt->surf.logical_level0_px.array_len; + } else { total_num_layers = mt->level[level].depth; + if (mt->msaa_layout == INTEL_MSAA_LAYOUT_UMS || + mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) { + assert(total_num_layers % mt->num_samples == 0); + total_num_layers /= mt->num_samples; + } + } assert(start_layer < total_num_layers); if (num_layers == INTEL_REMAINING_LAYERS) -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Android: fix spirv_info.c generation
On 20 July 2017 at 11:30, Chih-Wei Huangwrote: > It's incorrect to use $(LOCAL_PATH) in makefile recipes since it's > changing. The typical way to handle it is to use private variable. > Fortunately in this case we can just simplify them to $^. > > See further: > https://patchwork.freedesktop.org/patch/167718/ > > Also simplify LOCAL_GENERATED_SOURCES. > > Fixes: 2dd4e2ec (spirv: Generate spirv_info.c) > > Signed-off-by: Chih-Wei Huang Reviewed-by: Emil Velikov -Eiml ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 08/13] i965: Convert reloc.target_handle into an index for I915_EXEC_HANDLE_LUT
Quoting Kenneth Graunke (2017-07-19 23:36:58) > On Wednesday, July 19, 2017 3:09:16 AM PDT Chris Wilson wrote: > > #define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x)) > > @@ -117,21 +125,12 @@ add_exec_bo(struct intel_batchbuffer *batch, struct > > brw_bo *bo) > > batch->exec_array_size * sizeof(batch->exec_objects[0])); > > } > > > > - struct drm_i915_gem_exec_object2 *validation_entry = > > - >exec_objects[batch->exec_count]; > > - validation_entry->handle = bo->gem_handle; > > - if (bo == batch->bo) { > > - validation_entry->relocation_count = batch->reloc_count; > > - validation_entry->relocs_ptr = (uintptr_t) batch->relocs; > > - } else { > > - validation_entry->relocation_count = 0; > > - validation_entry->relocs_ptr = 0; > > - } > > - validation_entry->alignment = bo->align; > > - validation_entry->offset = bo->offset64; > > - validation_entry->flags = bo->kflags; > > - validation_entry->rsvd1 = 0; > > - validation_entry->rsvd2 = 0; > > + struct drm_i915_gem_exec_object2 *exec = > > + memset(>exec_objects[batch->exec_count], 0, sizeof(*exec)); > > + exec->handle = bo->gem_handle; > > + exec->alignment = bo->align; > > + exec->offset = bo->offset64; > > + exec->flags = bo->kflags; > > I liked the name "validation_entry" given that we call this the "validation > list"...exec matches the struct name better, but I think validation_entry > helps distinguish the two lists... Hmm, how about - struct drm_i915_gem_exec_object2 *exec = - memset(>exec_objects[batch->exec_count], 0, sizeof(*exec)); - exec->handle = bo->gem_handle; - exec->alignment = bo->align; - exec->offset = bo->offset64; - exec->flags = bo->kflags; + batch->exec_objects[batch->exec_count] = (struct drm_i915_gem_exec_object2){ + .handle = bo->gem_handle, + .alignment = bo->align, + .offset = bo->offset64, + .flags = bo->kflags, + }; and skip the impossible problem of naming? But we still end up with a couple of struct drm_i915_gem_exec_object2 * validation_entry = >exec_objects[index]; Could I just call those exec_object? -Chris ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] android: fix spirv_info generation
On 20 July 2017 at 11:02, Chih-Wei Huangwrote: > 2017-07-19 15:12 GMT+08:00 Tapani Pälli : >> Depending on build order, LOCAL_PATH maybe set or not (and can't >> be trusted to have assumed path), change modifies all occurences >> of LOCAL_PATH as locally defined COMPILER_PATH instead. >> >> Signed-off-by: Tapani Pälli >> --- >> src/compiler/Android.nir.gen.mk | 38 -- >> 1 file changed, 20 insertions(+), 18 deletions(-) >> >> diff --git a/src/compiler/Android.nir.gen.mk >> b/src/compiler/Android.nir.gen.mk >> index 4507ac4..81511de 100644 >> --- a/src/compiler/Android.nir.gen.mk >> +++ b/src/compiler/Android.nir.gen.mk >> @@ -27,6 +27,8 @@ ifeq ($(LOCAL_MODULE_CLASS),) >> LOCAL_MODULE_CLASS := STATIC_LIBRARIES >> endif >> >> +COMPILER_PATH := $(MESA_TOP)/src/compiler >> + >> intermediates := $(call local-generated-sources-dir) >> >> LOCAL_SRC_FILES := $(LOCAL_SRC_FILES) >> @@ -48,48 +50,48 @@ MESA_GEN_NIR_H := $(addprefix $(call >> local-generated-sources-dir)/, \ >> nir/nir_opcodes.h \ >> nir/nir_builder_opcodes.h) >> >> -nir_builder_opcodes_gen := $(LOCAL_PATH)/nir/nir_builder_opcodes_h.py >> +nir_builder_opcodes_gen := $(COMPILER_PATH)/nir/nir_builder_opcodes_h.py >> nir_builder_opcodes_deps := \ >> - $(LOCAL_PATH)/nir/nir_opcodes.py \ >> - $(LOCAL_PATH)/nir/nir_builder_opcodes_h.py >> + $(COMPILER_PATH)/nir/nir_opcodes.py \ >> + $(COMPILER_PATH)/nir/nir_builder_opcodes_h.py >> >> $(intermediates)/nir/nir_builder_opcodes.h: $(nir_builder_opcodes_deps) >> @mkdir -p $(dir $@) >> $(hide) $(MESA_PYTHON2) $(nir_builder_opcodes_gen) $< > $@ >> >> -nir_constant_expressions_gen := >> $(LOCAL_PATH)/nir/nir_constant_expressions.py >> +nir_constant_expressions_gen := >> $(COMPILER_PATH)/nir/nir_constant_expressions.py >> nir_constant_expressions_deps := \ >> - $(LOCAL_PATH)/nir/nir_opcodes.py \ >> - $(LOCAL_PATH)/nir/nir_constant_expressions.py >> + $(COMPILER_PATH)/nir/nir_opcodes.py \ >> + $(COMPILER_PATH)/nir/nir_constant_expressions.py >> >> $(intermediates)/nir/nir_constant_expressions.c: >> $(nir_constant_expressions_deps) >> @mkdir -p $(dir $@) >> $(hide) $(MESA_PYTHON2) $(nir_constant_expressions_gen) $< > $@ >> >> -nir_opcodes_h_gen := $(LOCAL_PATH)/nir/nir_opcodes_h.py >> +nir_opcodes_h_gen := $(COMPILER_PATH)/nir/nir_opcodes_h.py >> nir_opcodes_h_deps := \ >> - $(LOCAL_PATH)/nir/nir_opcodes.py \ >> - $(LOCAL_PATH)/nir/nir_opcodes_h.py >> + $(COMPILER_PATH)/nir/nir_opcodes.py \ >> + $(COMPILER_PATH)/nir/nir_opcodes_h.py >> >> $(intermediates)/nir/nir_opcodes.h: $(nir_opcodes_h_deps) >> @mkdir -p $(dir $@) >> $(hide) $(MESA_PYTHON2) $(nir_opcodes_h_gen) $< > $@ >> >> -$(LOCAL_PATH)/nir/nir.h: $(intermediates)/nir/nir_opcodes.h >> +$(COMPILER_PATH)/nir/nir.h: $(intermediates)/nir/nir_opcodes.h >> >> -nir_opcodes_c_gen := $(LOCAL_PATH)/nir/nir_opcodes_c.py >> +nir_opcodes_c_gen := $(COMPILER_PATH)/nir/nir_opcodes_c.py >> nir_opcodes_c_deps := \ >> - $(LOCAL_PATH)/nir/nir_opcodes.py \ >> - $(LOCAL_PATH)/nir/nir_opcodes_c.py >> + $(COMPILER_PATH)/nir/nir_opcodes.py \ >> + $(COMPILER_PATH)/nir/nir_opcodes_c.py >> >> $(intermediates)/nir/nir_opcodes.c: $(nir_opcodes_c_deps) >> @mkdir -p $(dir $@) >> $(hide) $(MESA_PYTHON2) $(nir_opcodes_c_gen) $< > $@ >> >> -nir_opt_algebraic_gen := $(LOCAL_PATH)/nir/nir_opt_algebraic.py >> +nir_opt_algebraic_gen := $(COMPILER_PATH)/nir/nir_opt_algebraic.py >> nir_opt_algebraic_deps := \ >> - $(LOCAL_PATH)/nir/nir_opt_algebraic.py \ >> - $(LOCAL_PATH)/nir/nir_algebraic.py >> + $(COMPILER_PATH)/nir/nir_opt_algebraic.py \ >> + $(COMPILER_PATH)/nir/nir_algebraic.py >> >> $(intermediates)/nir/nir_opt_algebraic.c: $(nir_opt_algebraic_deps) >> @mkdir -p $(dir $@) >> @@ -98,6 +100,6 @@ $(intermediates)/nir/nir_opt_algebraic.c: >> $(nir_opt_algebraic_deps) >> LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \ >> $(SPIRV_GENERATED_FILES)) >> >> -$(intermediates)/spirv/spirv_info.c: $(LOCAL_PATH)/spirv/spirv_info_c.py >> $(LOCAL_PATH)/spirv/spirv.core.grammar.json >> +$(intermediates)/spirv/spirv_info.c: $(COMPILER_PATH)/spirv/spirv_info_c.py >> $(COMPILER_PATH)/spirv/spirv.core.grammar.json >> @mkdir -p $(dir $@) >> - $(hide) $(MESA_PYTHON2) $(LOCAL_PATH)/spirv/spirv_info_c.py >> $(LOCAL_PATH)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) >> + $(hide) $(MESA_PYTHON2) $(COMPILER_PATH)/spirv/spirv_info_c.py >> $(COMPILER_PATH)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) > > OK. I see the real problem. > The rules to build spirv_info.c are incorrectly > to use $(LOCAL_PATH). > Basically speaking, $(LOCAL_PATH) can't be used > in the recipes[1] since it is always changing. > When the recipe rules
Re: [Mesa-dev] [RFC PATCH] radeonsi: set a per-buffer flag that disables inter-process sharing (v2)
On Jul 19, 2017 10:21 PM, "zhoucm1"wrote: On 2017年07月19日 23:34, Marek Olšák wrote: On Jul 19, 2017 3:36 AM, "zhoucm1" wrote: On 2017年07月19日 04:08, Marek Olšák wrote: > From: Marek Olšák > > For lower overhead in the CS ioctl. > Winsys allocators are not used with interprocess-sharable resources. > Hi Marek, Could I know from how your this way reduces overhead in CS ioctl? reusing BO to short bo list? The kernel part of the work hasn't been done yet. The idea is that nonsharable buffers don't have to be revalidated by TTM, OK, Maybe I only can see the whole picture of this idea when you complete kernel part. Out of curious, why/how can nonsharable buffers be revalidated by TTM without exposing like amdgpu_bo_make_resident api? I think the idea is that all nonsharable buffers will be backed by the same reservation object, so TTM can skip buffer validation if no buffer has been moved. It's just an optimization for the current design. With mentioned in another thread, if we can expose make_resident api, we can remove bo_list, even we can remove reservation operation in CS ioctl. And now, I think our bo list is a very bad design, first, umd must create bo list for every command submission, this is a extra cpu overhead compared with traditional way. second, kernel also have to iterate the list, when bo list is too long, like OpenCL program, they always throw several thousands BOs to bo list, reservation must keep these thousands ww_mutex safe, CPU overhead is too big. So I strongly suggest we should expose make_resident api to user space. if cannot, I want to know any specific reason to see if we can solve it. Yeah, I think the BO list idea is likely to die sooner or later. It made sense for GL before bindless was a thing. Nowadays I don't see much value in it. MesaGL will keep tracking the BO list because it's a requirement for good GL performance (it determines whether to flush IBs before BO synchronization, it allows tracking fences for each BO, which are used to determine dependencies between IBs, and that all allows async SDMA and async compute for GL, which doesn't have separate queues). However, we don't need any BO list at the libdrm level and lower. I think a BO_CREATE flag that causes that the buffer is added to a kernel-side per-fd BO list would be sufficient. How the kernel manages its BO list should be its own implementation detail. Initially we can just move the current BO list management into the kernel. Marek Regards, David Zhou so it should remove a lot of kernel overhead and the BO list remains the same. Marek Thanks, David Zhou > v2: It shouldn't crash anymore, but the kernel will reject the new flag. > --- > src/gallium/drivers/radeon/r600_buffer_common.c | 7 + > src/gallium/drivers/radeon/radeon_winsys.h | 20 +++--- > src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 36 > - > src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 27 +++ > 4 files changed, 62 insertions(+), 28 deletions(-) > > diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c > b/src/gallium/drivers/radeon/r600_buffer_common.c > index dd1c209..2747ac4 100644 > --- a/src/gallium/drivers/radeon/r600_buffer_common.c > +++ b/src/gallium/drivers/radeon/r600_buffer_common.c > @@ -160,20 +160,27 @@ void r600_init_resource_fields(struct > r600_common_screen *rscreen, > } > /* Tiled textures are unmappable. Always put them in VRAM. */ > if ((res->b.b.target != PIPE_BUFFER && !rtex->surface.is_linear) || > res->flags & R600_RESOURCE_FLAG_UNMAPPABLE) { > res->domains = RADEON_DOMAIN_VRAM; > res->flags |= RADEON_FLAG_NO_CPU_ACCESS | > RADEON_FLAG_GTT_WC; > } > + /* Only displayable single-sample textures can be shared between > +* processes. */ > + if (res->b.b.target == PIPE_BUFFER || > + res->b.b.nr_samples >= 2 || > + rtex->surface.micro_tile_mode != RADEON_MICRO_MODE_DISPLAY) > + res->flags |= RADEON_FLAG_NO_INTERPROCESS_SHARING; > + > /* If VRAM is just stolen system memory, allow both VRAM and > * GTT, whichever has free space. If a buffer is evicted from > * VRAM to GTT, it will stay there. > * > * DRM 3.6.0 has good BO move throttling, so we can allow VRAM-only > * placements even with a low amount of stolen VRAM. > */ > if (!rscreen->info.has_dedicated_vram && > (rscreen->info.drm_major < 3 || rscreen->info.drm_minor < 6) && > res->domains == RADEON_DOMAIN_VRAM) { > diff --git a/src/gallium/drivers/radeon/radeon_winsys.h > b/src/gallium/drivers/radeon/radeon_winsys.h > index 351edcd..0abcb56 100644 > --- a/src/gallium/drivers/radeon/radeon_winsys.h > +++ b/src/gallium/drivers/radeon/radeon_winsys.h > @@
Re: [Mesa-dev] [PATCH 2/2] egl: add linux-dmabuf-unstable-v1-protocol.c to "nodist"
On Thu, 2017-07-20 at 14:37 +0100, Emil Velikov wrote: > On 20 July 2017 at 13:54, Daniel Stonewrote: > > Hi Emil, > > > > On 20 July 2017 at 13:51, Emil Velikov wrote: > > > On 19 July 2017 at 23:44, Andres Gomez wrote: > > > > -dri2_backend_FILES += drivers/dri2/platform_wayland.c \ > > > > - drivers/dri2/linux-dmabuf-unstable-v1-protocol.c > > > > +dri2_backend_FILES += drivers/dri2/platform_wayland.c > > > > +nodist_dri2_backend_FILES += > > > > drivers/dri2/linux-dmabuf-unstable-v1-protocol.c > > > > endif > > > > > > > > if HAVE_PLATFORM_DRM > > > > @@ -119,6 +122,9 @@ libEGL_common_la_SOURCES += \ > > > > $(dri2_backend_FILES) \ > > > > $(dri3_backend_FILES) > > > > > > > > +nodist_libEGL_common_la_SOURCES += \ > > > > + $(nodist_dri2_backend_FILES) > > > > + > > > > > > Just add the files two generated file to BUILT_SOURCES. That's the way > > > we handle it through the tree. > > > > They're already in BUILT_SOURCES (line 78 in master), so I guess > > something else is wrong. > > > > Thanks for the correction Dan. > > Upon a second look - BUILT_SOURCES line is in a conditional which is > causing the issue. Moving it outside solves the problem on my system. Mmmm ... the BUILT_SOURCES is in the same conditional in which the linux-dmabuf-unstable-v1-protocol.c target is define so I don't think that would be solving the problem and it doesn't seem to be doing so in my travis: https://travis-ci.org/Igalia/release-mesa/jobs/255687844 -- Br, Andres signature.asc Description: This is a digitally signed message part ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/13] i965: Allow passing target_bo=NULL to brw_emit_reloc()
Quoting Chris Wilson (2017-07-20 15:15:02) > Quoting Kenneth Graunke (2017-07-19 21:08:23) > > On Wednesday, July 19, 2017 3:09:10 AM PDT Chris Wilson wrote: > > > Sometimes we want to emit a relocation to a NULL surface when the > > > constructing the batch. If we push the NULL handling into the common > > > brw_emit_reloc() we can make the batch construction itself more > > > readable. > > > > I don't like this... > > > > There is no such thing as a "relocation to a NULL surface". No relocation > > is emittted in this case. It either means the field is relative to a base > > address, and is simply an offset, or the address is unused and we're setting > > a NULL pointer combined with other bits packed into the same DWord. > > There's actually no such thing as a relocation ;) I only put it out > because there were so many duplicated checks. The difference is that in the always use brw_emit_reloc patch end up with @@ -661,18 +662,17 @@ brw_emit_buffer_surface_state(struct brw_context *brw, out_offset); isl_buffer_fill_state(>isl_dev, dw, - .address = (bo ? bo->offset64 : 0) + buffer_offset, + .address = (bo ? + brw_emit_reloc(>batch, +*out_offset + brw->isl_dev.ss.addr_offset, +bo, buffer_offset, +I915_GEM_DOMAIN_SAMPLER, +(rw ? I915_GEM_DOMAIN_SAMPLER : 0)) : + buffer_offset), .size = buffer_size, .format = surface_format, .stride = pitch, .mocs = tex_mocs[brw->gen]); which was a bit too much of an eyesore for me. -Chris ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: include texture size in error messages
Reviewed-by: Alejandro PiñeiroOn 20/07/17 15:56, Brian Paul wrote: > --- > src/mesa/main/teximage.c | 9 + > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c > index 5e13025..c30f8ac 100644 > --- a/src/mesa/main/teximage.c > +++ b/src/mesa/main/teximage.c > @@ -3007,8 +3007,8 @@ teximage(struct gl_context *ctx, GLboolean compressed, > GLuint dims, > >if (!dimensionsOK) { > _mesa_error(ctx, GL_INVALID_VALUE, > - "%s%uD(invalid width or height or depth)", > - func, dims); > + "%s%uD(invalid width=%d or height=%d or depth=%d)", > + func, dims, width, height, depth); > return; >} > > @@ -3833,7 +3833,8 @@ copyteximage(struct gl_context *ctx, GLuint dims, >if (!_mesa_legal_texture_dimensions(ctx, target, level, width, height, >1, border)) { > _mesa_error(ctx, GL_INVALID_VALUE, > - "glCopyTexImage%uD(invalid width or height)", dims); > + "glCopyTexImage%uD(invalid width=%d or height=%d)", > + dims, width, height); > return; >} > } > @@ -5743,7 +5744,7 @@ texture_image_multisample(struct gl_context *ctx, > GLuint dims, > else { >if (!dimensionsOK) { > _mesa_error(ctx, GL_INVALID_VALUE, > - "%s(invalid width or height)", func); > + "%s(invalid width=%d or height=%d)", func, width, > height); > return; >} > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: include texture size in error messages
Reviewed-by: Samuel PitoisetOn 07/20/2017 03:56 PM, Brian Paul wrote: --- src/mesa/main/teximage.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 5e13025..c30f8ac 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -3007,8 +3007,8 @@ teximage(struct gl_context *ctx, GLboolean compressed, GLuint dims, if (!dimensionsOK) { _mesa_error(ctx, GL_INVALID_VALUE, - "%s%uD(invalid width or height or depth)", - func, dims); + "%s%uD(invalid width=%d or height=%d or depth=%d)", + func, dims, width, height, depth); return; } @@ -3833,7 +3833,8 @@ copyteximage(struct gl_context *ctx, GLuint dims, if (!_mesa_legal_texture_dimensions(ctx, target, level, width, height, 1, border)) { _mesa_error(ctx, GL_INVALID_VALUE, - "glCopyTexImage%uD(invalid width or height)", dims); + "glCopyTexImage%uD(invalid width=%d or height=%d)", + dims, width, height); return; } } @@ -5743,7 +5744,7 @@ texture_image_multisample(struct gl_context *ctx, GLuint dims, else { if (!dimensionsOK) { _mesa_error(ctx, GL_INVALID_VALUE, - "%s(invalid width or height)", func); + "%s(invalid width=%d or height=%d)", func, width, height); return; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/13] i965: Allow passing target_bo=NULL to brw_emit_reloc()
Quoting Kenneth Graunke (2017-07-19 21:08:23) > On Wednesday, July 19, 2017 3:09:10 AM PDT Chris Wilson wrote: > > Sometimes we want to emit a relocation to a NULL surface when the > > constructing the batch. If we push the NULL handling into the common > > brw_emit_reloc() we can make the batch construction itself more > > readable. > > I don't like this... > > There is no such thing as a "relocation to a NULL surface". No relocation > is emittted in this case. It either means the field is relative to a base > address, and is simply an offset, or the address is unused and we're setting > a NULL pointer combined with other bits packed into the same DWord. There's actually no such thing as a relocation ;) I only put it out because there were so many duplicated checks. -Chris ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 101467] swr driver leaks memory (texture management)
https://bugs.freedesktop.org/show_bug.cgi?id=101467 Andrés Gómez Garcíachanged: What|Removed |Added Resolution|--- |FIXED Status|REOPENED|RESOLVED --- Comment #8 from Andrés Gómez García --- (In reply to Bruce Cherniak from comment #7) > Available in 17.1.5 with commit 5c91fcfa. Thanks Bruce, I should actually have noted that myself. I suppose we can close, then ... -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: include texture size in error messages
--- src/mesa/main/teximage.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 5e13025..c30f8ac 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -3007,8 +3007,8 @@ teximage(struct gl_context *ctx, GLboolean compressed, GLuint dims, if (!dimensionsOK) { _mesa_error(ctx, GL_INVALID_VALUE, - "%s%uD(invalid width or height or depth)", - func, dims); + "%s%uD(invalid width=%d or height=%d or depth=%d)", + func, dims, width, height, depth); return; } @@ -3833,7 +3833,8 @@ copyteximage(struct gl_context *ctx, GLuint dims, if (!_mesa_legal_texture_dimensions(ctx, target, level, width, height, 1, border)) { _mesa_error(ctx, GL_INVALID_VALUE, - "glCopyTexImage%uD(invalid width or height)", dims); + "glCopyTexImage%uD(invalid width=%d or height=%d)", + dims, width, height); return; } } @@ -5743,7 +5744,7 @@ texture_image_multisample(struct gl_context *ctx, GLuint dims, else { if (!dimensionsOK) { _mesa_error(ctx, GL_INVALID_VALUE, - "%s(invalid width or height)", func); + "%s(invalid width=%d or height=%d)", func, width, height); return; } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] egl: add linux-dmabuf-unstable-v1-protocol.c to "nodist"
On 20 July 2017 at 13:54, Daniel Stonewrote: > Hi Emil, > > On 20 July 2017 at 13:51, Emil Velikov wrote: >> On 19 July 2017 at 23:44, Andres Gomez wrote: >>> -dri2_backend_FILES += drivers/dri2/platform_wayland.c \ >>> - drivers/dri2/linux-dmabuf-unstable-v1-protocol.c >>> +dri2_backend_FILES += drivers/dri2/platform_wayland.c >>> +nodist_dri2_backend_FILES += >>> drivers/dri2/linux-dmabuf-unstable-v1-protocol.c >>> endif >>> >>> if HAVE_PLATFORM_DRM >>> @@ -119,6 +122,9 @@ libEGL_common_la_SOURCES += \ >>> $(dri2_backend_FILES) \ >>> $(dri3_backend_FILES) >>> >>> +nodist_libEGL_common_la_SOURCES += \ >>> + $(nodist_dri2_backend_FILES) >>> + >> Just add the files two generated file to BUILT_SOURCES. That's the way >> we handle it through the tree. > > They're already in BUILT_SOURCES (line 78 in master), so I guess > something else is wrong. > Thanks for the correction Dan. Upon a second look - BUILT_SOURCES line is in a conditional which is causing the issue. Moving it outside solves the problem on my system. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] egl: add linux-dmabuf-unstable-v1-protocol.c to "nodist"
On Thursday, 2017-07-20 09:09:37 +0100, Daniel Stone wrote: > On 19 July 2017 at 23:44, Andres Gomezwrote: > > This fixes `make distcheck` > > > >> make[3]: *** No rule to make target > >> 'drivers/dri2/linux-dmabuf-unstable-v1-protocol.c', needed by 'distdir'. > >> Stop. > >> make[3]: Entering directory '/home/local/mesa/src/egl' > >> make[3]: Leaving directory '/home/local/mesa/src/egl' > >> make[2]: *** [distdir] Error 1 > >> make[1]: *** [distdir] Error 1 > >> make: *** [dist] Error 2 > > Reviewed-by: Daniel Stone > > Mark - could you please insert a 'distcheck' run into CI? Doing this on travis, I'll let someone else figure out if/how appveyor can do that. > > Cheers, > Daniel ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 101467] swr driver leaks memory (texture management)
https://bugs.freedesktop.org/show_bug.cgi?id=101467 --- Comment #7 from Bruce Cherniak--- Available in 17.1.5 with commit 5c91fcfa. -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] egl: add linux-dmabuf-unstable-v1-protocol.c to "nodist"
Hi Emil, On 20 July 2017 at 13:51, Emil Velikovwrote: > On 19 July 2017 at 23:44, Andres Gomez wrote: >> -dri2_backend_FILES += drivers/dri2/platform_wayland.c \ >> - drivers/dri2/linux-dmabuf-unstable-v1-protocol.c >> +dri2_backend_FILES += drivers/dri2/platform_wayland.c >> +nodist_dri2_backend_FILES += >> drivers/dri2/linux-dmabuf-unstable-v1-protocol.c >> endif >> >> if HAVE_PLATFORM_DRM >> @@ -119,6 +122,9 @@ libEGL_common_la_SOURCES += \ >> $(dri2_backend_FILES) \ >> $(dri3_backend_FILES) >> >> +nodist_libEGL_common_la_SOURCES += \ >> + $(nodist_dri2_backend_FILES) >> + > Just add the files two generated file to BUILT_SOURCES. That's the way > we handle it through the tree. They're already in BUILT_SOURCES (line 78 in master), so I guess something else is wrong. Cheers, Daniel ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] egl: add linux-dmabuf-unstable-v1-protocol.c to "nodist"
On 19 July 2017 at 23:44, Andres Gomezwrote: > This fixes `make distcheck` > >> make[3]: *** No rule to make target >> 'drivers/dri2/linux-dmabuf-unstable-v1-protocol.c', needed by 'distdir'. >> Stop. >> make[3]: Entering directory '/home/local/mesa/src/egl' >> make[3]: Leaving directory '/home/local/mesa/src/egl' >> make[2]: *** [distdir] Error 1 >> make[1]: *** [distdir] Error 1 >> make: *** [dist] Error 2 > > Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers") > Cc: Emil Velikov > Signed-off-by: Andres Gomez > --- > src/egl/Makefile.am | 10 -- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am > index 7c1a4929b8..6ee1fb9be8 100644 > --- a/src/egl/Makefile.am > +++ b/src/egl/Makefile.am > @@ -44,10 +44,13 @@ noinst_LTLIBRARIES = libEGL_common.la > libEGL_common_la_SOURCES = \ > $(LIBEGL_C_FILES) > > +nodist_libEGL_common_la_SOURCES = > + > libEGL_common_la_LIBADD = \ > $(EGL_LIB_DEPS) > > dri2_backend_FILES = > +nodist_dri2_backend_FILES = > dri3_backend_FILES = > > if HAVE_PLATFORM_X11 > @@ -84,8 +87,8 @@ libEGL_common_la_LIBADD += $(WAYLAND_LIBS) > libEGL_common_la_LIBADD += $(LIBDRM_LIBS) > libEGL_common_la_LIBADD += > $(top_builddir)/src/egl/wayland/wayland-drm/libwayland-drm.la > libEGL_common_la_LIBADD += $(top_builddir)/src/util/libmesautil.la > -dri2_backend_FILES += drivers/dri2/platform_wayland.c \ > - drivers/dri2/linux-dmabuf-unstable-v1-protocol.c > +dri2_backend_FILES += drivers/dri2/platform_wayland.c > +nodist_dri2_backend_FILES += drivers/dri2/linux-dmabuf-unstable-v1-protocol.c > endif > > if HAVE_PLATFORM_DRM > @@ -119,6 +122,9 @@ libEGL_common_la_SOURCES += \ > $(dri2_backend_FILES) \ > $(dri3_backend_FILES) > > +nodist_libEGL_common_la_SOURCES += \ > + $(nodist_dri2_backend_FILES) > + Just add the files two generated file to BUILT_SOURCES. That's the way we handle it through the tree. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] broadcom: correct header file in BROADCOM_FILES
On 19 July 2017 at 23:44, Andres Gomezwrote: > This fixes `make distcheck` > >> make[3]: *** No rule to make target 'common/v3d_devinfo.h', needed by >> 'distdir'. Stop. >> make[3]: Leaving directory '/home/local/mesa/src/broadcom' >> Makefile:945: recipe for target 'distdir' failed >> make[2]: Leaving directory '/home/local/mesa/src' >> make[2]: *** [distdir] Error 1 >> make[1]: *** [distdir] Error 1 > > Fixes: 42799c ("broadcom: Introduce a header for talking about chip > revisions.") > Cc: Emil Velikov > Signed-off-by: Andres Gomez Reviewed-by: Emil Velikov Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Android: fix spirv_info.c generation
Nice, I think it was done like this in some other occurence too; Reviewed-by: Tapani PälliOn 07/20/2017 01:30 PM, Chih-Wei Huang wrote: It's incorrect to use $(LOCAL_PATH) in makefile recipes since it's changing. The typical way to handle it is to use private variable. Fortunately in this case we can just simplify them to $^. See further: https://patchwork.freedesktop.org/patch/167718/ Also simplify LOCAL_GENERATED_SOURCES. Fixes: 2dd4e2ec (spirv: Generate spirv_info.c) Signed-off-by: Chih-Wei Huang --- src/compiler/Android.nir.gen.mk | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/src/compiler/Android.nir.gen.mk b/src/compiler/Android.nir.gen.mk index 4507ac4..e2187d0 100644 --- a/src/compiler/Android.nir.gen.mk +++ b/src/compiler/Android.nir.gen.mk @@ -41,7 +41,7 @@ LOCAL_EXPORT_C_INCLUDE_DIRS += \ $(MESA_TOP)/src/compiler/nir LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \ - $(NIR_GENERATED_FILES)) + $(NIR_GENERATED_FILES) $(SPIRV_GENERATED_FILES)) # Modules using libmesa_nir must set LOCAL_GENERATED_SOURCES to this MESA_GEN_NIR_H := $(addprefix $(call local-generated-sources-dir)/, \ @@ -95,9 +95,6 @@ $(intermediates)/nir/nir_opt_algebraic.c: $(nir_opt_algebraic_deps) @mkdir -p $(dir $@) $(hide) $(MESA_PYTHON2) $(nir_opt_algebraic_gen) $< > $@ -LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \ - $(SPIRV_GENERATED_FILES)) - $(intermediates)/spirv/spirv_info.c: $(LOCAL_PATH)/spirv/spirv_info_c.py $(LOCAL_PATH)/spirv/spirv.core.grammar.json @mkdir -p $(dir $@) - $(hide) $(MESA_PYTHON2) $(LOCAL_PATH)/spirv/spirv_info_c.py $(LOCAL_PATH)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) + $(hide) $(MESA_PYTHON2) $^ $@ || ($(RM) $@; false) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
From: Aravindan MuthukumarThis patch improves CPI Rate(Cycles per Instruction) and branch mispredict for i965. The function check_state() was showing CPI retired rate. Performance stats with android: CPI retired lowered by 28% (lower is better) Branch missprediction lowered by 13% (lower is better) 3DMark improved by 2% The dissassembly doesn't show difference, although above results were observed with patch. Signed-off-by: Aravindan Muthukumar Signedd-off-by: Yogesh Marathe Tested-by: Asish --- Changes since V1: - Removed memset() change - Changed commit message as per review comments src/mesa/drivers/dri/i965/brw_defines.h | 4 src/mesa/drivers/dri/i965/brw_state_upload.c | 12 2 files changed, 12 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 2a8dbf8..8c9a510 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { # define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) #endif + +/* Checking the state of mesa and brw before emitting atoms */ +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) + diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c index acaa97e..1c8b969 100644 --- a/src/mesa/drivers/dri/i965/brw_state_upload.c +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, struct brw_state_flags *state, const struct brw_tracked_state *atom) { - if (check_state(state, >dirty)) { atom->emit(brw); merge_ctx_state(brw, state); - } } static inline void @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw, const struct brw_tracked_state *atom = [i]; struct brw_state_flags generated; - check_and_emit_atom(brw, , atom); + /* Checking the state and emitting atoms */ + if (CHECK_BRW_STATE(state, atom->dirty)) { +check_and_emit_atom(brw, , atom); + } accumulate_state(, >dirty); @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw, for (i = 0; i < num_atoms; i++) { const struct brw_tracked_state *atom = [i]; - check_and_emit_atom(brw, , atom); + /* Checking the state and emitting atoms */ + if (CHECK_BRW_STATE(state, atom->dirty)) { +check_and_emit_atom(brw, , atom); + } } } -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH] radeonsi: set a per-buffer flag that disables inter-process sharing (v2)
On 07/20/2017 04:20 AM, zhoucm1 wrote: On 2017年07月19日 23:34, Marek Olšák wrote: On Jul 19, 2017 3:36 AM, "zhoucm1"> wrote: On 2017年07月19日 04:08, Marek Olšák wrote: From: Marek Olšák > For lower overhead in the CS ioctl. Winsys allocators are not used with interprocess-sharable resources. Hi Marek, Could I know from how your this way reduces overhead in CS ioctl? reusing BO to short bo list? The kernel part of the work hasn't been done yet. The idea is that nonsharable buffers don't have to be revalidated by TTM, OK, Maybe I only can see the whole picture of this idea when you complete kernel part. Out of curious, why/how can nonsharable buffers be revalidated by TTM without exposing like amdgpu_bo_make_resident api? With mentioned in another thread, if we can expose make_resident api, we can remove bo_list, even we can remove reservation operation in CS ioctl. And now, I think our bo list is a very bad design, first, umd must create bo list for every command submission, this is a extra cpu overhead compared with traditional way. second, kernel also have to iterate the list, when bo list is too long, like OpenCL program, they always throw several thousands BOs to bo list, reservation must keep these thousands ww_mutex safe, CPU overhead is too big. So I strongly suggest we should expose make_resident api to user space. if cannot, I want to know any specific reason to see if we can solve it. Introducing a make_resident API will also help ARB_bindless_texture a lot, because currently when a texture is marked as resident, we have to re-validate the related buffers for every new CS, like traditional buffers. With a resident BO list the whole mechanism could be skipped. Regards, David Zhou so it should remove a lot of kernel overhead and the BO list remains the same. Marek Thanks, David Zhou v2: It shouldn't crash anymore, but the kernel will reject the new flag. --- src/gallium/drivers/radeon/r600_buffer_common.c | 7 + src/gallium/drivers/radeon/radeon_winsys.h | 20 +++--- src/gallium/winsys/amdgpu/drm/amdgpu_bo.c| 36 - src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 27 +++ 4 files changed, 62 insertions(+), 28 deletions(-) diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c index dd1c209..2747ac4 100644 --- a/src/gallium/drivers/radeon/r600_buffer_common.c +++ b/src/gallium/drivers/radeon/r600_buffer_common.c @@ -160,20 +160,27 @@ void r600_init_resource_fields(struct r600_common_screen *rscreen, } /* Tiled textures are unmappable. Always put them in VRAM. */ if ((res->b.b.target != PIPE_BUFFER && !rtex->surface.is_linear) || res->flags & R600_RESOURCE_FLAG_UNMAPPABLE) { res->domains = RADEON_DOMAIN_VRAM; res->flags |= RADEON_FLAG_NO_CPU_ACCESS | RADEON_FLAG_GTT_WC; } + /* Only displayable single-sample textures can be shared between +* processes. */ + if (res->b.b.target == PIPE_BUFFER || + res->b.b.nr_samples >= 2 || + rtex->surface.micro_tile_mode != RADEON_MICRO_MODE_DISPLAY) + res->flags |= RADEON_FLAG_NO_INTERPROCESS_SHARING; + /* If VRAM is just stolen system memory, allow both VRAM and * GTT, whichever has free space. If a buffer is evicted from * VRAM to GTT, it will stay there. * * DRM 3.6.0 has good BO move throttling, so we can allow VRAM-only * placements even with a low amount of stolen VRAM. */ if (!rscreen->info.has_dedicated_vram && (rscreen->info.drm_major < 3 || rscreen->info.drm_minor < 6) && res->domains == RADEON_DOMAIN_VRAM) { diff --git a/src/gallium/drivers/radeon/radeon_winsys.h b/src/gallium/drivers/radeon/radeon_winsys.h index 351edcd..0abcb56 100644 --- a/src/gallium/drivers/radeon/radeon_winsys.h +++ b/src/gallium/drivers/radeon/radeon_winsys.h @@ -47,20 +47,21 @@ enum radeon_bo_domain { /* bitfield */ RADEON_DOMAIN_GTT = 2, RADEON_DOMAIN_VRAM = 4, RADEON_DOMAIN_VRAM_GTT = RADEON_DOMAIN_VRAM | RADEON_DOMAIN_GTT }; enum radeon_bo_flag { /* bitfield */ RADEON_FLAG_GTT_WC =
Re: [Mesa-dev] [PATCH 10/10] mesa: remove useless assert in _mesa_TextureView()
Series: Reviewed-by: Timothy ArceriOn 20/07/17 19:53, Samuel Pitoiset wrote: Already checked in _mesa_choose_texture_format(). Signed-off-by: Samuel Pitoiset --- src/mesa/main/textureview.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/mesa/main/textureview.c b/src/mesa/main/textureview.c index ed66c17958..cef9caff41 100644 --- a/src/mesa/main/textureview.c +++ b/src/mesa/main/textureview.c @@ -633,7 +633,6 @@ _mesa_TextureView(GLuint texture, GLenum target, GLuint origtexture, texFormat = _mesa_choose_texture_format(ctx, texObj, target, 0, internalformat, GL_NONE, GL_NONE); - assert(texFormat != MESA_FORMAT_NONE); if (texFormat == MESA_FORMAT_NONE) return; newViewNumLevels = MIN2(numlevels, origTexObj->NumLevels - minlevel); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] android: fix spirv_info generation
2017-07-20 18:02 GMT+08:00 Chih-Wei Huang: > > OK. I see the real problem. > The rules to build spirv_info.c are incorrectly > to use $(LOCAL_PATH). > Basically speaking, $(LOCAL_PATH) can't be used > in the recipes[1] since it is always changing. > When the recipe rules are executed its value > is not you expected. > (using it in targets and prerequisites is OK) > > The typical way to handle it is to use private variable: > > $(intermediates)/spirv/spirv_info.c: PRIVATE_LOCAL_PATH := $(LOCAL_PATH) > > Then use PRIVATE_LOCAL_PATH in the recpies. > > But in this case, seems it can just be simplified to $^ > (which means all the prerequisites) > > $(intermediates)/spirv/spirv_info.c: > $(COMPILER_PATH)/spirv/spirv_info_c.py > $(COMPILER_PATH)/spirv/spirv.core.grammar.json Sorry. I meant $(LOCAL_PATH) instead of $(COMPILER_PATH). (copied wrong lines) I've submitted a real patch for it. > @mkdir -p $(dir $@) > $(hide) $(MESA_PYTHON2) $^ $@ || ($(RM) $@; false) -- Chih-Wei Android-x86 project http://www.android-x86.org ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] Android: fix spirv_info.c generation
It's incorrect to use $(LOCAL_PATH) in makefile recipes since it's changing. The typical way to handle it is to use private variable. Fortunately in this case we can just simplify them to $^. See further: https://patchwork.freedesktop.org/patch/167718/ Also simplify LOCAL_GENERATED_SOURCES. Fixes: 2dd4e2ec (spirv: Generate spirv_info.c) Signed-off-by: Chih-Wei Huang--- src/compiler/Android.nir.gen.mk | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/src/compiler/Android.nir.gen.mk b/src/compiler/Android.nir.gen.mk index 4507ac4..e2187d0 100644 --- a/src/compiler/Android.nir.gen.mk +++ b/src/compiler/Android.nir.gen.mk @@ -41,7 +41,7 @@ LOCAL_EXPORT_C_INCLUDE_DIRS += \ $(MESA_TOP)/src/compiler/nir LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \ - $(NIR_GENERATED_FILES)) + $(NIR_GENERATED_FILES) $(SPIRV_GENERATED_FILES)) # Modules using libmesa_nir must set LOCAL_GENERATED_SOURCES to this MESA_GEN_NIR_H := $(addprefix $(call local-generated-sources-dir)/, \ @@ -95,9 +95,6 @@ $(intermediates)/nir/nir_opt_algebraic.c: $(nir_opt_algebraic_deps) @mkdir -p $(dir $@) $(hide) $(MESA_PYTHON2) $(nir_opt_algebraic_gen) $< > $@ -LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \ - $(SPIRV_GENERATED_FILES)) - $(intermediates)/spirv/spirv_info.c: $(LOCAL_PATH)/spirv/spirv_info_c.py $(LOCAL_PATH)/spirv/spirv.core.grammar.json @mkdir -p $(dir $@) - $(hide) $(MESA_PYTHON2) $(LOCAL_PATH)/spirv/spirv_info_c.py $(LOCAL_PATH)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) + $(hide) $(MESA_PYTHON2) $^ $@ || ($(RM) $@; false) -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 11/32] i965/miptree: Add a helper for getting the aux usage for texturing
On Wed, Jul 19, 2017 at 02:01:37PM -0700, Jason Ekstrand wrote: > --- > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 59 > ++- > src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 4 ++ > 2 files changed, 43 insertions(+), 20 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > index 2d2a813..0a63178 100644 > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > @@ -2679,6 +2679,33 @@ can_texture_with_ccs(struct brw_context *brw, > return true; > } > > +enum isl_aux_usage > +intel_miptree_texture_aux_usage(struct brw_context *brw, > +struct intel_mipmap_tree *mt, Could be const. > +enum isl_format view_format) > +{ > + switch (mt->aux_usage) { > + case ISL_AUX_USAGE_HIZ: > + if (intel_miptree_sample_with_hiz(brw, mt)) > + return ISL_AUX_USAGE_HIZ; > + break; > + > + case ISL_AUX_USAGE_MCS: > + return ISL_AUX_USAGE_MCS; > + > + case ISL_AUX_USAGE_CCS_D: > + case ISL_AUX_USAGE_CCS_E: > + if (mt->mcs_buf && can_texture_with_ccs(brw, mt, view_format)) > + return ISL_AUX_USAGE_CCS_E; > + break; > + > + default: > + break; > + } > + > + return ISL_AUX_USAGE_NONE; > +} > + > static void > intel_miptree_prepare_texture_slices(struct brw_context *brw, > struct intel_mipmap_tree *mt, > @@ -2687,31 +2714,23 @@ intel_miptree_prepare_texture_slices(struct > brw_context *brw, > uint32_t start_layer, uint32_t > num_layers, > bool *aux_supported_out) > { > - bool aux_supported, clear_supported; > - if (_mesa_is_format_color_format(mt->format)) { > - if (mt->num_samples > 1) { > - aux_supported = clear_supported = true; > - } else { > - aux_supported = can_texture_with_ccs(brw, mt, view_format); > - } > + enum isl_aux_usage aux_usage = > + intel_miptree_texture_aux_usage(brw, mt, view_format); This as well. > + bool clear_supported = aux_usage != ISL_AUX_USAGE_NONE; > > - /* Clear color is specified as ints or floats and the conversion is > - * done by the sampler. If we have a texture view, we would have to > - * perform the clear color conversion manually. Just disable clear > - * color. > - */ > - clear_supported = aux_supported && (mt->format == view_format); > - } else if (mt->format == MESA_FORMAT_S_UINT8) { > - aux_supported = clear_supported = false; > - } else { > - aux_supported = clear_supported = intel_miptree_sample_with_hiz(brw, > mt); > - } > + /* Clear color is specified as ints or floats and the conversion is done > by > +* the sampler. If we have a texture view, we would have to perform the > +* clear color conversion manually. Just disable clear color. > +*/ > + if (mt->format != view_format) > + clear_supported = false; > > intel_miptree_prepare_access(brw, mt, start_level, num_levels, > start_layer, num_layers, > -aux_supported, clear_supported); > +aux_usage != ISL_AUX_USAGE_NONE, > +clear_supported); > if (aux_supported_out) > - *aux_supported_out = aux_supported; > + *aux_supported_out = aux_usage != ISL_AUX_USAGE_NONE; > } > > void > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > index 45ac5df..64ea413 100644 > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > @@ -923,6 +923,10 @@ intel_miptree_access_raw(struct brw_context *brw, >intel_miptree_finish_write(brw, mt, level, layer, 1, false); > } > > +enum isl_aux_usage > +intel_miptree_texture_aux_usage(struct brw_context *brw, > +struct intel_mipmap_tree *mt, > +enum isl_format view_format); > void > intel_miptree_prepare_texture(struct brw_context *brw, >struct intel_mipmap_tree *mt, > -- > 2.5.0.400.gff86faf > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] android: fix spirv_info generation
2017-07-19 15:12 GMT+08:00 Tapani Pälli: > Depending on build order, LOCAL_PATH maybe set or not (and can't > be trusted to have assumed path), change modifies all occurences > of LOCAL_PATH as locally defined COMPILER_PATH instead. > > Signed-off-by: Tapani Pälli > --- > src/compiler/Android.nir.gen.mk | 38 -- > 1 file changed, 20 insertions(+), 18 deletions(-) > > diff --git a/src/compiler/Android.nir.gen.mk b/src/compiler/Android.nir.gen.mk > index 4507ac4..81511de 100644 > --- a/src/compiler/Android.nir.gen.mk > +++ b/src/compiler/Android.nir.gen.mk > @@ -27,6 +27,8 @@ ifeq ($(LOCAL_MODULE_CLASS),) > LOCAL_MODULE_CLASS := STATIC_LIBRARIES > endif > > +COMPILER_PATH := $(MESA_TOP)/src/compiler > + > intermediates := $(call local-generated-sources-dir) > > LOCAL_SRC_FILES := $(LOCAL_SRC_FILES) > @@ -48,48 +50,48 @@ MESA_GEN_NIR_H := $(addprefix $(call > local-generated-sources-dir)/, \ > nir/nir_opcodes.h \ > nir/nir_builder_opcodes.h) > > -nir_builder_opcodes_gen := $(LOCAL_PATH)/nir/nir_builder_opcodes_h.py > +nir_builder_opcodes_gen := $(COMPILER_PATH)/nir/nir_builder_opcodes_h.py > nir_builder_opcodes_deps := \ > - $(LOCAL_PATH)/nir/nir_opcodes.py \ > - $(LOCAL_PATH)/nir/nir_builder_opcodes_h.py > + $(COMPILER_PATH)/nir/nir_opcodes.py \ > + $(COMPILER_PATH)/nir/nir_builder_opcodes_h.py > > $(intermediates)/nir/nir_builder_opcodes.h: $(nir_builder_opcodes_deps) > @mkdir -p $(dir $@) > $(hide) $(MESA_PYTHON2) $(nir_builder_opcodes_gen) $< > $@ > > -nir_constant_expressions_gen := $(LOCAL_PATH)/nir/nir_constant_expressions.py > +nir_constant_expressions_gen := > $(COMPILER_PATH)/nir/nir_constant_expressions.py > nir_constant_expressions_deps := \ > - $(LOCAL_PATH)/nir/nir_opcodes.py \ > - $(LOCAL_PATH)/nir/nir_constant_expressions.py > + $(COMPILER_PATH)/nir/nir_opcodes.py \ > + $(COMPILER_PATH)/nir/nir_constant_expressions.py > > $(intermediates)/nir/nir_constant_expressions.c: > $(nir_constant_expressions_deps) > @mkdir -p $(dir $@) > $(hide) $(MESA_PYTHON2) $(nir_constant_expressions_gen) $< > $@ > > -nir_opcodes_h_gen := $(LOCAL_PATH)/nir/nir_opcodes_h.py > +nir_opcodes_h_gen := $(COMPILER_PATH)/nir/nir_opcodes_h.py > nir_opcodes_h_deps := \ > - $(LOCAL_PATH)/nir/nir_opcodes.py \ > - $(LOCAL_PATH)/nir/nir_opcodes_h.py > + $(COMPILER_PATH)/nir/nir_opcodes.py \ > + $(COMPILER_PATH)/nir/nir_opcodes_h.py > > $(intermediates)/nir/nir_opcodes.h: $(nir_opcodes_h_deps) > @mkdir -p $(dir $@) > $(hide) $(MESA_PYTHON2) $(nir_opcodes_h_gen) $< > $@ > > -$(LOCAL_PATH)/nir/nir.h: $(intermediates)/nir/nir_opcodes.h > +$(COMPILER_PATH)/nir/nir.h: $(intermediates)/nir/nir_opcodes.h > > -nir_opcodes_c_gen := $(LOCAL_PATH)/nir/nir_opcodes_c.py > +nir_opcodes_c_gen := $(COMPILER_PATH)/nir/nir_opcodes_c.py > nir_opcodes_c_deps := \ > - $(LOCAL_PATH)/nir/nir_opcodes.py \ > - $(LOCAL_PATH)/nir/nir_opcodes_c.py > + $(COMPILER_PATH)/nir/nir_opcodes.py \ > + $(COMPILER_PATH)/nir/nir_opcodes_c.py > > $(intermediates)/nir/nir_opcodes.c: $(nir_opcodes_c_deps) > @mkdir -p $(dir $@) > $(hide) $(MESA_PYTHON2) $(nir_opcodes_c_gen) $< > $@ > > -nir_opt_algebraic_gen := $(LOCAL_PATH)/nir/nir_opt_algebraic.py > +nir_opt_algebraic_gen := $(COMPILER_PATH)/nir/nir_opt_algebraic.py > nir_opt_algebraic_deps := \ > - $(LOCAL_PATH)/nir/nir_opt_algebraic.py \ > - $(LOCAL_PATH)/nir/nir_algebraic.py > + $(COMPILER_PATH)/nir/nir_opt_algebraic.py \ > + $(COMPILER_PATH)/nir/nir_algebraic.py > > $(intermediates)/nir/nir_opt_algebraic.c: $(nir_opt_algebraic_deps) > @mkdir -p $(dir $@) > @@ -98,6 +100,6 @@ $(intermediates)/nir/nir_opt_algebraic.c: > $(nir_opt_algebraic_deps) > LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \ > $(SPIRV_GENERATED_FILES)) > > -$(intermediates)/spirv/spirv_info.c: $(LOCAL_PATH)/spirv/spirv_info_c.py > $(LOCAL_PATH)/spirv/spirv.core.grammar.json > +$(intermediates)/spirv/spirv_info.c: $(COMPILER_PATH)/spirv/spirv_info_c.py > $(COMPILER_PATH)/spirv/spirv.core.grammar.json > @mkdir -p $(dir $@) > - $(hide) $(MESA_PYTHON2) $(LOCAL_PATH)/spirv/spirv_info_c.py > $(LOCAL_PATH)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) > + $(hide) $(MESA_PYTHON2) $(COMPILER_PATH)/spirv/spirv_info_c.py > $(COMPILER_PATH)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) OK. I see the real problem. The rules to build spirv_info.c are incorrectly to use $(LOCAL_PATH). Basically speaking, $(LOCAL_PATH) can't be used in the recipes[1] since it is always changing. When the recipe rules are executed its value is not you expected. (using it in targets and prerequisites is OK) The typical way to handle it is to use private variable: $(intermediates)/spirv/spirv_info.c:
Re: [Mesa-dev] [PATCH 09/32] i965/miptree: Add support for partially resolving MCS
On Wed, Jul 19, 2017 at 02:01:35PM -0700, Jason Ekstrand wrote: > --- > src/mesa/drivers/dri/i965/brw_blorp.c | 24 > src/mesa/drivers/dri/i965/brw_blorp.h | 5 > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 40 > +-- > 3 files changed, 67 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c > b/src/mesa/drivers/dri/i965/brw_blorp.c > index efa3b39..ac47f31 100644 > --- a/src/mesa/drivers/dri/i965/brw_blorp.c > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c > @@ -1042,6 +1042,30 @@ brw_blorp_resolve_color(struct brw_context *brw, > struct intel_mipmap_tree *mt, > brw_emit_end_of_pipe_sync(brw, PIPE_CONTROL_RENDER_TARGET_FLUSH); > } > > +void > +brw_blorp_mcs_partial_resolve(struct brw_context *brw, > + struct intel_mipmap_tree *mt, > + uint32_t start_layer, uint32_t num_layers) > +{ > + DBG("%s to mt %p layers %u-%u\n", __FUNCTION__, mt, > + start_layer, start_layer + num_layers - 1); > + > + const mesa_format format = _mesa_get_srgb_format_linear(mt->format); > + enum isl_format isl_format = brw_blorp_to_isl_format(brw, format, true); > + > + struct isl_surf isl_tmp[1]; > + struct blorp_surf surf; > + uint32_t level = 0; > + blorp_surf_for_miptree(brw, , mt, true, false, 0, > + , start_layer, num_layers, isl_tmp); > + > + struct blorp_batch batch; > + blorp_batch_init(>blorp, , brw, 0); > + blorp_mcs_partial_resolve(, , isl_format, > + start_layer, num_layers); > + blorp_batch_finish(); > +} > + > /** > * Perform a HiZ or depth resolve operation. > * > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h > b/src/mesa/drivers/dri/i965/brw_blorp.h > index 29d5788..c65a68a 100644 > --- a/src/mesa/drivers/dri/i965/brw_blorp.h > +++ b/src/mesa/drivers/dri/i965/brw_blorp.h > @@ -74,6 +74,11 @@ brw_blorp_resolve_color(struct brw_context *brw, > enum blorp_fast_clear_op resolve_op); > > void > +brw_blorp_mcs_partial_resolve(struct brw_context *brw, > + struct intel_mipmap_tree *mt, > + uint32_t start_layer, uint32_t num_layers); > + > +void > intel_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt, > unsigned int level, unsigned int start_layer, > unsigned int num_layers, enum blorp_hiz_op op); > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > index 2521190..1fd39a1 100644 > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > @@ -2323,6 +2323,35 @@ intel_miptree_finish_ccs_write(struct brw_context *brw, > } > > static void > +intel_miptree_prepare_mcs_access(struct brw_context *brw, > + struct intel_mipmap_tree *mt, > + uint32_t layer, > + bool mcs_supported, > + bool fast_clear_supported) > +{ > + switch (intel_miptree_get_aux_state(mt, 0, layer)) { > + case ISL_AUX_STATE_CLEAR: > + case ISL_AUX_STATE_COMPRESSED_CLEAR: > + assert(mcs_supported); > + if (!fast_clear_supported) { > + brw_blorp_mcs_partial_resolve(brw, mt, layer, 1); > + intel_miptree_set_aux_state(brw, mt, 0, layer, 1, > + ISL_AUX_STATE_COMPRESSED_NO_CLEAR); > + } > + break; > + > + case ISL_AUX_STATE_COMPRESSED_NO_CLEAR: > + assert(mcs_supported); > + break; /* Nothing to do */ > + > + case ISL_AUX_STATE_RESOLVED: > + case ISL_AUX_STATE_PASS_THROUGH: > + case ISL_AUX_STATE_AUX_INVALID: > + unreachable("Invalid aux state for MCS"); > + } > +} > + > +static void > intel_miptree_finish_mcs_write(struct brw_context *brw, > struct intel_mipmap_tree *mt, > uint32_t layer, > @@ -2336,10 +2365,10 @@ intel_miptree_finish_mcs_write(struct brw_context > *brw, >break; > > case ISL_AUX_STATE_COMPRESSED_CLEAR: > + case ISL_AUX_STATE_COMPRESSED_NO_CLEAR: >assert(written_with_mcs); >break; /* Nothing to do */ > > - case ISL_AUX_STATE_COMPRESSED_NO_CLEAR: > case ISL_AUX_STATE_RESOLVED: > case ISL_AUX_STATE_PASS_THROUGH: > case ISL_AUX_STATE_AUX_INVALID: > @@ -2499,7 +2528,14 @@ intel_miptree_prepare_access(struct brw_context *brw, > >if (mt->num_samples > 1) { > /* Nothing to do for MSAA */ We should drop this comment now, right? > - assert(aux_supported && fast_clear_supported); > + assert(start_level == 0 && num_levels == 1); > + const uint32_t level_layers = > +miptree_layer_range_length(mt, 0, start_layer, num_layers); > + for (uint32_t a = 0; a <
[Mesa-dev] [PATCH 02/10] mesa: pass the 'caller' function to texstorage()
To be consistent with texturestorage(). Signed-off-by: Samuel Pitoiset--- src/mesa/main/texstorage.c | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/src/mesa/main/texstorage.c b/src/mesa/main/texstorage.c index 7519ca2807..99a169e5ff 100644 --- a/src/mesa/main/texstorage.c +++ b/src/mesa/main/texstorage.c @@ -475,7 +475,7 @@ texture_storage(struct gl_context *ctx, GLuint dims, */ static void texstorage(GLuint dims, GLenum target, GLsizei levels, GLenum internalformat, - GLsizei width, GLsizei height, GLsizei depth) + GLsizei width, GLsizei height, GLsizei depth, const char *caller) { struct gl_texture_object *texObj; GET_CURRENT_CONTEXT(ctx); @@ -485,14 +485,13 @@ texstorage(GLuint dims, GLenum target, GLsizei levels, GLenum internalformat, */ if (!legal_texobj_target(ctx, dims, target)) { _mesa_error(ctx, GL_INVALID_ENUM, - "glTexStorage%uD(illegal target=%s)", - dims, _mesa_enum_to_string(target)); + "%s(illegal target=%s)", + caller, _mesa_enum_to_string(target)); return; } if (MESA_VERBOSE & (VERBOSE_API|VERBOSE_TEXTURE)) - _mesa_debug(ctx, "glTexStorage%uD %s %d %s %d %d %d\n", - dims, + _mesa_debug(ctx, "%s %s %d %s %d %d %d\n", caller, _mesa_enum_to_string(target), levels, _mesa_enum_to_string(internalformat), width, height, depth); @@ -500,7 +499,7 @@ texstorage(GLuint dims, GLenum target, GLsizei levels, GLenum internalformat, /* Check the format to make sure it is sized. */ if (!_mesa_is_legal_tex_storage_format(ctx, internalformat)) { _mesa_error(ctx, GL_INVALID_ENUM, - "glTexStorage%uD(internalformat = %s)", dims, + "%s(internalformat = %s)", caller, _mesa_enum_to_string(internalformat)); return; } @@ -562,7 +561,8 @@ void GLAPIENTRY _mesa_TexStorage1D(GLenum target, GLsizei levels, GLenum internalformat, GLsizei width) { - texstorage(1, target, levels, internalformat, width, 1, 1); + texstorage(1, target, levels, internalformat, width, 1, 1, + "glTexStorage1D"); } @@ -570,7 +570,8 @@ void GLAPIENTRY _mesa_TexStorage2D(GLenum target, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height) { - texstorage(2, target, levels, internalformat, width, height, 1); + texstorage(2, target, levels, internalformat, width, height, 1, + "glTexStorage2D"); } @@ -578,7 +579,8 @@ void GLAPIENTRY _mesa_TexStorage3D(GLenum target, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth) { - texstorage(3, target, levels, internalformat, width, height, depth); + texstorage(3, target, levels, internalformat, width, height, depth, + "glTexStorage3D"); } -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 09/10] mesa: remove duplicated code around framebuffer_renderbuffer()
Signed-off-by: Samuel Pitoiset--- src/mesa/main/fbobject.c | 70 1 file changed, 23 insertions(+), 47 deletions(-) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index 7c92df5608..46bc129eff 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -3645,15 +3645,29 @@ _mesa_framebuffer_renderbuffer(struct gl_context *ctx, } static void -framebuffer_renderbuffer(struct gl_context *ctx, - struct gl_framebuffer *fb, - GLenum attachment, - struct gl_renderbuffer *rb, - const char *func) +framebuffer_renderbuffer(struct gl_context *ctx, struct gl_framebuffer *fb, + GLenum attachment, GLenum renderbuffertarget, + GLuint renderbuffer, const char *func) { struct gl_renderbuffer_attachment *att; + struct gl_renderbuffer *rb; bool is_color_attachment; + if (renderbuffertarget != GL_RENDERBUFFER) { + _mesa_error(ctx, GL_INVALID_ENUM, + "%s(renderbuffertarget is not GL_RENDERBUFFER)", func); + return; + } + + if (renderbuffer) { + rb = _mesa_lookup_renderbuffer_err(ctx, renderbuffer, func); + if (!rb) + return; + } else { + /* remove renderbuffer attachment */ + rb = NULL; + } + if (_mesa_is_winsys_fbo(fb)) { /* Can't attach new renderbuffers to a window system framebuffer */ _mesa_error(ctx, GL_INVALID_OPERATION, @@ -3707,7 +3721,6 @@ _mesa_FramebufferRenderbuffer(GLenum target, GLenum attachment, GLuint renderbuffer) { struct gl_framebuffer *fb; - struct gl_renderbuffer *rb; GET_CURRENT_CONTEXT(ctx); fb = get_framebuffer_target(ctx, target); @@ -3718,26 +3731,8 @@ _mesa_FramebufferRenderbuffer(GLenum target, GLenum attachment, return; } - if (renderbuffertarget != GL_RENDERBUFFER) { - _mesa_error(ctx, GL_INVALID_ENUM, - "glFramebufferRenderbuffer(renderbuffertarget is not " - "GL_RENDERBUFFER)"); - return; - } - - if (renderbuffer) { - rb = _mesa_lookup_renderbuffer_err(ctx, renderbuffer, - "glFramebufferRenderbuffer"); - if (!rb) - return; - } - else { - /* remove renderbuffer attachment */ - rb = NULL; - } - - framebuffer_renderbuffer(ctx, fb, attachment, rb, -"glFramebufferRenderbuffer"); + framebuffer_renderbuffer(ctx, fb, attachment, renderbuffertarget, +renderbuffer, "glFramebufferRenderbuffer"); } @@ -3747,7 +3742,6 @@ _mesa_NamedFramebufferRenderbuffer(GLuint framebuffer, GLenum attachment, GLuint renderbuffer) { struct gl_framebuffer *fb; - struct gl_renderbuffer *rb; GET_CURRENT_CONTEXT(ctx); fb = _mesa_lookup_framebuffer_err(ctx, framebuffer, @@ -3755,26 +3749,8 @@ _mesa_NamedFramebufferRenderbuffer(GLuint framebuffer, GLenum attachment, if (!fb) return; - if (renderbuffertarget != GL_RENDERBUFFER) { - _mesa_error(ctx, GL_INVALID_ENUM, - "glNamedFramebufferRenderbuffer(renderbuffertarget is not " - "GL_RENDERBUFFER)"); - return; - } - - if (renderbuffer) { - rb = _mesa_lookup_renderbuffer_err(ctx, renderbuffer, - "glNamedFramebufferRenderbuffer"); - if (!rb) - return; - } - else { - /* remove renderbuffer attachment */ - rb = NULL; - } - - framebuffer_renderbuffer(ctx, fb, attachment, rb, -"glNamedFramebufferRenderbuffer"); + framebuffer_renderbuffer(ctx, fb, attachment, renderbuffertarget, +renderbuffer, "glNamedFramebufferRenderbuffer"); } -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 05/10] mesa: inline remove_array_object()
No need to check if ID is not 0 because _mesa_lookup_vao() already prevents this to happen. Signed-off-by: Samuel Pitoiset--- src/mesa/main/arrayobj.c | 16 +--- 1 file changed, 1 insertion(+), 15 deletions(-) diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c index ce0050ace3..5f6450a042 100644 --- a/src/mesa/main/arrayobj.c +++ b/src/mesa/main/arrayobj.c @@ -311,20 +311,6 @@ save_array_object(struct gl_context *ctx, struct gl_vertex_array_object *vao) /** - * Remove the given array object from the array object pool. - * Do not deallocate the array object though. - */ -static void -remove_array_object(struct gl_context *ctx, struct gl_vertex_array_object *vao) -{ - if (vao->Name > 0) { - /* remove from hash table */ - _mesa_HashRemoveLocked(ctx->Array.Objects, vao->Name); - } -} - - -/** * Updates the derived gl_vertex_arrays when a gl_vertex_attrib_array * or a gl_vertex_buffer_binding has changed. */ @@ -504,7 +490,7 @@ _mesa_DeleteVertexArrays(GLsizei n, const GLuint *ids) _mesa_BindVertexArray(0); /* The ID is immediately freed for re-use */ - remove_array_object(ctx, obj); + _mesa_HashRemoveLocked(ctx->Array.Objects, obj->Name); if (ctx->Array.LastLookedUpVAO == obj) _mesa_reference_vao(ctx, >Array.LastLookedUpVAO, NULL); -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 04/10] mesa: tidy up _mesa_DeleteVertexArrays()
Signed-off-by: Samuel Pitoiset--- src/mesa/main/arrayobj.c | 25 - 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c index 17a789f4da..ce0050ace3 100644 --- a/src/mesa/main/arrayobj.c +++ b/src/mesa/main/arrayobj.c @@ -493,19 +493,18 @@ _mesa_DeleteVertexArrays(GLsizei n, const GLuint *ids) for (i = 0; i < n; i++) { struct gl_vertex_array_object *obj = _mesa_lookup_vao(ctx, ids[i]); - if ( obj != NULL ) { -assert( obj->Name == ids[i] ); - -/* If the array object is currently bound, the spec says "the binding - * for that object reverts to zero and the default vertex array - * becomes current." - */ -if ( obj == ctx->Array.VAO ) { - _mesa_BindVertexArray(0); -} - -/* The ID is immediately freed for re-use */ -remove_array_object(ctx, obj); + if (obj) { + assert(obj->Name == ids[i]); + + /* If the array object is currently bound, the spec says "the binding + * for that object reverts to zero and the default vertex array + * becomes current." + */ + if (obj == ctx->Array.VAO) +_mesa_BindVertexArray(0); + + /* The ID is immediately freed for re-use */ + remove_array_object(ctx, obj); if (ctx->Array.LastLookedUpVAO == obj) _mesa_reference_vao(ctx, >Array.LastLookedUpVAO, NULL); -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 08/10] mesa: remove one extra check in _mesa_DeleteTextures()
Already checked above. Signed-off-by: Samuel Pitoiset--- src/mesa/main/texobj.c | 5 - 1 file changed, 5 deletions(-) diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c index 156a122ac0..e52ad22645 100644 --- a/src/mesa/main/texobj.c +++ b/src/mesa/main/texobj.c @@ -1463,11 +1463,6 @@ _mesa_DeleteTextures( GLsizei n, const GLuint *textures) FLUSH_VERTICES(ctx, 0); /* too complex */ - if (n < 0) { - _mesa_error(ctx, GL_INVALID_VALUE, "glDeleteTextures(n)"); - return; - } - if (!textures) return; -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 10/10] mesa: remove useless assert in _mesa_TextureView()
Already checked in _mesa_choose_texture_format(). Signed-off-by: Samuel Pitoiset--- src/mesa/main/textureview.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/mesa/main/textureview.c b/src/mesa/main/textureview.c index ed66c17958..cef9caff41 100644 --- a/src/mesa/main/textureview.c +++ b/src/mesa/main/textureview.c @@ -633,7 +633,6 @@ _mesa_TextureView(GLuint texture, GLenum target, GLuint origtexture, texFormat = _mesa_choose_texture_format(ctx, texObj, target, 0, internalformat, GL_NONE, GL_NONE); - assert(texFormat != MESA_FORMAT_NONE); if (texFormat == MESA_FORMAT_NONE) return; newViewNumLevels = MIN2(numlevels, origTexObj->NumLevels - minlevel); -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 03/10] mesa: remove useless assert in texture_storage()
Already checked in _mesa_choose_texture_format(). Signed-off-by: Samuel Pitoiset--- src/mesa/main/texstorage.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/mesa/main/texstorage.c b/src/mesa/main/texstorage.c index 99a169e5ff..ef4fe58f5e 100644 --- a/src/mesa/main/texstorage.c +++ b/src/mesa/main/texstorage.c @@ -406,7 +406,6 @@ texture_storage(struct gl_context *ctx, GLuint dims, texFormat = _mesa_choose_texture_format(ctx, texObj, target, 0, internalformat, GL_NONE, GL_NONE); - assert(texFormat != MESA_FORMAT_NONE); /* check that width, height, depth are legal for the mipmap level */ dimensionsOK = _mesa_legal_texture_dimensions(ctx, target, 0, -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 01/10] mesa: make _mesa_texture_storage() static
Signed-off-by: Samuel Pitoiset--- src/mesa/main/texstorage.c | 24 src/mesa/main/texstorage.h | 7 --- 2 files changed, 12 insertions(+), 19 deletions(-) diff --git a/src/mesa/main/texstorage.c b/src/mesa/main/texstorage.c index 958c7b7a67..7519ca2807 100644 --- a/src/mesa/main/texstorage.c +++ b/src/mesa/main/texstorage.c @@ -386,12 +386,12 @@ tex_storage_error_check(struct gl_context *ctx, * Helper that does the storage allocation for _mesa_TexStorage1/2/3D() * and _mesa_TextureStorage1/2/3D(). */ -void -_mesa_texture_storage(struct gl_context *ctx, GLuint dims, - struct gl_texture_object *texObj, - GLenum target, GLsizei levels, - GLenum internalformat, GLsizei width, - GLsizei height, GLsizei depth, bool dsa) +static void +texture_storage(struct gl_context *ctx, GLuint dims, +struct gl_texture_object *texObj, +GLenum target, GLsizei levels, +GLenum internalformat, GLsizei width, +GLsizei height, GLsizei depth, bool dsa) { GLboolean sizeOK, dimensionsOK; mesa_format texFormat; @@ -480,7 +480,7 @@ texstorage(GLuint dims, GLenum target, GLsizei levels, GLenum internalformat, struct gl_texture_object *texObj; GET_CURRENT_CONTEXT(ctx); - /* Check target. This is done here so that _mesa_texture_storage + /* Check target. This is done here so that texture_storage * can receive unsized formats. */ if (!legal_texobj_target(ctx, dims, target)) { @@ -509,8 +509,8 @@ texstorage(GLuint dims, GLenum target, GLsizei levels, GLenum internalformat, if (!texObj) return; - _mesa_texture_storage(ctx, dims, texObj, target, levels, - internalformat, width, height, depth, false); + texture_storage(ctx, dims, texObj, target, levels, + internalformat, width, height, depth, false); } @@ -543,7 +543,7 @@ texturestorage(GLuint dims, GLuint texture, GLsizei levels, if (!texObj) return; - /* Check target. This is done here so that _mesa_texture_storage + /* Check target. This is done here so that texture_storage * can receive unsized formats. */ if (!legal_texobj_target(ctx, dims, texObj->Target)) { @@ -553,8 +553,8 @@ texturestorage(GLuint dims, GLuint texture, GLsizei levels, return; } - _mesa_texture_storage(ctx, dims, texObj, texObj->Target, - levels, internalformat, width, height, depth, true); + texture_storage(ctx, dims, texObj, texObj->Target, + levels, internalformat, width, height, depth, true); } diff --git a/src/mesa/main/texstorage.h b/src/mesa/main/texstorage.h index e80a9ff5b9..526c61e851 100644 --- a/src/mesa/main/texstorage.h +++ b/src/mesa/main/texstorage.h @@ -31,13 +31,6 @@ */ /*@{*/ -extern void -_mesa_texture_storage(struct gl_context *ctx, GLuint dims, - struct gl_texture_object *texObj, - GLenum target, GLsizei levels, - GLenum internalformat, GLsizei width, - GLsizei height, GLsizei depth, bool dsa); - /** * Texture width, height and depth check shared with the * multisample variants of TexStorage functions. -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 07/10] mesa: make _mesa_generate_texture_mipmap() static
Signed-off-by: Samuel Pitoiset--- src/mesa/main/genmipmap.c | 12 ++-- src/mesa/main/genmipmap.h | 4 2 files changed, 6 insertions(+), 10 deletions(-) diff --git a/src/mesa/main/genmipmap.c b/src/mesa/main/genmipmap.c index 6021c026f5..be49136aa8 100644 --- a/src/mesa/main/genmipmap.c +++ b/src/mesa/main/genmipmap.c @@ -107,10 +107,10 @@ _mesa_is_valid_generate_texture_mipmap_internalformat(struct gl_context *ctx, * Implements glGenerateMipmap and glGenerateTextureMipmap. * Generates all the mipmap levels below the base level. */ -void -_mesa_generate_texture_mipmap(struct gl_context *ctx, - struct gl_texture_object *texObj, GLenum target, - bool dsa) +static void +generate_texture_mipmap(struct gl_context *ctx, +struct gl_texture_object *texObj, GLenum target, +bool dsa) { struct gl_texture_image *srcImage; const char *suffix = dsa ? "Texture" : ""; @@ -187,7 +187,7 @@ _mesa_GenerateMipmap(GLenum target) if (!texObj) return; - _mesa_generate_texture_mipmap(ctx, texObj, target, false); + generate_texture_mipmap(ctx, texObj, target, false); } /** @@ -209,5 +209,5 @@ _mesa_GenerateTextureMipmap(GLuint texture) return; } - _mesa_generate_texture_mipmap(ctx, texObj, texObj->Target, true); + generate_texture_mipmap(ctx, texObj, texObj->Target, true); } diff --git a/src/mesa/main/genmipmap.h b/src/mesa/main/genmipmap.h index 40b7f3636a..94f7f7a680 100644 --- a/src/mesa/main/genmipmap.h +++ b/src/mesa/main/genmipmap.h @@ -28,10 +28,6 @@ #include "glheader.h" -extern void -_mesa_generate_texture_mipmap(struct gl_context *ctx, - struct gl_texture_object *texObj, GLenum target, - bool dsa); bool _mesa_is_valid_generate_texture_mipmap_target(struct gl_context *ctx, GLenum target); -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 06/10] mesa: inline save_array_object()
No need to check if ID is not 0 because _mesa_HashFindFreeKeyBlock() can't generate this value. Signed-off-by: Samuel Pitoiset--- src/mesa/main/arrayobj.c | 15 +-- 1 file changed, 1 insertion(+), 14 deletions(-) diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c index 5f6450a042..6e231156fa 100644 --- a/src/mesa/main/arrayobj.c +++ b/src/mesa/main/arrayobj.c @@ -298,19 +298,6 @@ _mesa_initialize_vao(struct gl_context *ctx, /** - * Add the given array object to the array object pool. - */ -static void -save_array_object(struct gl_context *ctx, struct gl_vertex_array_object *vao) -{ - if (vao->Name > 0) { - /* insert into hash table */ - _mesa_HashInsertLocked(ctx->Array.Objects, vao->Name, vao); - } -} - - -/** * Updates the derived gl_vertex_arrays when a gl_vertex_attrib_array * or a gl_vertex_buffer_binding has changed. */ @@ -546,7 +533,7 @@ gen_vertex_arrays(struct gl_context *ctx, GLsizei n, GLuint *arrays, return; } obj->EverBound = create; - save_array_object(ctx, obj); + _mesa_HashInsertLocked(ctx->Array.Objects, obj->Name, obj); arrays[i] = first + i; } } -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/32] i965/blorp: Do flushes around depth resolves
On Wed, Jul 19, 2017 at 02:01:29PM -0700, Jason Ekstrand wrote: > It turns out that if you have rendering in-flight with CCS_E enabled and > you go to do a depth resolve without flushing, the CCS data may never > hit the memory. > --- > src/mesa/drivers/dri/i965/brw_blorp.c | 150 > -- > 1 file changed, 72 insertions(+), 78 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c > b/src/mesa/drivers/dri/i965/brw_blorp.c > index 5335fae..efa3b39 100644 > --- a/src/mesa/drivers/dri/i965/brw_blorp.c > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c > @@ -1079,51 +1079,48 @@ intel_hiz_exec(struct brw_context *brw, struct > intel_mipmap_tree *mt, > __func__, opname, mt, level, start_layer, start_layer + num_layers - > 1); > > /* The following stalls and flushes are only documented to be required for > -* HiZ clear operations. However, they also seem to be required for the > -* HiZ resolve operation which is basically the same as a fast clear only > a > -* different value is written into the HiZ surface. > +* HiZ clear operations. However, they also seem to be required for > +* resolve operations. How would feel putting some of the rational in the commit message here? Sounds valuable. > */ > - if (op == BLORP_HIZ_OP_DEPTH_CLEAR || op == BLORP_HIZ_OP_HIZ_RESOLVE) { > - if (brw->gen == 6) { > - /* From the Sandy Bridge PRM, volume 2 part 1, page 313: > - * > - * "If other rendering operations have preceded this clear, a > - * PIPE_CONTROL with write cache flush enabled and Z-inhibit > - * disabled must be issued before the rectangle primitive used for > - * the depth buffer clear operation. > - */ > - brw_emit_pipe_control_flush(brw, > - PIPE_CONTROL_RENDER_TARGET_FLUSH | > - PIPE_CONTROL_DEPTH_CACHE_FLUSH | > - PIPE_CONTROL_CS_STALL); > - } else if (brw->gen >= 7) { > - /* > - * From the Ivybridge PRM, volume 2, "Depth Buffer Clear": > - * > - * If other rendering operations have preceded this clear, a > - * PIPE_CONTROL with depth cache flush enabled, Depth Stall bit > - * enabled must be issued before the rectangle primitive used for > - * the depth buffer clear operation. > - * > - * Same applies for Gen8 and Gen9. > - * > - * In addition, from the Ivybridge PRM, volume 2, 1.10.4.1 > - * PIPE_CONTROL, Depth Cache Flush Enable: > - * > - * This bit must not be set when Depth Stall Enable bit is set in > - * this packet. > - * > - * This is confirmed to hold for real, HSW gets immediate gpu hangs. > - * > - * Therefore issue two pipe control flushes, one for cache flush and > - * another for depth stall. > - */ > - brw_emit_pipe_control_flush(brw, > - PIPE_CONTROL_DEPTH_CACHE_FLUSH | > - PIPE_CONTROL_CS_STALL); > + if (brw->gen == 6) { > + /* From the Sandy Bridge PRM, volume 2 part 1, page 313: > + * > + * "If other rendering operations have preceded this clear, a > + * PIPE_CONTROL with write cache flush enabled and Z-inhibit > + * disabled must be issued before the rectangle primitive used for > + * the depth buffer clear operation. > + */ > + brw_emit_pipe_control_flush(brw, > + PIPE_CONTROL_RENDER_TARGET_FLUSH | > + PIPE_CONTROL_DEPTH_CACHE_FLUSH | > + PIPE_CONTROL_CS_STALL); > + } else if (brw->gen >= 7) { > + /* > + * From the Ivybridge PRM, volume 2, "Depth Buffer Clear": > + * > + * If other rendering operations have preceded this clear, a > + * PIPE_CONTROL with depth cache flush enabled, Depth Stall bit > + * enabled must be issued before the rectangle primitive used for > + * the depth buffer clear operation. > + * > + * Same applies for Gen8 and Gen9. > + * > + * In addition, from the Ivybridge PRM, volume 2, 1.10.4.1 > + * PIPE_CONTROL, Depth Cache Flush Enable: > + * > + * This bit must not be set when Depth Stall Enable bit is set in > + * this packet. > + * > + * This is confirmed to hold for real, HSW gets immediate gpu hangs. > + * > + * Therefore issue two pipe control flushes, one for cache flush and > + * another for depth stall. > + */ > + brw_emit_pipe_control_flush(brw, > + PIPE_CONTROL_DEPTH_CACHE_FLUSH | > + PIPE_CONTROL_CS_STALL); > > -