Re: [Mesa-dev] [PATCH 1/6] c11: add c11 compatibility wrapper around stdlib.h
On 07/03/15 19:38, Emil Velikov wrote: On 07/03/15 07:23, Jose Fonseca wrote: ... we still didn't eliminate the use of non-portable _MTX_INITIALIZER_NP from Mesa tree gave me pause. The only way I can think about resolving this, is to use call_once() to initialize the mutex, Yes, I'm afraid so. Static initialization of is much simpler than the call_once paradigm, but unfortunately ISO didn't standardize a static initializer for ISO C11 threads.h, and we can't tell whether it will be possible to somehow to provide _MTX_INITIALIZER_NP on top of future C11 threads.h implementations that we might care about. In particular, it is possible that GLIBC's C11 threads.h implementation of mutex will not be just a wrapper around pthreads: https://sourceware.org/ml/libc-alpha/2014-02/msg00782.html https://sourceware.org/bugzilla/show_bug.cgi?id=14092 And if mutexes become opaque binary structures, then it will really be impossible. Maybe that's why ISO didn't do it -- precisely so that C11 mutexes internal representation doesn't need to be part of the ABI, only its size is, and standard lib is free to change its internals, withing that size. but I'm not 100% sure if T2 will sync until T1 call_once's func has returned. How does it sound ? I believe those are precisely the semantics of call_once. ... We can can consider move the c99_foo.h/c11_foo.h them somewhere else (another subdirectory, or util) or renaming them (like u_foo.h). I have no objection on moving the file, but please keep the file name in some form that makes it obvious about the spec compat/wrapping it provides. FYI I'm contemplating on about adding a final wrapper - c99_string.h. It should nuke nearly all of the remaining compiler abstraction that we have around - mapi, egl, gallium, mesa, glsl... -Emil Sounds good to me. I can build-test them if Brian's not available. I wonder if we should remove src/gallium/auxiliary/util/u_snprintf.c too. This implementation of snprintf was necessary when we needed to build gallium for XP kernel mode, which didn't support floating-point support, but nowadays MSVC's _snprintf suits just as well AFAICT. On the other hand, we should be careful about localization, as using the system's snprintf might easily cause floating/integers to formatted weirdly (extra spaces, commas, or different decimal separators. etc), so it's not safe to use it for things beyond user messages/logs... That is, it might be better to override #define snprintf _unlocalized_snprintf #define asprintf _unlocalized_asprintf [...] and provide our own implementations of all these... Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] Clover: use get_device_vendor instead of get_vendor
Giuseppe Bilotta giuseppe.bilo...@gmail.com writes: The pipe's get_vendor method returns something more akin to a driver vendor string in most cases, instead of the actual device vendor. Use get_device_vendor instead, which was introduced specifically for this purpose. For this patch: Reviewed-by: Francisco Jerez curroje...@riseup.net --- src/gallium/state_trackers/clover/core/device.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/clover/core/device.cpp b/src/gallium/state_trackers/clover/core/device.cpp index c3f3b4e..42b45b7 100644 --- a/src/gallium/state_trackers/clover/core/device.cpp +++ b/src/gallium/state_trackers/clover/core/device.cpp @@ -192,7 +192,7 @@ device::device_name() const { std::string device::vendor_name() const { - return pipe-get_vendor(pipe); + return pipe-get_device_vendor(pipe); } enum pipe_shader_ir -- 2.1.2.766.gaa23a90 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] GSoC 2015 Proposal : Porting Glean tests to piglit
Hello, Thanks for the email. I am currently updating the proposal now. Best, Juliet On Mon, Mar 9, 2015 at 12:32 PM, Emil Velikov emil.l.veli...@gmail.com wrote: On 04/03/15 11:21, Juliet Fru wrote: Hello, Here is my proposal for Adding Porting Glean tests to piglit. I'll like to get your comments and tweaks. Thanks, Juliet Porting Glean tests to Piglit framework OPW Proposal https://docs.google.com/document/d/1Y5flvgsJg5s6XUfTP955qKDbhG0ldXBCjGd-YROtTtM/edit?usp=drive_web Hi Juliet, Thank you for the interest :-) Did you link the correct proposal - this one seems to be for OPW ? Here are some examples [1] [2] [3] [4], which you might find inspirational. For your next iteration please send your proposal as plain text in the email body. This way people can comment on it directly and their feedback will be visible to everyone on the list. Cheers, Emil P.S. Hope you've seen Martin's email on the topic[5]. [1] http://www.google-melange.com/gsoc/project/details/google/gsoc2011/steckdenis/5899781826150400 [2] http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20130423/91f152db/attachment.text [3] http://www.cs.usm.maine.edu/~noblesmith/gsoc2013proposal-ext-direct-state-access.txt [4] https://hakzsam.wordpress.com/2014/05/15/google-summer-of-code-2014-proposal-for-x-org-foundation/ [5] http://lists.freedesktop.org/archives/dri-devel/2015-March/078543.html ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] autogen.sh: pass --force to autoreconf, quote ORIGDIR
My passing --force autoreconf will update all the aux files, which would otherwise be ignored if one updates autoconf/automake. Quote the ORIGDIR variable to prevent fall-outs, when it's name contains space. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- autogen.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/autogen.sh b/autogen.sh index 626d213..c896097 100755 --- a/autogen.sh +++ b/autogen.sh @@ -6,8 +6,8 @@ test -z $srcdir srcdir=. ORIGDIR=`pwd` cd $srcdir -autoreconf -v --install || exit 1 -cd $ORIGDIR || exit $? +autoreconf --force --verbose --install || exit 1 +cd $ORIGDIR || exit $? if test -z $NOCONFIGURE; then $srcdir/configure $@ -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] GSoC 2015 Proposal : Porting Glean tests to piglit
On 04/03/15 11:21, Juliet Fru wrote: Hello, Here is my proposal for Adding Porting Glean tests to piglit. I'll like to get your comments and tweaks. Thanks, Juliet Porting Glean tests to Piglit framework OPW Proposal https://docs.google.com/document/d/1Y5flvgsJg5s6XUfTP955qKDbhG0ldXBCjGd-YROtTtM/edit?usp=drive_web Hi Juliet, Thank you for the interest :-) Did you link the correct proposal - this one seems to be for OPW ? Here are some examples [1] [2] [3] [4], which you might find inspirational. For your next iteration please send your proposal as plain text in the email body. This way people can comment on it directly and their feedback will be visible to everyone on the list. Cheers, Emil P.S. Hope you've seen Martin's email on the topic[5]. [1] http://www.google-melange.com/gsoc/project/details/google/gsoc2011/steckdenis/5899781826150400 [2] http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20130423/91f152db/attachment.text [3] http://www.cs.usm.maine.edu/~noblesmith/gsoc2013proposal-ext-direct-state-access.txt [4] https://hakzsam.wordpress.com/2014/05/15/google-summer-of-code-2014-proposal-for-x-org-foundation/ [5] http://lists.freedesktop.org/archives/dri-devel/2015-March/078543.html ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] GSoC 2015 Proposal: Porting Glean tests to piglit.
Hello, Here is my proposal: Porting Glean Tests to Piglit GSoC Proposal Contact Informaion:Names:Achere Juliet F. Forchibe E-mail address: juliet...@gmail.com IRC Nick: Jul13t Mentors: Brian Paul, Laura Ekstrand Project Information Porting Glean Tests to the Piglit framework. Brief Summary, Synopsis This project aims at porting the remaining Glean tests to the piglit framework. The Glean test suite is a set of tools for evaluating the quality of an OpenGL implementation and diagnosing problems that occur. However, due to the lack of flexibility and pragmatism in the Glean test framework, the Piglit test framework was developed. My aim is to successfully port the basic test of GL rendering paths, the Conformance test on ARB_occlusion_query extension tests and the two sided stencil extension test to piglit respectively. According to Brian Paul’s work on glean, the path test will verify that basic, trivial OpenGL paths work, setting up pass and fail conditions for each of alpha test, blending, color mask and others making sure they work as expected. Furthermore, the conformance test on ARB_occlusion_query extension tests define the mechanism whereby an application queries the number of pixels drawn by primitives and the two-sided stencil extension tests where stencil-related state may be different for front and back facing polygons. This will help improve the performance of stenciled shadow volume and CSG rendering algorithms. Benefits, Deliverables and Implementation - Port the basic test of GL rendering paths to piglit. - Port the ARB_occlusion query extension tests to piglit - Port the two sided stencil extension tests to piglit Development Schedule May 25 – June 07 ( ~ 2 weeks) - Study ported glean tests to the piglit framework. - Research on the paths test , two sided stencil tests and the ARB_occlusion_query extension tests together with implementation of piglit test. June 08 – June 27th( 3 weeks) 1st milestone, Tpaths.cpp - Port the paths test to piglit, reworking tpaths.h, tpaths.cpp to suit piglit framework. - Testing of tests on piglit and debugging. - Check program with valgrind for memory leaks/errors June 28th – July 5th( 1 week) MID-TERM EVALUATION - Finish clean up of test, documentation and moving to next milestone. - Submit code to piglit’s git repo. July 6th - July 19 ( 2 weeks) 2nd Milestone. - Port the Conformance test on ARB_occlusion_query extension to piglit framework. - Test program with one or more OpenGL drivers. - Check program for memory leaks/errors. July 20 - July 27 ( 1 week) - Post patch for review on the piglit mailing list. - Incorporate review feedback into code and submit code to piglit’s git repo. July 28 – August 16 ( 3 weeks) 3rd milestone - Port the two sided stencil test to piglit - Test program with one or more OpenGL drivers Aug 21 - Aug 28( 1 week) FINAL EVALUATION WEEK. - Check program for memory leaks/errors. - post patch to mailing list for review, correct errors and submit code. Related Work Laura Ekstrand: http://cgit.freedesktop.org/~ldeks/piglit/ Thanks, Juliet ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] Revert common: Fix PBOs for 1D_ARRAY.
Hi Emil, The resolve looks good, however I think it would also make sense to cherry pick a44606 to the stable branch. It doesn't do any harm either way but it should be slightly faster and cleaner with that patch as well. Regards, - Neil Emil Velikov emil.l.veli...@gmail.com writes: On 4 March 2015 at 23:15, Emil Velikov emil.l.veli...@gmail.com wrote: On 4 March 2015 at 17:22, Neil Roberts n...@linux.intel.com wrote: This reverts commit 546aba143d13ba3f993ead4cc30b2404abfc0202. I think the changes to the calls to glBlitFramebuffer from this patch are no different to what it was doing previously because it used to set height to 1 before doing the blits. However it was introducing some problems with the blit for layer 0 because this was no longer special cased. It didn't fix problems with the yoffset which needs to be interpreted as a slice offset. I think a better solution would be to modify the original if statement to cope with the yoffset. Neil, if others agree with this revert can you cc stable. Seems that the offending commit already has the tag, plus I've already picked it up :-\ Hi Neil, There was a conflict while picking this for 10.5 due to commit a44606eb816(meta: In pbo_{Get,}TexSubImage don't repeatedly rebind the source tex). I've resolved it as follows, and I'm planning to commit it around Tuesday lunchtime. Do let me know if you have any comments. Thanks Emil diff --cc src/mesa/drivers/common/meta_tex_subimage.c index 9f0c115,1fef79d..000 --- a/src/mesa/drivers/common/meta_tex_subimage.c +++ b/src/mesa/drivers/common/meta_tex_subimage.c @@@ -213,12 -229,7 +219,9 @@@ _mesa_meta_pbo_TexSubImage(struct gl_co GL_COLOR_BUFFER_BIT, GL_NEAREST)) goto fail; -iters = tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY ? -height : depth; - -for (z = 1; z iters; z++) { +for (z = 1; z depth; z++) { + _mesa_meta_bind_fbo_image(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, +pbo_tex_image, z); _mesa_meta_bind_fbo_image(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, tex_image, zoffset + z); @@@ -342,16 -352,9 +344,11 @@@ _mesa_meta_pbo_GetTexSubImage(struct gl GL_COLOR_BUFFER_BIT, GL_NEAREST)) goto fail; -if (tex_image tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY) - iters = height; -else - iters = depth; - -for (z = 1; z iters; z++) { +for (z = 1; z depth; z++) { _mesa_meta_bind_fbo_image(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, tex_image, zoffset + z); + _mesa_meta_bind_fbo_image(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, +pbo_tex_image, z); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Do not render primitives in non-zero streams then TF is disabled
Haswell hardware seems to ignore Render Stream Select bits from 3DSTATE_STREAMOUT packet when the SOL stage is disabled even if the PRM says otherwise. Because of this, all primitives are sent down the pipeline for rasterization, which is wrong. If SOL is enabled, Render Stream Select is honored and primitives bound to non-zero streams are discarded after stream output. Since the only purpose of primives sent to non-zero streams is to be recorded by transform feedback, we can simply discard all geometry bound to non-zero streams then transform feedback is disabled to prevent it from ever reaching the rasterization stage. Notice that this patch introduces a small change in the behavior we get when a geometry shader emits more vertices than the maximum declared: before, a vertex that was emitted to a non-zero stream when TF was disabled would still count for the purposes of checking that we don't exceed the maximum number of output vertices declared by the shader. With this change, these vertices are completely ignored and won't increase the output vertex count, making more room for other (hopefully more useful) vertices. Fixes piglit test arb_gpu_shader5-emitstreamvertex_nodraw in Haswell. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83962 --- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 16 1 file changed, 16 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index 2002ffd..01d8073 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -476,6 +476,22 @@ vec4_gs_visitor::visit(ir_emit_vertex *ir) { this-current_annotation = emit vertex: safety check; + /* Haswell hardware seems to ignore Render Stream Select bits from +* 3DSTATE_STREAMOUT packet when the SOL stage is disabled even if +* the PRM says otherwise. Because of this, all primitives are sent +* down the pipeline for rasterization, which is wrong. If SOL is +* enabled, Render Stream Select is honored and primitives bound to +* non-zero streams are discarded after stream output. +* +* Since the only purpose of primives sent to non-zero streams is to +* be recorded by transform feedback, we can simply discard all geometry +* bound to these streams when transform feedback is disabled. +*/ + if (brw-is_haswell + ir-stream_id() 0 + this-shader_prog-TransformFeedback.NumVarying == 0) + return; + /* To ensure that we don't output more vertices than the shader specified * using max_vertices, do the logic inside a conditional of the form if * (vertex_count MAX) -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] mesa: Simplify some tests in update_array_format()
On Thursday 05 March 2015, Ian Romanick wrote: On 03/05/2015 10:56 AM, Fredrik Höglund wrote: There is no need to check if these extensions are supported here; if the data type is not supported, we will already have returned a GL_INVALID_ENUM error. From where would GL_INVALID_ENUM have been generated? Is that the (typeBit legalTypesMask) == 0x0 check? Yeah. Do we have any piglit tests that verify this? We'd have to find some driver that doesn't support these extensions to try it... I've written a new more extensive piglit test for ARB_dsa that verifies it. The third patch in this series actually fixes a bug uncovered by that test. --- src/mesa/main/varray.c | 20 +--- 1 file changed, 5 insertions(+), 15 deletions(-) diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c index 42e7f89..efc1431 100644 --- a/src/mesa/main/varray.c +++ b/src/mesa/main/varray.c @@ -301,17 +301,9 @@ update_array_format(struct gl_context *ctx, *... *• size is BGRA and normalized is FALSE; */ - bool bgra_error = false; - - if (ctx-Extensions.ARB_vertex_type_2_10_10_10_rev) { - if (type != GL_UNSIGNED_INT_2_10_10_10_REV - type != GL_INT_2_10_10_10_REV - type != GL_UNSIGNED_BYTE) -bgra_error = true; - } else if (type != GL_UNSIGNED_BYTE) - bgra_error = true; - - if (bgra_error) { + if (type != GL_UNSIGNED_INT_2_10_10_10_REV + type != GL_INT_2_10_10_10_REV + type != GL_UNSIGNED_BYTE) { _mesa_error(ctx, GL_INVALID_OPERATION, %s(size=GL_BGRA and type=%s), func, _mesa_lookup_enum_by_nr(type)); return false; @@ -331,8 +323,7 @@ update_array_format(struct gl_context *ctx, return false; } - if (ctx-Extensions.ARB_vertex_type_2_10_10_10_rev - (type == GL_UNSIGNED_INT_2_10_10_10_REV || + if ((type == GL_UNSIGNED_INT_2_10_10_10_REV || type == GL_INT_2_10_10_10_REV) size != 4) { _mesa_error(ctx, GL_INVALID_OPERATION, %s(size=%d), func, size); return false; @@ -351,8 +342,7 @@ update_array_format(struct gl_context *ctx, return false; } - if (ctx-Extensions.ARB_vertex_type_10f_11f_11f_rev - type == GL_UNSIGNED_INT_10F_11F_11F_REV size != 3) { + if (type == GL_UNSIGNED_INT_10F_11F_11F_REV size != 3) { _mesa_error(ctx, GL_INVALID_OPERATION, %s(size=%d), func, size); return false; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] r300g: Fix the ATI1N swizzle.
This fixes the GL_COMPRESSED_RED_RGTC1 part of piglit's rgtc-teximage-01 test as well as the precision part of Wine's 3dc format test (fd.o bug 89156). The Z component seems to contain a lower precision version of the result, probably a temporary value from the decompression computation. The Y and W component contain different data that depends on the input values as well, but I could not make sense of them (Not that I tried very hard). GL_COMPRESSED_SIGNED_RED_RGTC1 still seems to have precision problems in piglit, and both formats are affected by a compiler bug if they're sampled by the shader with a swizzle other than .xyzw. Wine uses ., which returns random garbage. --- src/gallium/drivers/r300/r300_texture.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/r300/r300_texture.c b/src/gallium/drivers/r300/r300_texture.c index ffe8c00..340b8fc 100644 --- a/src/gallium/drivers/r300/r300_texture.c +++ b/src/gallium/drivers/r300/r300_texture.c @@ -176,7 +176,9 @@ uint32_t r300_translate_texformat(enum pipe_format format, format != PIPE_FORMAT_RGTC2_UNORM format != PIPE_FORMAT_RGTC2_SNORM format != PIPE_FORMAT_LATC2_UNORM -format != PIPE_FORMAT_LATC2_SNORM) { +format != PIPE_FORMAT_LATC2_SNORM +format != PIPE_FORMAT_RGTC1_UNORM +format != PIPE_FORMAT_LATC1_UNORM) { result |= r300_get_swizzle_combined(desc-swizzle, swizzle_view, TRUE); } else { -- 2.0.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Silence GCC maybe-uninitialized warning.
On Fri, Mar 6, 2015 at 10:10 PM, Vinson Lee v...@freedesktop.org wrote: brw_shader.cpp: In function ‘bool brw_saturate_immediate(brw_reg_type, brw_reg*)’: brw_shader.cpp:618:31: warning: ‘sat_imm.brw_saturate_immediate(brw_reg_type, brw_reg*)::anonymous union::ud’ may be used uninitialized in this function [-Wmaybe-uninitialized] reg-dw1.ud = sat_imm.ud; ^ Signed-off-by: Vinson Lee v...@freedesktop.org --- src/mesa/drivers/dri/i965/brw_shader.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index f2b4d82..ff0ef4b 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -584,7 +584,7 @@ brw_saturate_immediate(enum brw_reg_type type, struct brw_reg *reg) unsigned ud; int d; float f; - } imm = { reg-dw1.ud }, sat_imm; + } imm = { reg-dw1.ud }, sat_imm = { 0 }; switch (type) { case BRW_REGISTER_TYPE_UD: -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev Reviewed-by: Anuj Phogat anuj.pho...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i915: Fix GCC unused-but-set-variable warning in release build.
On Fri, Mar 6, 2015 at 9:56 PM, Vinson Lee v...@freedesktop.org wrote: i915_fragprog.c: In function ‘i915ValidateFragmentProgram’: i915_fragprog.c:1453:11: warning: variable ‘k’ set but not used [-Wunused-but-set-variable] int k; ^ Signed-off-by: Vinson Lee v...@freedesktop.org --- src/mesa/drivers/dri/i915/i915_fragprog.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i915/i915_fragprog.c b/src/mesa/drivers/dri/i915/i915_fragprog.c index d42da5a..9b00223 100644 --- a/src/mesa/drivers/dri/i915/i915_fragprog.c +++ b/src/mesa/drivers/dri/i915/i915_fragprog.c @@ -1450,8 +1450,6 @@ i915ValidateFragmentProgram(struct i915_context *i915) if (s2 != i915-state.Ctx[I915_CTXREG_LIS2] || s4 != i915-state.Ctx[I915_CTXREG_LIS4]) { - int k; - I915_STATECHANGE(i915, I915_UPLOAD_CTX); /* Must do this *after* statechange, so as not to affect @@ -1471,8 +1469,7 @@ i915ValidateFragmentProgram(struct i915_context *i915) i915-state.Ctx[I915_CTXREG_LIS2] = s2; i915-state.Ctx[I915_CTXREG_LIS4] = s4; - k = intel-vtbl.check_vertex_size(intel, intel-vertex_size); - assert(k); + assert(intel-vtbl.check_vertex_size(intel, intel-vertex_size)); } if (!p-params_uptodate) -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev Reviewed-by: Anuj Phogat anuj.pho...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r300g: Fix the ATI1N swizzle.
On Mon, Mar 9, 2015 at 12:26 PM, Stefan Dösinger stefandoesin...@gmx.at wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Am 2015-03-09 um 17:19 schrieb Ilia Mirkin: It also has the additional problem that it doesn't do the swizzle workaround which apparently is necessary even for single-component textures. Do you mean the change I made in my patch? That part works just fine for the signed one component value. It seemed that a workaround for S3TC values was incorrectly applied to RGTC values, but I could not find information about that in the GPU docs. Maybe the if condition I changed should be modified to whitelist formats rather than blacklist. No, I meant the SNORM formats. It does the snorm-from-unorm fixup, but it doesn't do the swizzle fixup (which seems like it'd be required given your findings about unorm). -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r300g: Fix the ATI1N swizzle.
On Mon, Mar 9, 2015 at 11:15 AM, Stefan Dösinger stefandoesin...@gmx.at wrote: This fixes the GL_COMPRESSED_RED_RGTC1 part of piglit's rgtc-teximage-01 test as well as the precision part of Wine's 3dc format test (fd.o bug 89156). This is often identified in the commit message with Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89156 The Z component seems to contain a lower precision version of the result, probably a temporary value from the decompression computation. The Y and W component contain different data that depends on the input values as well, but I could not make sense of them (Not that I tried very hard). GL_COMPRESSED_SIGNED_RED_RGTC1 still seems to have precision problems in piglit, and both formats are affected by a compiler bug if they're I don't suppose you've tried adding RGTC1_SNORM/LATC1_SNORM into that condition? sampled by the shader with a swizzle other than .xyzw. Wine uses ., which returns random garbage. --- src/gallium/drivers/r300/r300_texture.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/r300/r300_texture.c b/src/gallium/drivers/r300/r300_texture.c index ffe8c00..340b8fc 100644 --- a/src/gallium/drivers/r300/r300_texture.c +++ b/src/gallium/drivers/r300/r300_texture.c @@ -176,7 +176,9 @@ uint32_t r300_translate_texformat(enum pipe_format format, format != PIPE_FORMAT_RGTC2_UNORM format != PIPE_FORMAT_RGTC2_SNORM format != PIPE_FORMAT_LATC2_UNORM -format != PIPE_FORMAT_LATC2_SNORM) { +format != PIPE_FORMAT_LATC2_SNORM +format != PIPE_FORMAT_RGTC1_UNORM +format != PIPE_FORMAT_LATC1_UNORM) { result |= r300_get_swizzle_combined(desc-swizzle, swizzle_view, TRUE); } else { -- 2.0.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] main: Cosmetic changes for Texture Buffers.
On Wed, Mar 4, 2015 at 4:41 PM, Laura Ekstrand la...@jlekstrand.net wrote: Adds a useful comment and some whitespace. Fixes an error message. v2: Review from Anuj Phogat - Split rebase of Tex[ture]Buffer[Range] --- src/mesa/main/teximage.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 706c76b..22574bd 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -5354,6 +5354,14 @@ _mesa_TexBufferRange(GLenum target, GLenum internalFormat, GLuint buffer, buffer); return; } else { + + /* OpenGL 4.5 core spec (02.02.2015) says in Section 8.9 Buffer + * Textures (PDF page 254): + *If buffer is zero, then any buffer object attached to the buffer + *texture is detached, the values offset and size are ignored and + *the state for offset and size for the buffer texture are reset to + *zero. + */ offset = 0; size = 0; } @@ -5382,8 +5390,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj = NULL; /* Get the texture object by Name. */ - texObj = _mesa_lookup_texture_err(ctx, texture, - glTextureBuffer(texture)); + texObj = _mesa_lookup_texture_err(ctx, texture, glTextureBuffer); if (!texObj) return; @@ -5396,6 +5403,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj, 0, buffer ? -1 : 0, glTextureBuffer); } + static GLboolean is_renderable_texture_format(struct gl_context *ctx, GLenum internalformat) { This hunk is unnecessary. -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r300g: Fix the ATI1N swizzle.
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi, Thanks for the quick feedback! Am 2015-03-09 um 16:20 schrieb Ilia Mirkin: I don't suppose you've tried adding RGTC1_SNORM/LATC1_SNORM into that condition? No, because the codepath isn't entered for them at all. There's an if(format != RGTC1_SNORM format != LATC1_SNORM) around this block. I did test if LATC_UNORM and LATC_SNORM still work after my fix. LATC_SNORM is unchanged (broken in the same way as RGTC_SNORM) and LATC_UNORM now has the proper precision like RGTC_UNORM. Stefan -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQIcBAEBAgAGBQJU/cHrAAoJEN0/YqbEcdMwhHcP/0AZDFeAMMK9SGNMoEvmykk/ uDP0Yvlcx1PkRqIhLj1ZXpt4t2rWGEsuYIdaWryIk6qt3na/gl00Arp2TdPWaAEn ab7uFFw4h4SjPg8qdE+3Vfr8WYSyMZmol9eggsHN+OrL5YIf7N1eeYwNXPZKBTZY fRCHT3DTGQeT8y9rgdOkoJjJ2az6ydJjrgcfMjuDuJwfxWO/Blj5X3nshBEF4bLF 5xKgHgx5qGsTlO2zkcCqCfEd0K3rDZhUkfR1MswUbyeFsohaTf1TRJ+LRHEYPY4W bG2raSn8vLuvmD9rSILLlshElrjQN85nMPAbgErIyC8THF54ZJPA0eRO96x19/t+ l/wC8AlXCrSH5aLR7Y2S6HeY805ed5x8lrqKF97lvSEYbg6OnK28YuBnE/qEDskD 5L07M46auyJr78BYV/ifJ1gBO16Jt43sOODT57cYUq570b7nyqNLbOcuJeGKEY4m egZpfvRqJktge0qunRDqU87WT/B0fV+hx3n/qpT17LUIa9v2cSWmT9acEtyaS/Y+ wCpGOFOID6xA+OO1XnoNkPXeJyBggBscbFtnSvHKAJPaDmTUo7TGmKw400nUJYZ9 d6q685116QxL+PuxvHN0z9bjl0rWJmKax3ZsZ09PnPGxjPlSI27q25+nGjgHFf4v i4jqOtDz55aOPeEsnfQ1 =fMPD -END PGP SIGNATURE- ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r300g: Fix the ATI1N swizzle.
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Am 2015-03-09 um 17:19 schrieb Ilia Mirkin: It also has the additional problem that it doesn't do the swizzle workaround which apparently is necessary even for single-component textures. Do you mean the change I made in my patch? That part works just fine for the signed one component value. It seemed that a workaround for S3TC values was incorrectly applied to RGTC values, but I could not find information about that in the GPU docs. Maybe the if condition I changed should be modified to whitelist formats rather than blacklist. -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQIcBAEBAgAGBQJU/cm9AAoJEN0/YqbEcdMwBrcP/j+hTku7yI7rzCs8VAt0utRv nJvDpcO39ydpIEY7NedzVs5mwPbOqfRHHCqmSkumiuhGFnhYEte3Ywwez9RgHvKG ZzJmlGTiL9WESB3CyBWU7xYiKPlgu7xpP8X6OhtZas86RylCOHb1ahCitQ8JujoX +26u2K9QP4jH9Xkym73mLlwGCSb6Ymnw0IQOsuUick2kCmCBxWOaXIczwNagILyb wz+VSwPRUXVXh2eFl/bEgxktB46pxZ5EQx+1P5reb5RQZ/373raqS1yKeB+u5/EM 6h1Jn24xkm5kQV+fqRzAmypG1WIkW6a4u2owmkaviFta6qWqffP9Z4Pjm7sgjpOu u7G1w30/GNuyb2fuku1SlHBo8CFJPybnacIhASsQGG01OT6z+BxH8CiM/N/d7945 vYpDmV49iBeMpmXDd06H6WKA+8xDO9GDt17+LWBa9qzto8rA2ib1rxkNos9BUhZ8 7xkIxEkRBWBIWTH5ejPZaEvz1Ott/BBffMtb43L7UJFaLdnimKCU0c2s19zuw/ht NaYoBlqEkW3tdPPqIsF5rnkvO91eONZhdnvFLpbZ/Fj4zAyHkyzPMRRISjvllpJV XiqldFUlLyfiH1dt6ruDnQVNAKx3CGrMjjuQBW37+gIAthbRuX56mxQHvg+QrGDo PY6qCecacHJorDMSY/kw =gsHw -END PGP SIGNATURE- ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r300g: Fix the ATI1N swizzle.
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Am 2015-03-09 um 16:53 schrieb Stefan Dösinger: I did test if LATC_UNORM and LATC_SNORM still work after my fix. LATC_SNORM is unchanged (broken in the same way as RGTC_SNORM) and LATC_UNORM now has the proper precision like RGTC_UNORM. I think the reason why _SNORM is broken is that the way the driver tries to emulate them is conceptually broken. It handles them like the _UNORM variant, but afterwards does a red = red 0.5 ? red * 2 - 2 : red * 2. That may work for an uncompressed signed format, but it doesn't take into account the block decompression of RGTC1. For example consider a block that has red1 = 0x7f and red2 = 0x80. As an unsigned value, red2 is the bigger value(127 vs 128). If it is a signed value, red1 is bigger (127 vs -128). The only way this will work is if the driver adds +128 to red1 and red2 for each block before loading (sucks if it's in a PBO) and after sampling does red = red * 2.0 - 1.0. And I'm not even sure if that will work. It will at least break exact zeros. Note that I don't care about the signed RGTC formats. Wine on this card will only ever need the unsigned ones, as they match ATI1N and ATI2N in d3d. -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQIcBAEBAgAGBQJU/cYcAAoJEN0/YqbEcdMwdygQAIlgA+nSg42o/7xAOogeOHeq IUf/3WKz0GVAhlA+wbli4UoxZgALUHf5PxWgshBvEZZgpzoEYAqpfTVLQm3cUXDd p3eQkDyGucttgnt1zC5q1jNjkP2io114XRqcTwbMHmMJRfZavzNRK805UBuy1Jd4 sOuxsRT7cTnI+iCkbfqHBdyHo2OSegJ0FDpj35aGzpsc3cDQH9oW/jchryoQJo6V luf3KvdyP5QQlmglp8I2V28n7uBypwVZywAxKg4jR00IalZXCXt/vj2SFBojDQal z2Xr9xSz0ccoV+yFutbRwe8wuoUwTBpOFrpnZnx1PQ1RoG34plpgdLg5AVn/t8+A 5UJv5Op51y0KmWqDGadlGV+8RlpdDWCCV6dOdDsOJbrS1/gOJj3Z+VI8rdrrb4Tt n/eY45OWoF3E0v73aMRLdr220wjgFWcgv7Je16xiER4oSemkCfSsl3kKD7RNy8rS oNhd6VL18UcefgrssVLptWYI/M7M1Zs6tbtPmh6WIFjFFpikzIjX51UBrIppCIxr YKZQ0HtML5PJ9JtdHJR6kjqCFKpNO5ZMZtY5HFOuSD+SAu0jnYpTqe/X0etrCy9r PAKwuCLaOqciqANkhYQBq9quwvv9HwDa9huI0GxSk5xkEd5J4eTqxeKPAURkGx4Y SBzv2N9WDEDhnGuzYgs9 =8Wzg -END PGP SIGNATURE- ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Add macro for unused function attribute.
On 07/03/15 22:09, Vinson Lee wrote: Suggested-by: Emil Velikov emil.l.veli...@gmail.com Signed-off-by: Vinson Lee v...@freedesktop.org Reviewed-by: Emil Velikov emil.l.veli...@gmail.com Looks great. Thanks. Emil --- configure.ac | 1 + scons/gallium.py | 1 + src/util/macros.h | 6 ++ 3 files changed, 8 insertions(+) diff --git a/configure.ac b/configure.ac index 90c7737..2954f80 100644 --- a/configure.ac +++ b/configure.ac @@ -195,6 +195,7 @@ AX_GCC_FUNC_ATTRIBUTE([flatten]) AX_GCC_FUNC_ATTRIBUTE([format]) AX_GCC_FUNC_ATTRIBUTE([malloc]) AX_GCC_FUNC_ATTRIBUTE([packed]) +AX_GCC_FUNC_ATTRIBUTE([unused]) AM_CONDITIONAL([GEN_ASM_OFFSETS], test x$GEN_ASM_OFFSETS = xyes) diff --git a/scons/gallium.py b/scons/gallium.py index 7533f06..b162089 100755 --- a/scons/gallium.py +++ b/scons/gallium.py @@ -369,6 +369,7 @@ def generate(env): 'HAVE___BUILTIN_FFS', 'HAVE___BUILTIN_FFSLL', 'HAVE_FUNC_ATTRIBUTE_FLATTEN', +'HAVE_FUNC_ATTRIBUTE_UNUSED', # GCC 3.0 'HAVE_FUNC_ATTRIBUTE_FORMAT', 'HAVE_FUNC_ATTRIBUTE_PACKED', diff --git a/src/util/macros.h b/src/util/macros.h index 63daba3..6c7bda7 100644 --- a/src/util/macros.h +++ b/src/util/macros.h @@ -176,5 +176,11 @@ do { \ # endif #endif +#ifdef HAVE_FUNC_ATTRIBUTE_UNUSED +#define UNUSED __attribute__((unused)) +#else +#define UNUSED +#endif + #endif /* UTIL_MACROS_H */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] autogen.sh: pass --force to autoreconf, quote ORIGDIR
On Mon, Mar 9, 2015 at 4:52 AM, Emil Velikov emil.l.veli...@gmail.com wrote: My passing --force autoreconf will update all the aux files, which would s/My/By/ otherwise be ignored if one updates autoconf/automake. Quote the ORIGDIR variable to prevent fall-outs, when it's name contains s/it's/its/ space. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- autogen.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/autogen.sh b/autogen.sh index 626d213..c896097 100755 --- a/autogen.sh +++ b/autogen.sh @@ -6,8 +6,8 @@ test -z $srcdir srcdir=. ORIGDIR=`pwd` cd $srcdir -autoreconf -v --install || exit 1 -cd $ORIGDIR || exit $? +autoreconf --force --verbose --install || exit 1 +cd $ORIGDIR || exit $? if test -z $NOCONFIGURE; then $srcdir/configure $@ -- 2.3.1 Reviewed-by: Matt Turner matts...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 89477] include/no_extern_c.h:47:1: error: template with C linkage
https://bugs.freedesktop.org/show_bug.cgi?id=89477 Mark Janes mark.a.ja...@intel.com changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #3 from Mark Janes mark.a.ja...@intel.com --- This was pushed to master. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/7] v2 of Tex[ture]Buffer[Range] functions
On Wed, Mar 4, 2015 at 4:41 PM, Laura Ekstrand la...@jlekstrand.net wrote: This divides a major rework of Tex[ture]Buffer[Range] into multiple patches as recommended by Anuj Phogat. Laura Ekstrand (7): main: Add utility function _mesa_lookup_bufferobj_err. main: Use _mesa_lookup_bufferobj_err to simplify Tex[ture]Buffer[Range]. main: Refactor _mesa_texture_buffer_range. main: Cosmetic changes for Texture Buffers. main: Add check_texture_buffer_range. main: Add check_texture_buffer_target. main: Add entry point for TextureBufferRange. src/mapi/glapi/gen/ARB_direct_state_access.xml | 8 + src/mesa/main/bufferobj.c | 19 ++ src/mesa/main/bufferobj.h | 4 + src/mesa/main/tests/dispatch_sanity.cpp| 1 + src/mesa/main/teximage.c | 241 ++--- src/mesa/main/teximage.h | 10 +- 6 files changed, 210 insertions(+), 73 deletions(-) -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev Series is: Reviewed-by: Anuj Phogat anuj.pho...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/6] v2 of Compressed Textures Cube Map Support
On Mon, Mar 9, 2015 at 11:09 AM, Anuj Phogat anuj.pho...@gmail.com wrote: On Wed, Mar 4, 2015 at 3:44 PM, Laura Ekstrand la...@jlekstrand.net wrote: This cleans up ARB_direct_state_access texture cube map functions (mostly in response to reviews from Anuj Phogat). Laura Ekstrand (6): main: _mesa_cube_level_complete checks NumLayers. main: Remove redundant NumLayers checks. main: Remove redundant copy of cube map block comment in GetTextureImage. main: assert(texImage) in ARB_DSA texture cube map functions. main: Add TEXTURE_CUBE_MAP support for glCompressedTextureSubImage3D. main: Checking for cube completeness in GetCompressedTextureImage. src/mesa/main/texgetimage.c | 61 --- src/mesa/main/teximage.c| 177 +--- src/mesa/main/teximage.h| 3 +- src/mesa/main/texobj.c | 4 + 4 files changed, 156 insertions(+), 89 deletions(-) -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev Reviewed-by: Anuj Phogat anuj.pho...@gmail.com For the whole series. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r300g: Fix the ATI1N swizzle.
On Mon, Mar 9, 2015 at 12:11 PM, Stefan Dösinger stefandoesin...@gmx.at wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Am 2015-03-09 um 16:53 schrieb Stefan Dösinger: I did test if LATC_UNORM and LATC_SNORM still work after my fix. LATC_SNORM is unchanged (broken in the same way as RGTC_SNORM) and LATC_UNORM now has the proper precision like RGTC_UNORM. I think the reason why _SNORM is broken is that the way the driver tries to emulate them is conceptually broken. It handles them like the _UNORM variant, but afterwards does a red = red 0.5 ? red * 2 - 2 : red * 2. Contrary to the comments, it appears that it actually uses 2.35 instead of 2.0? Haven't done the math, but I guess it makes some cases work out better... That may work for an uncompressed signed format, but it doesn't take into account the block decompression of RGTC1. For example consider a block that has red1 = 0x7f and red2 = 0x80. As an unsigned value, red2 is the bigger value(127 vs 128). If it is a signed value, red1 is bigger (127 vs -128). The only way this will work is if the driver adds +128 to red1 and red2 for each block before loading (sucks if it's in a PBO) and after sampling does red = red * 2.0 - 1.0. And I'm not even sure if that will work. It will at least break exact zeros. It also has the additional problem that it doesn't do the swizzle workaround which apparently is necessary even for single-component textures. Note that I don't care about the signed RGTC formats. Wine on this card will only ever need the unsigned ones, as they match ATI1N and ATI2N in d3d. Makes sense. BTW, I'm in no way knowledgeable about r300... just seemed like an interesting bit of code to glance at. Hopefully one of the AMD guys will be able to look at the actual patch. Cheers, -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] main: Cosmetic changes for Texture Buffers.
On Mon, Mar 9, 2015 at 9:43 AM, Laura Ekstrand la...@jlekstrand.net wrote: I'm confused which hunk you talking about. Can you be more specific? On Mon, Mar 9, 2015 at 8:47 AM, Anuj Phogat anuj.pho...@gmail.com wrote: On Wed, Mar 4, 2015 at 4:41 PM, Laura Ekstrand la...@jlekstrand.net wrote: Adds a useful comment and some whitespace. Fixes an error message. v2: Review from Anuj Phogat - Split rebase of Tex[ture]Buffer[Range] --- src/mesa/main/teximage.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 706c76b..22574bd 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -5354,6 +5354,14 @@ _mesa_TexBufferRange(GLenum target, GLenum internalFormat, GLuint buffer, buffer); return; } else { + + /* OpenGL 4.5 core spec (02.02.2015) says in Section 8.9 Buffer + * Textures (PDF page 254): + *If buffer is zero, then any buffer object attached to the buffer + *texture is detached, the values offset and size are ignored and + *the state for offset and size for the buffer texture are reset to + *zero. + */ offset = 0; size = 0; } @@ -5382,8 +5390,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj = NULL; /* Get the texture object by Name. */ - texObj = _mesa_lookup_texture_err(ctx, texture, - glTextureBuffer(texture)); + texObj = _mesa_lookup_texture_err(ctx, texture, glTextureBuffer); if (!texObj) return; @@ -5396,6 +5403,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj, 0, buffer ? -1 : 0, glTextureBuffer); } + I meant this extra new line here. It's a nitpick. Up to you if you want to keep it. static GLboolean is_renderable_texture_format(struct gl_context *ctx, GLenum internalformat) { This hunk is unnecessary. -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/6] v2 of Compressed Textures Cube Map Support
On Wed, Mar 4, 2015 at 3:44 PM, Laura Ekstrand la...@jlekstrand.net wrote: This cleans up ARB_direct_state_access texture cube map functions (mostly in response to reviews from Anuj Phogat). Laura Ekstrand (6): main: _mesa_cube_level_complete checks NumLayers. main: Remove redundant NumLayers checks. main: Remove redundant copy of cube map block comment in GetTextureImage. main: assert(texImage) in ARB_DSA texture cube map functions. main: Add TEXTURE_CUBE_MAP support for glCompressedTextureSubImage3D. main: Checking for cube completeness in GetCompressedTextureImage. src/mesa/main/texgetimage.c | 61 --- src/mesa/main/teximage.c| 177 +--- src/mesa/main/teximage.h| 3 +- src/mesa/main/texobj.c | 4 + 4 files changed, 156 insertions(+), 89 deletions(-) -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev Reviewed-by: Anuj Phogat anuj.pho...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] i965/skl: Fix the order of the arguments for the LD sampler message
In Skylake the order of the arguments for sample messages with the LD type are u, v, lod, r whereas previously they were u, lod, v, r. This fixes 144 Piglit tests including ones that directly use texelFetch and also some using the meta stencil blit path which appears to use texelFetch in its shader. v2: Fix sampling 1D textures --- I realised that v1 of the patch would end up putting the lod in the wrong argument for 1D textures so here is a v2. This time I have run it through a full Piglit run and it doesn't cause any regressions. src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 6b48f70..287ee47 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1742,15 +1742,26 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, fs_reg dst, length++; break; case ir_txf: - /* Unfortunately, the parameters for LD are intermixed: u, lod, v, r. */ + /* Unfortunately, the parameters for LD are intermixed: u, lod, v, r. + * On Gen9 they are u, v, lod, r + */ + emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_D), coordinate)); coordinate = offset(coordinate, 1); length++; + if (brw-gen = 9) { + if (coord_components = 2) { +emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_D), coordinate)); +coordinate = offset(coordinate, 1); + } + length++; + } + emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_D), lod)); length++; - for (int i = 1; i coord_components; i++) { + for (int i = brw-gen = 9 ? 2 : 1; i coord_components; i++) { emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_D), coordinate)); coordinate = offset(coordinate, 1); length++; -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] main: Cosmetic changes for Texture Buffers.
I'm confused which hunk you talking about. Can you be more specific? On Mon, Mar 9, 2015 at 8:47 AM, Anuj Phogat anuj.pho...@gmail.com wrote: On Wed, Mar 4, 2015 at 4:41 PM, Laura Ekstrand la...@jlekstrand.net wrote: Adds a useful comment and some whitespace. Fixes an error message. v2: Review from Anuj Phogat - Split rebase of Tex[ture]Buffer[Range] --- src/mesa/main/teximage.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 706c76b..22574bd 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -5354,6 +5354,14 @@ _mesa_TexBufferRange(GLenum target, GLenum internalFormat, GLuint buffer, buffer); return; } else { + + /* OpenGL 4.5 core spec (02.02.2015) says in Section 8.9 Buffer + * Textures (PDF page 254): + *If buffer is zero, then any buffer object attached to the buffer + *texture is detached, the values offset and size are ignored and + *the state for offset and size for the buffer texture are reset to + *zero. + */ offset = 0; size = 0; } @@ -5382,8 +5390,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj = NULL; /* Get the texture object by Name. */ - texObj = _mesa_lookup_texture_err(ctx, texture, - glTextureBuffer(texture)); + texObj = _mesa_lookup_texture_err(ctx, texture, glTextureBuffer); if (!texObj) return; @@ -5396,6 +5403,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj, 0, buffer ? -1 : 0, glTextureBuffer); } + static GLboolean is_renderable_texture_format(struct gl_context *ctx, GLenum internalformat) { This hunk is unnecessary. -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/9] i965/nir: Optimize after nir_lower_var_copies().
LGTM Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com On Mon, Mar 9, 2015 at 1:58 AM, Kenneth Graunke kenn...@whitecape.org wrote: Array variable copy splitting generates a bunch of stuff we want to clean up before proceeding. Signed-off-by: Kenneth Graunke kenn...@whitecape.org Cc: Jason Ekstrand ja...@jlekstrand.net --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 249e59c..fbdfc22 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -102,6 +102,9 @@ fs_visitor::emit_nir_code() nir_lower_var_copies(nir); nir_validate_shader(nir); + /* Get rid of split copies */ + nir_optimize(nir); + nir_lower_io(nir); nir_validate_shader(nir); -- 2.2.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 12/15] docs: mark GL_AMD_performance_monitor for the 10.6.0 release
GL_AMD_performance_monitor is supported by nvc0, svga, freedreno, r600 and radeonsi. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- docs/relnotes/10.6.0.html | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/relnotes/10.6.0.html b/docs/relnotes/10.6.0.html index a396109..596a236 100644 --- a/docs/relnotes/10.6.0.html +++ b/docs/relnotes/10.6.0.html @@ -50,6 +50,7 @@ Note: some of the new features are only available with certain drivers. liGL_ARB_instanced_arrays on freedreno/li liGL_ARB_pipeline_statistics_query on i965, nv50, nvc0, r600, radeonsi, softpipe/li liGL_ARB_draw_indirect, GL_ARB_multi_draw_indirect on r600/li +liGL_AMD_performance_monitor on nvc0, r600, radeonsi, svga, freedreno/li /ul h2Bug fixes/h2 -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/15] gallium: add util_get_driver_query_group_info
On 03/09/2015 10:43 PM, Marek Olšák wrote: It would be better to add this function to u_helpers.c/.h instead of adding new files. Mmh, I'll probably introduce other functions related to queries when nouveau-perfkit will be ready. Are you sure it's a good idea to drop this file? Marek On Mon, Mar 9, 2015 at 10:09 PM, Samuel Pitoiset samuel.pitoi...@gmail.com wrote: This function can be used to get a generic group of driver-specific queries when a driver doesn't expose any groups. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/auxiliary/Makefile.sources | 1 + src/gallium/auxiliary/util/u_query.c | 50 ++ src/gallium/auxiliary/util/u_query.h | 45 ++ 3 files changed, 96 insertions(+) create mode 100644 src/gallium/auxiliary/util/u_query.c create mode 100644 src/gallium/auxiliary/util/u_query.h diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index b7174d6..8af3c34 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -265,6 +265,7 @@ C_SOURCES := \ util/u_prim.h \ util/u_pstipple.c \ util/u_pstipple.h \ + util/u_query.c \ util/u_range.h \ util/u_rect.h \ util/u_resource.c \ diff --git a/src/gallium/auxiliary/util/u_query.c b/src/gallium/auxiliary/util/u_query.c new file mode 100644 index 000..bb27ca0 --- /dev/null +++ b/src/gallium/auxiliary/util/u_query.c @@ -0,0 +1,50 @@ +/** + * + * Copyright 2015 Samuel Pitoiset samuel.pitoi...@gmail.com + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * Software), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL THE AUTHORS AND/OR THEIR SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **/ + +#include util/u_memory.h +#include util/u_query.h + +/** + * This function is used to get a generic group of driver-specific queries. + */ +int +util_get_driver_query_group_info(unsigned index, unsigned num_queries, + struct pipe_driver_query_group_info *info) +{ + struct pipe_driver_query_group_info list[] = { + {Driver queries, num_queries, num_queries} + }; + + if (!info) + return ARRAY_SIZE(list); + + if (index = ARRAY_SIZE(list)) + return 0; + + *info = list[index]; + return 1; +} diff --git a/src/gallium/auxiliary/util/u_query.h b/src/gallium/auxiliary/util/u_query.h new file mode 100644 index 000..05b9be9 --- /dev/null +++ b/src/gallium/auxiliary/util/u_query.h @@ -0,0 +1,45 @@ +/** + * + * Copyright 2015 Samuel Pitoiset samuel.pitoi...@gmail.com + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * Software), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL THE AUTHORS AND/OR THEIR SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
Re: [Mesa-dev] [PATCH 07/15] gallium: add util_get_driver_query_group_info
On 03/09/2015 11:00 PM, Marek Olšák wrote: If you plan to add more functions, this file can stay. Yes, it's my plan. Marek On Mon, Mar 9, 2015 at 10:54 PM, Samuel Pitoiset samuel.pitoi...@gmail.com wrote: On 03/09/2015 10:43 PM, Marek Olšák wrote: It would be better to add this function to u_helpers.c/.h instead of adding new files. Mmh, I'll probably introduce other functions related to queries when nouveau-perfkit will be ready. Are you sure it's a good idea to drop this file? Marek On Mon, Mar 9, 2015 at 10:09 PM, Samuel Pitoiset samuel.pitoi...@gmail.com wrote: This function can be used to get a generic group of driver-specific queries when a driver doesn't expose any groups. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/auxiliary/Makefile.sources | 1 + src/gallium/auxiliary/util/u_query.c | 50 ++ src/gallium/auxiliary/util/u_query.h | 45 ++ 3 files changed, 96 insertions(+) create mode 100644 src/gallium/auxiliary/util/u_query.c create mode 100644 src/gallium/auxiliary/util/u_query.h diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index b7174d6..8af3c34 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -265,6 +265,7 @@ C_SOURCES := \ util/u_prim.h \ util/u_pstipple.c \ util/u_pstipple.h \ + util/u_query.c \ util/u_range.h \ util/u_rect.h \ util/u_resource.c \ diff --git a/src/gallium/auxiliary/util/u_query.c b/src/gallium/auxiliary/util/u_query.c new file mode 100644 index 000..bb27ca0 --- /dev/null +++ b/src/gallium/auxiliary/util/u_query.c @@ -0,0 +1,50 @@ +/** + * + * Copyright 2015 Samuel Pitoiset samuel.pitoi...@gmail.com + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * Software), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL THE AUTHORS AND/OR THEIR SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **/ + +#include util/u_memory.h +#include util/u_query.h + +/** + * This function is used to get a generic group of driver-specific queries. + */ +int +util_get_driver_query_group_info(unsigned index, unsigned num_queries, + struct pipe_driver_query_group_info *info) +{ + struct pipe_driver_query_group_info list[] = { + {Driver queries, num_queries, num_queries} + }; + + if (!info) + return ARRAY_SIZE(list); + + if (index = ARRAY_SIZE(list)) + return 0; + + *info = list[index]; + return 1; +} diff --git a/src/gallium/auxiliary/util/u_query.h b/src/gallium/auxiliary/util/u_query.h new file mode 100644 index 000..05b9be9 --- /dev/null +++ b/src/gallium/auxiliary/util/u_query.h @@ -0,0 +1,45 @@ +/** + * + * Copyright 2015 Samuel Pitoiset samuel.pitoi...@gmail.com + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * Software), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL THE AUTHORS AND/OR THEIR
[Mesa-dev] [PATCH] [v2] meta: Plug memory leak
It looks like this has existed since commit f5a477ab76b6e0b268387699cd2253a43db0dfae Author: Ian Romanick ian.d.roman...@intel.com Date: Mon Dec 16 11:54:08 2013 -0800 meta: Refactor shader generation code out of mipmap generation path Valgrind was complaining on fbo-generatemipmap-formats v2: Instead, do the allocation after the early return block (v2) Cc: Ian Romanick ian.d.roman...@intel.com Cc: Brian Paul bri...@vmware.com Cc: Eric Anholt e...@anholt.net Cc: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Ben Widawsky b...@bwidawsk.net --- Thanks Ken. I wasn't sure if this path was common or not, and I had opted for the standard, define variables at the top, style. If it occurs more than infrequently, then I like this solution better. (FYI: Jenkins results, http://otc-gfxtest-01.jf.intel.com/job/bwidawsk/100/) --- src/mesa/drivers/common/meta.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index fdc4cf1..fa800ec 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -247,7 +247,6 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, struct blit_shader_table *table) { char *vs_source, *fs_source; - void *const mem_ctx = ralloc_context(NULL); struct blit_shader *shader = choose_blit_shader(target, table); const char *vs_input, *vs_output, *fs_input, *vs_preprocess, *fs_preprocess; @@ -273,6 +272,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, return; } + void *const mem_ctx = ralloc_context(NULL); + vs_source = ralloc_asprintf(mem_ctx, %s\n %s vec2 position;\n -- 2.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 02/15] gallium: add new fields to pipe_driver_query_info
According to the spec of GL_AMD_performance_monitor, valid type values returned are UNSIGNED_INT, UNSIGNED_INT64_AMD, PERCENTAGE_AMD, FLOAT. This also introduces the new field group_id in order to categorize queries into groups. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/include/pipe/p_defines.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h index 4409789..cb42cef 100644 --- a/src/gallium/include/pipe/p_defines.h +++ b/src/gallium/include/pipe/p_defines.h @@ -751,12 +751,22 @@ union pipe_color_union unsigned int ui[4]; }; +enum pipe_driver_query_type +{ + PIPE_DRIVER_QUERY_TYPE_UINT64 = 0, + PIPE_DRIVER_QUERY_TYPE_UINT = 1, + PIPE_DRIVER_QUERY_TYPE_FLOAT = 2, + PIPE_DRIVER_QUERY_TYPE_PERCENTAGE = 3, +}; + struct pipe_driver_query_info { const char *name; unsigned query_type; /* PIPE_QUERY_DRIVER_SPECIFIC + i */ uint64_t max_value; /* max value that can be returned */ boolean uses_byte_units; /* whether the result is in bytes */ + enum pipe_driver_query_type type; + unsigned group_id; }; struct pipe_driver_query_group_info -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 03/15] gallium: add new numeric types to pipe_query_result
This will be used by GL_AMD_performance_monitor. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/include/pipe/p_defines.h | 8 1 file changed, 8 insertions(+) diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h index cb42cef..c5721d4 100644 --- a/src/gallium/include/pipe/p_defines.h +++ b/src/gallium/include/pipe/p_defines.h @@ -732,8 +732,16 @@ union pipe_query_result /* PIPE_QUERY_TIME_ELAPSED */ /* PIPE_QUERY_PRIMITIVES_GENERATED */ /* PIPE_QUERY_PRIMITIVES_EMITTED */ + /* PIPE_DRIVER_QUERY_TYPE_UINT64 */ uint64_t u64; + /* PIPE_DRIVER_QUERY_TYPE_UINT */ + uint32_t u32; + + /* PIPE_DRIVER_QUERY_TYPE_FLOAT */ + /* PIPE_DRIVER_QUERY_TYPE_PERCENTAGE */ + float f; + /* PIPE_QUERY_SO_STATISTICS */ struct pipe_query_data_so_statistics so_statistics; -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 09/15] freedreno: implement pipe_screen::get_driver_query_group_info
This enables GL_AMD_performance_monitor for freedreno. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/freedreno/freedreno_query.c | 9 + src/gallium/drivers/freedreno/freedreno_query.h | 1 + 2 files changed, 10 insertions(+) diff --git a/src/gallium/drivers/freedreno/freedreno_query.c b/src/gallium/drivers/freedreno/freedreno_query.c index db2683c..13973a8 100644 --- a/src/gallium/drivers/freedreno/freedreno_query.c +++ b/src/gallium/drivers/freedreno/freedreno_query.c @@ -28,6 +28,7 @@ #include pipe/p_state.h #include util/u_memory.h +#include util/u_query.h #include freedreno_query.h #include freedreno_query_sw.h @@ -104,10 +105,18 @@ fd_get_driver_query_info(struct pipe_screen *pscreen, return 1; } +static int +fd_get_driver_query_group_info(struct pipe_screen *pscreen, + unsigned index, struct pipe_driver_query_group_info *info) +{ + return util_get_driver_query_group_info(index, FD_QUERY_COUNT, info); +} + void fd_query_screen_init(struct pipe_screen *pscreen) { pscreen-get_driver_query_info = fd_get_driver_query_info; + pscreen-get_driver_query_group_info = fd_get_driver_query_group_info; } void diff --git a/src/gallium/drivers/freedreno/freedreno_query.h b/src/gallium/drivers/freedreno/freedreno_query.h index c2c71da..9cee989 100644 --- a/src/gallium/drivers/freedreno/freedreno_query.h +++ b/src/gallium/drivers/freedreno/freedreno_query.h @@ -56,6 +56,7 @@ fd_query(struct pipe_query *pq) return (struct fd_query *)pq; } +#define FD_QUERY_COUNT 6 #define FD_QUERY_DRAW_CALLS (PIPE_QUERY_DRIVER_SPECIFIC + 0) #define FD_QUERY_BATCH_TOTAL (PIPE_QUERY_DRIVER_SPECIFIC + 1) /* total # of batches (submits) */ #define FD_QUERY_BATCH_SYSMEM(PIPE_QUERY_DRIVER_SPECIFIC + 2) /* batches using system memory (GMEM bypass) */ -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] main: Cosmetic changes for Texture Buffers.
Done. On Mon, Mar 9, 2015 at 1:37 PM, Laura Ekstrand la...@jlekstrand.net wrote: Can you go and manually mark this commit and the Add entry point for TextureBufferRange as accepted in Patchwork? I don't have admin access, and my refactor of the new line caused a rebase. Thanks. Laura On Mon, Mar 9, 2015 at 1:13 PM, Laura Ekstrand la...@jlekstrand.net wrote: Oh, thanks! I didn't see the new line there when I read your review. I will remove it. On Mon, Mar 9, 2015 at 10:45 AM, Anuj Phogat anuj.pho...@gmail.com wrote: On Mon, Mar 9, 2015 at 9:43 AM, Laura Ekstrand la...@jlekstrand.net wrote: I'm confused which hunk you talking about. Can you be more specific? On Mon, Mar 9, 2015 at 8:47 AM, Anuj Phogat anuj.pho...@gmail.com wrote: On Wed, Mar 4, 2015 at 4:41 PM, Laura Ekstrand la...@jlekstrand.net wrote: Adds a useful comment and some whitespace. Fixes an error message. v2: Review from Anuj Phogat - Split rebase of Tex[ture]Buffer[Range] --- src/mesa/main/teximage.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 706c76b..22574bd 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -5354,6 +5354,14 @@ _mesa_TexBufferRange(GLenum target, GLenum internalFormat, GLuint buffer, buffer); return; } else { + + /* OpenGL 4.5 core spec (02.02.2015) says in Section 8.9 Buffer + * Textures (PDF page 254): + *If buffer is zero, then any buffer object attached to the buffer + *texture is detached, the values offset and size are ignored and + *the state for offset and size for the buffer texture are reset to + *zero. + */ offset = 0; size = 0; } @@ -5382,8 +5390,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj = NULL; /* Get the texture object by Name. */ - texObj = _mesa_lookup_texture_err(ctx, texture, - glTextureBuffer(texture)); + texObj = _mesa_lookup_texture_err(ctx, texture, glTextureBuffer); if (!texObj) return; @@ -5396,6 +5403,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj, 0, buffer ? -1 : 0, glTextureBuffer); } + I meant this extra new line here. It's a nitpick. Up to you if you want to keep it. static GLboolean is_renderable_texture_format(struct gl_context *ctx, GLenum internalformat) { This hunk is unnecessary. -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/15] radeon: implement pipe_screen::get_driver_query_group_info
Reviewed-by: Marek Olšák marek.ol...@amd.com Marek On Mon, Mar 9, 2015 at 10:09 PM, Samuel Pitoiset samuel.pitoi...@gmail.com wrote: This enables GL_AMD_performance_monitor for radeon. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/radeon/r600_pipe_common.c | 9 + src/gallium/drivers/radeon/r600_pipe_common.h | 1 + 2 files changed, 10 insertions(+) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c index 79acfc8..318ce46 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.c +++ b/src/gallium/drivers/radeon/r600_pipe_common.c @@ -31,6 +31,7 @@ #include util/u_memory.h #include util/u_format_s3tc.h #include util/u_upload_mgr.h +#include util/u_query.h #include vl/vl_decoder.h #include vl/vl_video_buffer.h #include radeon/radeon_video.h @@ -670,6 +671,13 @@ static int r600_get_driver_query_info(struct pipe_screen *screen, return 1; } +static int r600_get_driver_query_group_info(struct pipe_screen *screen, + unsigned index, + struct pipe_driver_query_group_info *info) +{ + return util_get_driver_query_group_info(index, R600_QUERY_COUNT, info); +} + static void r600_fence_reference(struct pipe_screen *screen, struct pipe_fence_handle **ptr, struct pipe_fence_handle *fence) @@ -828,6 +836,7 @@ bool r600_common_screen_init(struct r600_common_screen *rscreen, rscreen-b.get_compute_param = r600_get_compute_param; rscreen-b.get_paramf = r600_get_paramf; rscreen-b.get_driver_query_info = r600_get_driver_query_info; + rscreen-b.get_driver_query_group_info = r600_get_driver_query_group_info; rscreen-b.get_timestamp = r600_get_timestamp; rscreen-b.fence_finish = r600_fence_finish; rscreen-b.fence_reference = r600_fence_reference; diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h index 43efaa3..10a0471 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.h +++ b/src/gallium/drivers/radeon/r600_pipe_common.h @@ -47,6 +47,7 @@ #define R600_RESOURCE_FLAG_FLUSHED_DEPTH (PIPE_RESOURCE_FLAG_DRV_PRIV 1) #define R600_RESOURCE_FLAG_FORCE_TILING (PIPE_RESOURCE_FLAG_DRV_PRIV 2) +#define R600_QUERY_COUNT 8 #define R600_QUERY_DRAW_CALLS (PIPE_QUERY_DRIVER_SPECIFIC + 0) #define R600_QUERY_REQUESTED_VRAM (PIPE_QUERY_DRIVER_SPECIFIC + 1) #define R600_QUERY_REQUESTED_GTT (PIPE_QUERY_DRIVER_SPECIFIC + 2) -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] r600g: Use R600_MAX_VIEWPORTS instead of 16
Pushed, thanks. Marek On Wed, Feb 25, 2015 at 7:50 AM, Alexandre Demers alexandre.f.dem...@gmail.com wrote: Lets define R600_MAX_VIEWPORTS instead of using 16 here and there in the code when looping through viewports and scissors. It is easier to understand what this number represents. v2: Missed a case where R600_MAX_VIEWPORTS should have been used. Signed-off-by: Alexandre Demers alexandre.f.dem...@gmail.com --- src/gallium/drivers/r600/evergreen_state.c | 10 +- src/gallium/drivers/r600/r600_hw_context.c | 2 +- src/gallium/drivers/r600/r600_pipe.c | 2 +- src/gallium/drivers/r600/r600_pipe.h | 6 -- src/gallium/drivers/r600/r600_state.c | 4 ++-- 5 files changed, 13 insertions(+), 11 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index 8aa8082..f0b04f0 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -2293,8 +2293,8 @@ static void cayman_init_atom_start_cs(struct r600_context *rctx) r600_store_context_reg(cb, R_028200_PA_SC_WINDOW_OFFSET, 0); r600_store_context_reg(cb, R_02820C_PA_SC_CLIPRECT_RULE, 0x); - r600_store_context_reg_seq(cb, R_0282D0_PA_SC_VPORT_ZMIN_0, 2 * 16); - for (tmp = 0; tmp 16; tmp++) { + r600_store_context_reg_seq(cb, R_0282D0_PA_SC_VPORT_ZMIN_0, 2 * R600_MAX_VIEWPORTS); + for (tmp = 0; tmp R600_MAX_VIEWPORTS; tmp++) { r600_store_value(cb, 0); /* R_0282D0_PA_SC_VPORT_ZMIN_0 */ r600_store_value(cb, fui(1.0)); /* R_0282D4_PA_SC_VPORT_ZMAX_0 */ } @@ -2727,8 +2727,8 @@ void evergreen_init_atom_start_cs(struct r600_context *rctx) r600_store_context_reg(cb, R_02820C_PA_SC_CLIPRECT_RULE, 0x); r600_store_context_reg(cb, R_028230_PA_SC_EDGERULE, 0x); - r600_store_context_reg_seq(cb, R_0282D0_PA_SC_VPORT_ZMIN_0, 2 * 16); - for (tmp = 0; tmp 16; tmp++) { + r600_store_context_reg_seq(cb, R_0282D0_PA_SC_VPORT_ZMIN_0, 2 * R600_MAX_VIEWPORTS); + for (tmp = 0; tmp R600_MAX_VIEWPORTS; tmp++) { r600_store_value(cb, 0); /* R_0282D0_PA_SC_VPORT_ZMIN_0 */ r600_store_value(cb, fui(1.0)); /* R_0282D4_PA_SC_VPORT_ZMAX_0 */ } @@ -3458,7 +3458,7 @@ void evergreen_init_state_functions(struct r600_context *rctx) r600_init_atom(rctx, rctx-dsa_state.atom, id++, r600_emit_cso_state, 0); r600_init_atom(rctx, rctx-poly_offset_state.atom, id++, evergreen_emit_polygon_offset, 6); r600_init_atom(rctx, rctx-rasterizer_state.atom, id++, r600_emit_cso_state, 0); - for (i = 0; i 16; i++) { + for (i = 0; i R600_MAX_VIEWPORTS; i++) { r600_init_atom(rctx, rctx-viewport[i].atom, id++, r600_emit_viewport_state, 8); r600_init_atom(rctx, rctx-scissor[i].atom, id++, evergreen_emit_scissor_state, 4); rctx-viewport[i].idx = i; diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c index cd57eed..7961a96 100644 --- a/src/gallium/drivers/r600/r600_hw_context.c +++ b/src/gallium/drivers/r600/r600_hw_context.c @@ -307,7 +307,7 @@ void r600_begin_new_cs(struct r600_context *ctx) ctx-poly_offset_state.atom.dirty = true; ctx-vgt_state.atom.dirty = true; ctx-sample_mask.atom.dirty = true; - for (i = 0; i 16; i++) { + for (i = 0; i R600_MAX_VIEWPORTS; i++) { ctx-scissor[i].atom.dirty = true; ctx-viewport[i].atom.dirty = true; } diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c index c8a0e9c..24d901e 100644 --- a/src/gallium/drivers/r600/r600_pipe.c +++ b/src/gallium/drivers/r600/r600_pipe.c @@ -374,7 +374,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param) return 8; case PIPE_CAP_MAX_VIEWPORTS: - return 16; + return R600_MAX_VIEWPORTS; /* Timer queries, present when the clock frequency is non zero. */ case PIPE_CAP_QUERY_TIME_ELAPSED: diff --git a/src/gallium/drivers/r600/r600_pipe.h b/src/gallium/drivers/r600/r600_pipe.h index 7237854..ac69895 100644 --- a/src/gallium/drivers/r600/r600_pipe.h +++ b/src/gallium/drivers/r600/r600_pipe.h @@ -38,6 +38,8 @@ #define R600_NUM_ATOMS 73 +#define R600_MAX_VIEWPORTS 16 + /* read caches */ #define R600_CONTEXT_INV_VERTEX_CACHE (R600_CONTEXT_PRIVATE_FLAG 0) #define R600_CONTEXT_INV_TEX_CACHE (R600_CONTEXT_PRIVATE_FLAG 1) @@ -443,12 +445,12 @@ struct r600_context { struct r600_poly_offset_state poly_offset_state; struct r600_cso_state rasterizer_state; struct r600_sample_mask sample_mask; - struct r600_scissor_state
Re: [Mesa-dev] [PATCH 02/15] gallium: add new fields to pipe_driver_query_info
On Mon, Mar 9, 2015 at 10:09 PM, Samuel Pitoiset samuel.pitoi...@gmail.com wrote: According to the spec of GL_AMD_performance_monitor, valid type values returned are UNSIGNED_INT, UNSIGNED_INT64_AMD, PERCENTAGE_AMD, FLOAT. This also introduces the new field group_id in order to categorize queries into groups. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/include/pipe/p_defines.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h index 4409789..cb42cef 100644 --- a/src/gallium/include/pipe/p_defines.h +++ b/src/gallium/include/pipe/p_defines.h @@ -751,12 +751,22 @@ union pipe_color_union unsigned int ui[4]; }; +enum pipe_driver_query_type +{ + PIPE_DRIVER_QUERY_TYPE_UINT64 = 0, + PIPE_DRIVER_QUERY_TYPE_UINT = 1, + PIPE_DRIVER_QUERY_TYPE_FLOAT = 2, + PIPE_DRIVER_QUERY_TYPE_PERCENTAGE = 3, What's the type of percentage? UINT64? FLOAT? +}; + struct pipe_driver_query_info { const char *name; unsigned query_type; /* PIPE_QUERY_DRIVER_SPECIFIC + i */ uint64_t max_value; /* max value that can be returned */ boolean uses_byte_units; /* whether the result is in bytes */ + enum pipe_driver_query_type type; Could you please remove uses_byte_units and add PIPE_DRIVER_QUERY_TYPE_BYTES, which should return uint64_t? Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/15] gallium: add util_get_driver_query_group_info
It would be better to add this function to u_helpers.c/.h instead of adding new files. Marek On Mon, Mar 9, 2015 at 10:09 PM, Samuel Pitoiset samuel.pitoi...@gmail.com wrote: This function can be used to get a generic group of driver-specific queries when a driver doesn't expose any groups. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/auxiliary/Makefile.sources | 1 + src/gallium/auxiliary/util/u_query.c | 50 ++ src/gallium/auxiliary/util/u_query.h | 45 ++ 3 files changed, 96 insertions(+) create mode 100644 src/gallium/auxiliary/util/u_query.c create mode 100644 src/gallium/auxiliary/util/u_query.h diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index b7174d6..8af3c34 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -265,6 +265,7 @@ C_SOURCES := \ util/u_prim.h \ util/u_pstipple.c \ util/u_pstipple.h \ + util/u_query.c \ util/u_range.h \ util/u_rect.h \ util/u_resource.c \ diff --git a/src/gallium/auxiliary/util/u_query.c b/src/gallium/auxiliary/util/u_query.c new file mode 100644 index 000..bb27ca0 --- /dev/null +++ b/src/gallium/auxiliary/util/u_query.c @@ -0,0 +1,50 @@ +/** + * + * Copyright 2015 Samuel Pitoiset samuel.pitoi...@gmail.com + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * Software), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL THE AUTHORS AND/OR THEIR SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **/ + +#include util/u_memory.h +#include util/u_query.h + +/** + * This function is used to get a generic group of driver-specific queries. + */ +int +util_get_driver_query_group_info(unsigned index, unsigned num_queries, + struct pipe_driver_query_group_info *info) +{ + struct pipe_driver_query_group_info list[] = { + {Driver queries, num_queries, num_queries} + }; + + if (!info) + return ARRAY_SIZE(list); + + if (index = ARRAY_SIZE(list)) + return 0; + + *info = list[index]; + return 1; +} diff --git a/src/gallium/auxiliary/util/u_query.h b/src/gallium/auxiliary/util/u_query.h new file mode 100644 index 000..05b9be9 --- /dev/null +++ b/src/gallium/auxiliary/util/u_query.h @@ -0,0 +1,45 @@ +/** + * + * Copyright 2015 Samuel Pitoiset samuel.pitoi...@gmail.com + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * Software), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL THE AUTHORS AND/OR THEIR SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **/ + +#ifndef
[Mesa-dev] [PATCH 10/15] radeon: implement pipe_screen::get_driver_query_group_info
This enables GL_AMD_performance_monitor for radeon. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/radeon/r600_pipe_common.c | 9 + src/gallium/drivers/radeon/r600_pipe_common.h | 1 + 2 files changed, 10 insertions(+) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c index 79acfc8..318ce46 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.c +++ b/src/gallium/drivers/radeon/r600_pipe_common.c @@ -31,6 +31,7 @@ #include util/u_memory.h #include util/u_format_s3tc.h #include util/u_upload_mgr.h +#include util/u_query.h #include vl/vl_decoder.h #include vl/vl_video_buffer.h #include radeon/radeon_video.h @@ -670,6 +671,13 @@ static int r600_get_driver_query_info(struct pipe_screen *screen, return 1; } +static int r600_get_driver_query_group_info(struct pipe_screen *screen, + unsigned index, + struct pipe_driver_query_group_info *info) +{ + return util_get_driver_query_group_info(index, R600_QUERY_COUNT, info); +} + static void r600_fence_reference(struct pipe_screen *screen, struct pipe_fence_handle **ptr, struct pipe_fence_handle *fence) @@ -828,6 +836,7 @@ bool r600_common_screen_init(struct r600_common_screen *rscreen, rscreen-b.get_compute_param = r600_get_compute_param; rscreen-b.get_paramf = r600_get_paramf; rscreen-b.get_driver_query_info = r600_get_driver_query_info; + rscreen-b.get_driver_query_group_info = r600_get_driver_query_group_info; rscreen-b.get_timestamp = r600_get_timestamp; rscreen-b.fence_finish = r600_fence_finish; rscreen-b.fence_reference = r600_fence_reference; diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h index 43efaa3..10a0471 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.h +++ b/src/gallium/drivers/radeon/r600_pipe_common.h @@ -47,6 +47,7 @@ #define R600_RESOURCE_FLAG_FLUSHED_DEPTH (PIPE_RESOURCE_FLAG_DRV_PRIV 1) #define R600_RESOURCE_FLAG_FORCE_TILING (PIPE_RESOURCE_FLAG_DRV_PRIV 2) +#define R600_QUERY_COUNT 8 #define R600_QUERY_DRAW_CALLS (PIPE_QUERY_DRIVER_SPECIFIC + 0) #define R600_QUERY_REQUESTED_VRAM (PIPE_QUERY_DRIVER_SPECIFIC + 1) #define R600_QUERY_REQUESTED_GTT (PIPE_QUERY_DRIVER_SPECIFIC + 2) -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 06/15] st/mesa: implement GL_AMD_performance_monitor
From: Christoph Bumiller e0425...@student.tuwien.ac.at This is based on the original patch of Christoph Bumiller. (source: http://people.freedesktop.org/~chrisbmr/perfmon.diff) As for the Gallium HUD, we keep a list of busy queries in a ring buffer in order to prevent stalls when reading queries. Drivers must implement get_driver_query_group_info and get_driver_query_info in order to enable this extension. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/mesa/Makefile.sources | 2 + src/mesa/state_tracker/st_cb_perfmon.c | 455 + src/mesa/state_tracker/st_cb_perfmon.h | 70 + src/mesa/state_tracker/st_context.c| 4 + src/mesa/state_tracker/st_extensions.c | 3 + 5 files changed, 534 insertions(+) create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources index 217be9a..e54e618 100644 --- a/src/mesa/Makefile.sources +++ b/src/mesa/Makefile.sources @@ -432,6 +432,8 @@ STATETRACKER_FILES = \ state_tracker/st_cb_flush.h \ state_tracker/st_cb_msaa.c \ state_tracker/st_cb_msaa.h \ + state_tracker/st_cb_perfmon.c \ + state_tracker/st_cb_perfmon.h \ state_tracker/st_cb_program.c \ state_tracker/st_cb_program.h \ state_tracker/st_cb_queryobj.c \ diff --git a/src/mesa/state_tracker/st_cb_perfmon.c b/src/mesa/state_tracker/st_cb_perfmon.c new file mode 100644 index 000..fb6774b --- /dev/null +++ b/src/mesa/state_tracker/st_cb_perfmon.c @@ -0,0 +1,455 @@ +/* + * Copyright (C) 2013 Christoph Bumiller + * Copyright (C) 2015 Samuel Pitoiset + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +/** + * Performance monitoring counters interface to gallium. + */ + +#include st_context.h +#include st_cb_bitmap.h +#include st_cb_perfmon.h + +#include util/bitset.h + +#include pipe/p_context.h +#include pipe/p_screen.h +#include util/u_memory.h + +/** + * Return a PIPE_QUERY_x type = PIPE_QUERY_DRIVER_SPECIFIC, or -1 if + * the driver-specific query doesn't exist. + */ +static int +find_query_type(struct pipe_screen *screen, const char *name) +{ + int num_queries; + int type = -1; + int i; + + num_queries = screen-get_driver_query_info(screen, 0, NULL); + if (!num_queries) + return type; + + for (i = 0; i num_queries; i++) { + struct pipe_driver_query_info info; + + if (!screen-get_driver_query_info(screen, i, info)) + continue; + + if (!strncmp(info.name, name, strlen(name))) { + type = info.query_type; + break; + } + } + return type; +} + +static bool +init_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m) +{ + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); + struct pipe_screen *screen = st_context(ctx)-pipe-screen; + struct pipe_context *pipe = st_context(ctx)-pipe; + int gid, cid; + + st_flush_bitmap_cache(st_context(ctx)); + + /* Create a query for each active counter. */ + for (gid = 0; gid ctx-PerfMonitor.NumGroups; gid++) { + const struct gl_perf_monitor_group *g = ctx-PerfMonitor.Groups[gid]; + for (cid = 0; cid g-NumCounters; cid++) { + const struct gl_perf_monitor_counter *c = g-Counters[cid]; + struct st_perf_counter_object *cntr; + int query_type; + + if (!BITSET_TEST(m-ActiveCounters[gid], cid)) +continue; + + query_type = find_query_type(screen, c-Name); + assert(query_type != -1); + + cntr = CALLOC_STRUCT(st_perf_counter_object); + if (!cntr) +return false; + + cntr-queries[cntr-head] = pipe-create_query(pipe, query_type, 0); + cntr-query_type = query_type; + cntr-id = cid; + cntr-group_id = gid; + + list_addtail(cntr-list, stm-active_counters); + } + } +
[Mesa-dev] [PATCH 01/15] gallium: add pipe_screen::get_driver_query_group_info
Driver queries are organized as a single hierarchy where queries are categorized into groups. Each goup has a list of queries and a maximum number of queries that can be sampled. The list of available groups can be obtained using pipe_screen::get_driver_query_group_info. This will be used by GL_AMD_performance monitor. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/docs/source/screen.rst | 10 ++ src/gallium/include/pipe/p_defines.h | 7 +++ src/gallium/include/pipe/p_screen.h | 11 +++ 3 files changed, 28 insertions(+) diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst index e0fd1a2..b698492 100644 --- a/src/gallium/docs/source/screen.rst +++ b/src/gallium/docs/source/screen.rst @@ -580,3 +580,13 @@ query at the specified **index** is returned in **info**. The function returns non-zero on success. The driver-specific query is described with the pipe_driver_query_info structure. + +get_driver_query_group_info +^^^ + +Return a driver-specific query group. If the **info** parameter is NULL, +the number of available groups is returned. Otherwise, the driver +query group at the specified **index** is returned in **info**. +The function returns non-zero on success. +The driver-specific query group is described with the +pipe_driver_query_group_info structure. diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h index a8ffe9c..4409789 100644 --- a/src/gallium/include/pipe/p_defines.h +++ b/src/gallium/include/pipe/p_defines.h @@ -759,6 +759,13 @@ struct pipe_driver_query_info boolean uses_byte_units; /* whether the result is in bytes */ }; +struct pipe_driver_query_group_info +{ + const char *name; + unsigned max_active_queries; + unsigned num_queries; +}; + #ifdef __cplusplus } #endif diff --git a/src/gallium/include/pipe/p_screen.h b/src/gallium/include/pipe/p_screen.h index ac88506..815c321 100644 --- a/src/gallium/include/pipe/p_screen.h +++ b/src/gallium/include/pipe/p_screen.h @@ -228,6 +228,17 @@ struct pipe_screen { unsigned index, struct pipe_driver_query_info *info); + /** +* Returns a driver-specific query group. +* +* If \p info is NULL, the number of available groups is returned. +* Otherwise, the driver query group at the specified \p index is returned +* in \p info. The function returns non-zero on success. +*/ + int (*get_driver_query_group_info)(struct pipe_screen *screen, + unsigned index, + struct pipe_driver_query_group_info *info); + }; -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 11/15] nvc0: implement pipe_screen::get_driver_query_group_info
This enables GL_AMD_performance_monitor for nvc0. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 10 ++ src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 3 +++ 3 files changed, 14 insertions(+) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c index e20a0b7..919e1d6 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c @@ -24,6 +24,8 @@ #define NVC0_PUSH_EXPLICIT_SPACE_CHECKING +#include util/u_query.h + #include nvc0/nvc0_context.h #include nv_object.xml.h #include nvc0/nve4_compute.xml.h @@ -1449,6 +1451,14 @@ nvc0_screen_get_driver_query_info(struct pipe_screen *pscreen, return 0; } +int +nvc0_screen_get_driver_query_group_info(struct pipe_screen *pscreen, +unsigned id, +struct pipe_driver_query_group_info *info) +{ + return util_get_driver_query_group_info(id, NVC0_QUERY_DRV_STAT_COUNT, info); +} + void nvc0_init_query_functions(struct nvc0_context *nvc0) { diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c index 686d892..3817771 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c @@ -650,6 +650,7 @@ nvc0_screen_create(struct nouveau_device *dev) pscreen-get_shader_param = nvc0_screen_get_shader_param; pscreen-get_paramf = nvc0_screen_get_paramf; pscreen-get_driver_query_info = nvc0_screen_get_driver_query_info; + pscreen-get_driver_query_group_info = nvc0_screen_get_driver_query_group_info; nvc0_screen_init_resource_functions(pscreen); diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h index 8a1991f..6bf43d9 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h @@ -243,6 +243,9 @@ nvc0_screen(struct pipe_screen *screen) int nvc0_screen_get_driver_query_info(struct pipe_screen *, unsigned, struct pipe_driver_query_info *); +int nvc0_screen_get_driver_query_group_info(struct pipe_screen *, unsigned, +struct pipe_driver_query_group_info *); + boolean nvc0_blitter_create(struct nvc0_screen *); void nvc0_blitter_destroy(struct nvc0_screen *); -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 05/15] gallium: make pipe_context::begin_query return a boolean
GL_AMD_performance_monitor must return an error when a monitoring session cannot be started. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/freedreno/freedreno_query.c| 4 ++-- src/gallium/drivers/freedreno/freedreno_query.h| 2 +- src/gallium/drivers/freedreno/freedreno_query_hw.c | 3 ++- src/gallium/drivers/freedreno/freedreno_query_sw.c | 3 ++- src/gallium/drivers/galahad/glhd_context.c | 6 +++--- src/gallium/drivers/i915/i915_query.c | 3 ++- src/gallium/drivers/ilo/ilo_query.c| 3 ++- src/gallium/drivers/llvmpipe/lp_query.c| 3 ++- src/gallium/drivers/noop/noop_pipe.c | 3 ++- src/gallium/drivers/nouveau/nv30/nv30_query.c | 5 +++-- src/gallium/drivers/nouveau/nv50/nv50_query.c | 3 ++- src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 3 ++- src/gallium/drivers/r300/r300_query.c | 9 + src/gallium/drivers/radeon/r600_query.c| 16 +--- src/gallium/drivers/rbug/rbug_context.c| 8 +--- src/gallium/drivers/softpipe/sp_query.c| 3 ++- src/gallium/drivers/svga/svga_pipe_query.c | 3 ++- src/gallium/drivers/trace/tr_context.c | 6 -- src/gallium/include/pipe/p_context.h | 2 +- 19 files changed, 53 insertions(+), 35 deletions(-) diff --git a/src/gallium/drivers/freedreno/freedreno_query.c b/src/gallium/drivers/freedreno/freedreno_query.c index 6f01e03..db2683c 100644 --- a/src/gallium/drivers/freedreno/freedreno_query.c +++ b/src/gallium/drivers/freedreno/freedreno_query.c @@ -59,11 +59,11 @@ fd_destroy_query(struct pipe_context *pctx, struct pipe_query *pq) q-funcs-destroy_query(fd_context(pctx), q); } -static void +static boolean fd_begin_query(struct pipe_context *pctx, struct pipe_query *pq) { struct fd_query *q = fd_query(pq); - q-funcs-begin_query(fd_context(pctx), q); + return q-funcs-begin_query(fd_context(pctx), q); } static void diff --git a/src/gallium/drivers/freedreno/freedreno_query.h b/src/gallium/drivers/freedreno/freedreno_query.h index bc9a7a2..c2c71da 100644 --- a/src/gallium/drivers/freedreno/freedreno_query.h +++ b/src/gallium/drivers/freedreno/freedreno_query.h @@ -37,7 +37,7 @@ struct fd_query; struct fd_query_funcs { void (*destroy_query)(struct fd_context *ctx, struct fd_query *q); - void (*begin_query)(struct fd_context *ctx, struct fd_query *q); + boolean (*begin_query)(struct fd_context *ctx, struct fd_query *q); void (*end_query)(struct fd_context *ctx, struct fd_query *q); boolean (*get_query_result)(struct fd_context *ctx, struct fd_query *q, boolean wait, diff --git a/src/gallium/drivers/freedreno/freedreno_query_hw.c b/src/gallium/drivers/freedreno/freedreno_query_hw.c index b29f9d4..a587178 100644 --- a/src/gallium/drivers/freedreno/freedreno_query_hw.c +++ b/src/gallium/drivers/freedreno/freedreno_query_hw.c @@ -136,7 +136,7 @@ fd_hw_begin_query(struct fd_context *ctx, struct fd_query *q) { struct fd_hw_query *hq = fd_hw_query(q); if (q-active) - return; + return true; /* begin_query() should clear previous results: */ destroy_periods(ctx, hq-periods); @@ -149,6 +149,7 @@ fd_hw_begin_query(struct fd_context *ctx, struct fd_query *q) /* add to active list: */ list_del(hq-list); list_addtail(hq-list, ctx-active_queries); + return true; } static void diff --git a/src/gallium/drivers/freedreno/freedreno_query_sw.c b/src/gallium/drivers/freedreno/freedreno_query_sw.c index 8d81698..514df14 100644 --- a/src/gallium/drivers/freedreno/freedreno_query_sw.c +++ b/src/gallium/drivers/freedreno/freedreno_query_sw.c @@ -85,7 +85,7 @@ is_rate_query(struct fd_query *q) } } -static void +static boolean fd_sw_begin_query(struct fd_context *ctx, struct fd_query *q) { struct fd_sw_query *sq = fd_sw_query(q); @@ -93,6 +93,7 @@ fd_sw_begin_query(struct fd_context *ctx, struct fd_query *q) sq-begin_value = read_counter(ctx, q-type); if (is_rate_query(q)) sq-begin_time = os_time_get(); + return true; } static void diff --git a/src/gallium/drivers/galahad/glhd_context.c b/src/gallium/drivers/galahad/glhd_context.c index 37ea170..4908df5 100644 --- a/src/gallium/drivers/galahad/glhd_context.c +++ b/src/gallium/drivers/galahad/glhd_context.c @@ -102,15 +102,15 @@ galahad_context_destroy_query(struct pipe_context *_pipe, query); } -static void +static boolean galahad_context_begin_query(struct pipe_context *_pipe, struct pipe_query *query) { struct galahad_context *glhd_pipe = galahad_context(_pipe); struct pipe_context *pipe = glhd_pipe-pipe; - pipe-begin_query(pipe, - query); + return
[Mesa-dev] [PATCH 08/15] svga: implement pipe_screen::get_driver_query_group_info
This enables GL_AMD_performance_monitor for svga. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/svga/svga_context.h | 1 + src/gallium/drivers/svga/svga_screen.c | 11 +++ 2 files changed, 12 insertions(+) diff --git a/src/gallium/drivers/svga/svga_context.h b/src/gallium/drivers/svga/svga_context.h index a75f2a8..67f1816 100644 --- a/src/gallium/drivers/svga/svga_context.h +++ b/src/gallium/drivers/svga/svga_context.h @@ -45,6 +45,7 @@ /** Non-GPU queries for gallium HUD */ +#define SVGA_QUERY_COUNT3 #define SVGA_QUERY_DRAW_CALLS (PIPE_QUERY_DRIVER_SPECIFIC + 0) #define SVGA_QUERY_FALLBACKS(PIPE_QUERY_DRIVER_SPECIFIC + 1) #define SVGA_QUERY_MEMORY_USED (PIPE_QUERY_DRIVER_SPECIFIC + 2) diff --git a/src/gallium/drivers/svga/svga_screen.c b/src/gallium/drivers/svga/svga_screen.c index 6036549..44eddce 100644 --- a/src/gallium/drivers/svga/svga_screen.c +++ b/src/gallium/drivers/svga/svga_screen.c @@ -28,6 +28,7 @@ #include util/u_inlines.h #include util/u_string.h #include util/u_math.h +#include util/u_query.h #include svga_winsys.h #include svga_public.h @@ -582,6 +583,15 @@ svga_get_driver_query_info(struct pipe_screen *screen, } +static int +svga_get_driver_query_group_info(struct pipe_screen *screen, + unsigned index, + struct pipe_driver_query_group_info *info) +{ + return util_get_driver_query_group_info(index, SVGA_QUERY_COUNT, info); +} + + static void svga_destroy_screen( struct pipe_screen *screen ) { @@ -642,6 +652,7 @@ svga_screen_create(struct svga_winsys_screen *sws) screen-fence_signalled = svga_fence_signalled; screen-fence_finish = svga_fence_finish; screen-get_driver_query_info = svga_get_driver_query_info; + screen-get_driver_query_group_info = svga_get_driver_query_group_info; svgascreen-sws = sws; svga_init_screen_resource_functions(svgascreen); -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 14/15] nvc0: make begin_query return false when all MP counters are used
Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c index 445110f..be6be69 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c @@ -56,7 +56,8 @@ struct nvc0_query { #define NVC0_QUERY_ALLOC_SPACE 256 -static void nvc0_mp_pm_query_begin(struct nvc0_context *, struct nvc0_query *); +static boolean nvc0_mp_pm_query_begin(struct nvc0_context *, + struct nvc0_query *); static void nvc0_mp_pm_query_end(struct nvc0_context *, struct nvc0_query *); static boolean nvc0_mp_pm_query_result(struct nvc0_context *, struct nvc0_query *, void *, boolean); @@ -256,6 +257,7 @@ nvc0_query_begin(struct pipe_context *pipe, struct pipe_query *pq) struct nvc0_context *nvc0 = nvc0_context(pipe); struct nouveau_pushbuf *push = nvc0-base.pushbuf; struct nvc0_query *q = nvc0_query(pq); + boolean ret = true; /* For occlusion queries we have to change the storage, because a previous * query might set the initial render conition to FALSE even *after* we re- @@ -327,12 +329,12 @@ nvc0_query_begin(struct pipe_context *pipe, struct pipe_query *pq) #endif if ((q-type = NVE4_PM_QUERY(0) q-type = NVE4_PM_QUERY_LAST) || (q-type = NVC0_PM_QUERY(0) q-type = NVC0_PM_QUERY_LAST)) { - nvc0_mp_pm_query_begin(nvc0, q); + ret = nvc0_mp_pm_query_begin(nvc0, q); } break; } q-state = NVC0_QUERY_STATE_ACTIVE; - return true; + return ret; } static void @@ -1072,7 +1074,7 @@ nvc0_mp_pm_query_get_cfg(struct nvc0_context *nvc0, struct nvc0_query *q) return nvc0_mp_pm_queries[q-type - NVC0_PM_QUERY(0)]; } -void +boolean nvc0_mp_pm_query_begin(struct nvc0_context *nvc0, struct nvc0_query *q) { struct nvc0_screen *screen = nvc0-screen; @@ -1091,7 +1093,7 @@ nvc0_mp_pm_query_begin(struct nvc0_context *nvc0, struct nvc0_query *q) if (screen-pm.num_mp_pm_active[0] + num_ab[0] 4 || screen-pm.num_mp_pm_active[1] + num_ab[1] 4) { NOUVEAU_ERR(Not enough free MP counter slots !\n); - return; + return false; } assert(cfg-num_counters = 4); @@ -1156,6 +1158,7 @@ nvc0_mp_pm_query_begin(struct nvc0_context *nvc0, struct nvc0_query *q) } } } + return true; } static void -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [v2] meta: Plug memory leak
On Monday, March 09, 2015 11:44:18 AM Ben Widawsky wrote: It looks like this has existed since commit f5a477ab76b6e0b268387699cd2253a43db0dfae Author: Ian Romanick ian.d.roman...@intel.com Date: Mon Dec 16 11:54:08 2013 -0800 meta: Refactor shader generation code out of mipmap generation path Valgrind was complaining on fbo-generatemipmap-formats v2: Instead, do the allocation after the early return block (v2) Cc: Ian Romanick ian.d.roman...@intel.com Cc: Brian Paul bri...@vmware.com Cc: Eric Anholt e...@anholt.net Cc: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Ben Widawsky b...@bwidawsk.net --- Thanks Ken. I wasn't sure if this path was common or not, and I had opted for the standard, define variables at the top, style. If it occurs more than infrequently, then I like this solution better. (FYI: Jenkins results, http://otc-gfxtest-01.jf.intel.com/job/bwidawsk/100/) --- src/mesa/drivers/common/meta.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index fdc4cf1..fa800ec 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -247,7 +247,6 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, struct blit_shader_table *table) { char *vs_source, *fs_source; - void *const mem_ctx = ralloc_context(NULL); struct blit_shader *shader = choose_blit_shader(target, table); const char *vs_input, *vs_output, *fs_input, *vs_preprocess, *fs_preprocess; @@ -273,6 +272,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, return; } + void *const mem_ctx = ralloc_context(NULL); + vs_source = ralloc_asprintf(mem_ctx, %s\n %s vec2 position;\n I'm not clear whether we can get away with C99 mixed declarations and code yet (in the past, it's been a problem for MSVC). Assuming you move the declaration back to the top, and leave the initialization here, this would get a: Reviewed-by: Kenneth Graunke kenn...@whitecape.org signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 15/15] nvc0: all queries use an unsigned 64-bits integer by default
Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 15 --- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c index be6be69..1898260 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c @@ -1418,6 +1418,14 @@ nvc0_screen_get_driver_query_info(struct pipe_screen *pscreen, if (!info) return count; + /* Init default values. */ + info-name = this_is_not_the_query_you_are_looking_for; + info-query_type = 0xdeadd01d; + info-group_id = 0; + info-max_value.u64 = 0; + info-uses_byte_units = FALSE; + info-type = PIPE_DRIVER_QUERY_TYPE_UINT64; + #ifdef NOUVEAU_ENABLE_DRIVER_STATISTICS if (id NVC0_QUERY_DRV_STAT_COUNT) { info-name = nvc0_drv_stat_names[id]; @@ -1435,7 +1443,6 @@ nvc0_screen_get_driver_query_info(struct pipe_screen *pscreen, info-group_id = NVC0_QUERY_MP_COUNTER_GROUP; info-max_value.u64 = (id NVE4_PM_QUERY_METRIC_MP_OCCUPANCY) ? ~0ULL : 100; - info-uses_byte_units = FALSE; return 1; } else if (screen-compute) { @@ -1443,16 +1450,10 @@ nvc0_screen_get_driver_query_info(struct pipe_screen *pscreen, info-query_type = NVC0_PM_QUERY(id - NVC0_QUERY_DRV_STAT_COUNT); info-group_id = NVC0_QUERY_MP_COUNTER_GROUP; info-max_value.u64 = ~0ULL; - info-uses_byte_units = FALSE; return 1; } } /* user asked for info about non-existing query */ - info-name = this_is_not_the_query_you_are_looking_for; - info-query_type = 0xdeadd01d; - info-group_id = 0; - info-max_value.u64 = 0; - info-uses_byte_units = FALSE; return 0; } -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 07/15] gallium: add util_get_driver_query_group_info
This function can be used to get a generic group of driver-specific queries when a driver doesn't expose any groups. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/auxiliary/Makefile.sources | 1 + src/gallium/auxiliary/util/u_query.c | 50 ++ src/gallium/auxiliary/util/u_query.h | 45 ++ 3 files changed, 96 insertions(+) create mode 100644 src/gallium/auxiliary/util/u_query.c create mode 100644 src/gallium/auxiliary/util/u_query.h diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index b7174d6..8af3c34 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -265,6 +265,7 @@ C_SOURCES := \ util/u_prim.h \ util/u_pstipple.c \ util/u_pstipple.h \ + util/u_query.c \ util/u_range.h \ util/u_rect.h \ util/u_resource.c \ diff --git a/src/gallium/auxiliary/util/u_query.c b/src/gallium/auxiliary/util/u_query.c new file mode 100644 index 000..bb27ca0 --- /dev/null +++ b/src/gallium/auxiliary/util/u_query.c @@ -0,0 +1,50 @@ +/** + * + * Copyright 2015 Samuel Pitoiset samuel.pitoi...@gmail.com + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * Software), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL THE AUTHORS AND/OR THEIR SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **/ + +#include util/u_memory.h +#include util/u_query.h + +/** + * This function is used to get a generic group of driver-specific queries. + */ +int +util_get_driver_query_group_info(unsigned index, unsigned num_queries, + struct pipe_driver_query_group_info *info) +{ + struct pipe_driver_query_group_info list[] = { + {Driver queries, num_queries, num_queries} + }; + + if (!info) + return ARRAY_SIZE(list); + + if (index = ARRAY_SIZE(list)) + return 0; + + *info = list[index]; + return 1; +} diff --git a/src/gallium/auxiliary/util/u_query.h b/src/gallium/auxiliary/util/u_query.h new file mode 100644 index 000..05b9be9 --- /dev/null +++ b/src/gallium/auxiliary/util/u_query.h @@ -0,0 +1,45 @@ +/** + * + * Copyright 2015 Samuel Pitoiset samuel.pitoi...@gmail.com + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * Software), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL THE AUTHORS AND/OR THEIR SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **/ + +#ifndef U_QUERY_H +#define U_QUERY_H + +#ifdef __cplusplus +extern C { +#endif + +#include pipe/p_state.h + +int +util_get_driver_query_group_info(unsigned index, unsigned num_queries, + struct pipe_driver_query_group_info *info); + +#ifdef __cplusplus +} +#endif +
[Mesa-dev] [PATCH 00/15] GL_AMD_performance_monitor
Hello, A series I have waited too long to re-submit, but I recently refactored the code and fixed some minor issues. This patchset enables GL_AMD_performance_monitor for svga, freedreno, r600, radeonsi and nvc0 drivers. This code has been tested with Nouveau (NVD9 and NVE7) but it should also work with other drivers. All piglit tests, including API and measurement tests are okay. Feel free to make a review. Christoph Bumiller (1): st/mesa: implement GL_AMD_performance_monitor Samuel Pitoiset (14): gallium: add pipe_screen::get_driver_query_group_info gallium: add new fields to pipe_driver_query_info gallium: add new numeric types to pipe_query_result gallium: replace pipe_driver_query_info::max_value by a union gallium: make pipe_context::begin_query return a boolean gallium: add util_get_driver_query_group_info svga: implement pipe_screen::get_driver_query_group_info freedreno: implement pipe_screen::get_driver_query_group_info radeon: implement pipe_screen::get_driver_query_group_info nvc0: implement pipe_screen::get_driver_query_group_info docs: mark GL_AMD_performance_monitor for the 10.6.0 release nvc0: expose more driver-specific query groups nvc0: make begin_query return false when all MP counters are used nvc0: all queries use an unsigned 64-bits integer by default docs/relnotes/10.6.0.html | 1 + src/gallium/auxiliary/Makefile.sources | 1 + src/gallium/auxiliary/hud/hud_driver_query.c | 2 +- src/gallium/auxiliary/util/u_query.c | 50 +++ src/gallium/auxiliary/util/u_query.h | 45 ++ src/gallium/docs/source/screen.rst | 10 + src/gallium/drivers/freedreno/freedreno_query.c| 25 +- src/gallium/drivers/freedreno/freedreno_query.h| 3 +- src/gallium/drivers/freedreno/freedreno_query_hw.c | 3 +- src/gallium/drivers/freedreno/freedreno_query_sw.c | 3 +- src/gallium/drivers/galahad/glhd_context.c | 6 +- src/gallium/drivers/i915/i915_query.c | 3 +- src/gallium/drivers/ilo/ilo_query.c| 3 +- src/gallium/drivers/llvmpipe/lp_query.c| 3 +- src/gallium/drivers/noop/noop_pipe.c | 3 +- src/gallium/drivers/nouveau/nv30/nv30_query.c | 5 +- src/gallium/drivers/nouveau/nv50/nv50_query.c | 3 +- src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 98 - src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 14 + src/gallium/drivers/r300/r300_query.c | 9 +- src/gallium/drivers/radeon/r600_pipe_common.c | 25 +- src/gallium/drivers/radeon/r600_pipe_common.h | 1 + src/gallium/drivers/radeon/r600_query.c| 16 +- src/gallium/drivers/rbug/rbug_context.c| 8 +- src/gallium/drivers/softpipe/sp_query.c| 3 +- src/gallium/drivers/svga/svga_context.h| 1 + src/gallium/drivers/svga/svga_pipe_query.c | 3 +- src/gallium/drivers/svga/svga_screen.c | 17 +- src/gallium/drivers/trace/tr_context.c | 6 +- src/gallium/include/pipe/p_context.h | 2 +- src/gallium/include/pipe/p_defines.h | 34 +- src/gallium/include/pipe/p_screen.h| 11 + src/mesa/Makefile.sources | 2 + src/mesa/state_tracker/st_cb_perfmon.c | 455 + src/mesa/state_tracker/st_cb_perfmon.h | 70 src/mesa/state_tracker/st_context.c| 4 + src/mesa/state_tracker/st_extensions.c | 3 + 38 files changed, 885 insertions(+), 67 deletions(-) create mode 100644 src/gallium/auxiliary/util/u_query.c create mode 100644 src/gallium/auxiliary/util/u_query.h create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 13/15] nvc0: expose more driver-specific query groups
This patch exposes Driver statistics and MP counters groups. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 61 -- src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 11 + 2 files changed, 69 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c index 919e1d6..445110f 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c @@ -24,8 +24,6 @@ #define NVC0_PUSH_EXPLICIT_SPACE_CHECKING -#include util/u_query.h - #include nvc0/nvc0_context.h #include nv_object.xml.h #include nvc0/nve4_compute.xml.h @@ -1421,6 +1419,7 @@ nvc0_screen_get_driver_query_info(struct pipe_screen *pscreen, if (id NVC0_QUERY_DRV_STAT_COUNT) { info-name = nvc0_drv_stat_names[id]; info-query_type = NVC0_QUERY_DRV_STAT(id); + info-group_id = NVC0_QUERY_DRV_STAT_GROUP; info-max_value.u64 = ~0ULL; info-uses_byte_units = !!strstr(info-name, bytes); return 1; @@ -1430,6 +1429,7 @@ nvc0_screen_get_driver_query_info(struct pipe_screen *pscreen, if (screen-base.class_3d = NVE4_3D_CLASS) { info-name = nve4_pm_query_names[id - NVC0_QUERY_DRV_STAT_COUNT]; info-query_type = NVE4_PM_QUERY(id - NVC0_QUERY_DRV_STAT_COUNT); + info-group_id = NVC0_QUERY_MP_COUNTER_GROUP; info-max_value.u64 = (id NVE4_PM_QUERY_METRIC_MP_OCCUPANCY) ? ~0ULL : 100; info-uses_byte_units = FALSE; @@ -1438,6 +1438,7 @@ nvc0_screen_get_driver_query_info(struct pipe_screen *pscreen, if (screen-compute) { info-name = nvc0_pm_query_names[id - NVC0_QUERY_DRV_STAT_COUNT]; info-query_type = NVC0_PM_QUERY(id - NVC0_QUERY_DRV_STAT_COUNT); + info-group_id = NVC0_QUERY_MP_COUNTER_GROUP; info-max_value.u64 = ~0ULL; info-uses_byte_units = FALSE; return 1; @@ -1446,6 +1447,7 @@ nvc0_screen_get_driver_query_info(struct pipe_screen *pscreen, /* user asked for info about non-existing query */ info-name = this_is_not_the_query_you_are_looking_for; info-query_type = 0xdeadd01d; + info-group_id = 0; info-max_value.u64 = 0; info-uses_byte_units = FALSE; return 0; @@ -1456,7 +1458,60 @@ nvc0_screen_get_driver_query_group_info(struct pipe_screen *pscreen, unsigned id, struct pipe_driver_query_group_info *info) { - return util_get_driver_query_group_info(id, NVC0_QUERY_DRV_STAT_COUNT, info); + struct nvc0_screen *screen = nvc0_screen(pscreen); + int count = 0; + +#ifdef NOUVEAU_ENABLE_DRIVER_STATISTICS + count++; +#endif + if (screen-base.device-drm_version = 0x01000101) { + if (screen-base.class_3d = NVE4_3D_CLASS) { + count++; + } else + if (screen-compute) { + count++; /* NVC0_COMPUTE is not always enabled */ + } + } + + if (!info) + return count; + +#ifdef NOUVEAU_ENABLE_DRIVER_STATISTICS + if (id == NVC0_QUERY_DRV_STAT_GROUP) { + info-name = Driver statistics; + info-max_active_queries = NVC0_QUERY_DRV_STAT_COUNT; + info-num_queries = NVC0_QUERY_DRV_STAT_COUNT; + return 1; + } else +#endif + if (id == NVC0_QUERY_MP_COUNTER_GROUP) { + info-name = MP counters; + + if (screen-base.class_3d = NVE4_3D_CLASS) { + info-num_queries = NVE4_PM_QUERY_COUNT; + + /* On NVE4+, each multiprocessor have 8 hardware counters separated + * in two distinct domains, but we allow only one active query + * simultaneously because some of them use more than one hardware + * counter and this will result in an undefined behaviour. */ + info-max_active_queries = 1; /* TODO: handle multiple hw counters */ + return 1; + } else + if (screen-compute) { + info-num_queries = NVC0_PM_QUERY_COUNT; + + /* On NVC0:NVE4, each multiprocessor have 8 hardware counters + * in a single domain. */ + info-max_active_queries = 8; + return 1; + } + } + + /* user asked for info about non-existing query group */ + info-name = this_is_not_the_query_group_you_are_looking_for; + info-max_active_queries = 0; + info-num_queries = 0; + return 0; } void diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h index 6bf43d9..b7c53c6 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h @@ -234,10 +234,21 @@ nvc0_screen(struct pipe_screen *screen) #define NVC0_QUERY_DRV_STAT_PUSHBUF_COUNT 27 #define NVC0_QUERY_DRV_STAT_RESOURCE_VALIDATE_COUNT 28 +/* + * Query groups: + */ +#define NVC0_QUERY_DRV_STAT_GROUP 0 +#define NVC0_QUERY_MP_COUNTER_GROUP 1 + #else #define
Re: [Mesa-dev] [PATCH 02/15] gallium: add new fields to pipe_driver_query_info
On 03/09/2015 10:36 PM, Marek Olšák wrote: On Mon, Mar 9, 2015 at 10:09 PM, Samuel Pitoiset samuel.pitoi...@gmail.com wrote: According to the spec of GL_AMD_performance_monitor, valid type values returned are UNSIGNED_INT, UNSIGNED_INT64_AMD, PERCENTAGE_AMD, FLOAT. This also introduces the new field group_id in order to categorize queries into groups. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/include/pipe/p_defines.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h index 4409789..cb42cef 100644 --- a/src/gallium/include/pipe/p_defines.h +++ b/src/gallium/include/pipe/p_defines.h @@ -751,12 +751,22 @@ union pipe_color_union unsigned int ui[4]; }; +enum pipe_driver_query_type +{ + PIPE_DRIVER_QUERY_TYPE_UINT64 = 0, + PIPE_DRIVER_QUERY_TYPE_UINT = 1, + PIPE_DRIVER_QUERY_TYPE_FLOAT = 2, + PIPE_DRIVER_QUERY_TYPE_PERCENTAGE = 3, What's the type of percentage? UINT64? FLOAT? Numeric types are described in the following patch, but UINT64 is uint64_t, UINT is uint32_t, FLOAT and PERCENTAGE are float. +}; + struct pipe_driver_query_info { const char *name; unsigned query_type; /* PIPE_QUERY_DRIVER_SPECIFIC + i */ uint64_t max_value; /* max value that can be returned */ boolean uses_byte_units; /* whether the result is in bytes */ + enum pipe_driver_query_type type; Could you please remove uses_byte_units and add PIPE_DRIVER_QUERY_TYPE_BYTES, which should return uint64_t? Yeah, good idea! I'll make this change and submit a v2 in few days. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/15] gallium: add util_get_driver_query_group_info
If you plan to add more functions, this file can stay. Marek On Mon, Mar 9, 2015 at 10:54 PM, Samuel Pitoiset samuel.pitoi...@gmail.com wrote: On 03/09/2015 10:43 PM, Marek Olšák wrote: It would be better to add this function to u_helpers.c/.h instead of adding new files. Mmh, I'll probably introduce other functions related to queries when nouveau-perfkit will be ready. Are you sure it's a good idea to drop this file? Marek On Mon, Mar 9, 2015 at 10:09 PM, Samuel Pitoiset samuel.pitoi...@gmail.com wrote: This function can be used to get a generic group of driver-specific queries when a driver doesn't expose any groups. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/auxiliary/Makefile.sources | 1 + src/gallium/auxiliary/util/u_query.c | 50 ++ src/gallium/auxiliary/util/u_query.h | 45 ++ 3 files changed, 96 insertions(+) create mode 100644 src/gallium/auxiliary/util/u_query.c create mode 100644 src/gallium/auxiliary/util/u_query.h diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index b7174d6..8af3c34 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -265,6 +265,7 @@ C_SOURCES := \ util/u_prim.h \ util/u_pstipple.c \ util/u_pstipple.h \ + util/u_query.c \ util/u_range.h \ util/u_rect.h \ util/u_resource.c \ diff --git a/src/gallium/auxiliary/util/u_query.c b/src/gallium/auxiliary/util/u_query.c new file mode 100644 index 000..bb27ca0 --- /dev/null +++ b/src/gallium/auxiliary/util/u_query.c @@ -0,0 +1,50 @@ +/** + * + * Copyright 2015 Samuel Pitoiset samuel.pitoi...@gmail.com + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * Software), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL THE AUTHORS AND/OR THEIR SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **/ + +#include util/u_memory.h +#include util/u_query.h + +/** + * This function is used to get a generic group of driver-specific queries. + */ +int +util_get_driver_query_group_info(unsigned index, unsigned num_queries, + struct pipe_driver_query_group_info *info) +{ + struct pipe_driver_query_group_info list[] = { + {Driver queries, num_queries, num_queries} + }; + + if (!info) + return ARRAY_SIZE(list); + + if (index = ARRAY_SIZE(list)) + return 0; + + *info = list[index]; + return 1; +} diff --git a/src/gallium/auxiliary/util/u_query.h b/src/gallium/auxiliary/util/u_query.h new file mode 100644 index 000..05b9be9 --- /dev/null +++ b/src/gallium/auxiliary/util/u_query.h @@ -0,0 +1,45 @@ +/** + * + * Copyright 2015 Samuel Pitoiset samuel.pitoi...@gmail.com + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * Software), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
Re: [Mesa-dev] [PATCH 4/7] main: Cosmetic changes for Texture Buffers.
Can you go and manually mark this commit and the Add entry point for TextureBufferRange as accepted in Patchwork? I don't have admin access, and my refactor of the new line caused a rebase. Thanks. Laura On Mon, Mar 9, 2015 at 1:13 PM, Laura Ekstrand la...@jlekstrand.net wrote: Oh, thanks! I didn't see the new line there when I read your review. I will remove it. On Mon, Mar 9, 2015 at 10:45 AM, Anuj Phogat anuj.pho...@gmail.com wrote: On Mon, Mar 9, 2015 at 9:43 AM, Laura Ekstrand la...@jlekstrand.net wrote: I'm confused which hunk you talking about. Can you be more specific? On Mon, Mar 9, 2015 at 8:47 AM, Anuj Phogat anuj.pho...@gmail.com wrote: On Wed, Mar 4, 2015 at 4:41 PM, Laura Ekstrand la...@jlekstrand.net wrote: Adds a useful comment and some whitespace. Fixes an error message. v2: Review from Anuj Phogat - Split rebase of Tex[ture]Buffer[Range] --- src/mesa/main/teximage.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 706c76b..22574bd 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -5354,6 +5354,14 @@ _mesa_TexBufferRange(GLenum target, GLenum internalFormat, GLuint buffer, buffer); return; } else { + + /* OpenGL 4.5 core spec (02.02.2015) says in Section 8.9 Buffer + * Textures (PDF page 254): + *If buffer is zero, then any buffer object attached to the buffer + *texture is detached, the values offset and size are ignored and + *the state for offset and size for the buffer texture are reset to + *zero. + */ offset = 0; size = 0; } @@ -5382,8 +5390,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj = NULL; /* Get the texture object by Name. */ - texObj = _mesa_lookup_texture_err(ctx, texture, - glTextureBuffer(texture)); + texObj = _mesa_lookup_texture_err(ctx, texture, glTextureBuffer); if (!texObj) return; @@ -5396,6 +5403,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj, 0, buffer ? -1 : 0, glTextureBuffer); } + I meant this extra new line here. It's a nitpick. Up to you if you want to keep it. static GLboolean is_renderable_texture_format(struct gl_context *ctx, GLenum internalformat) { This hunk is unnecessary. -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] INTEL_DEBUG=shader_time scalar backend fixes
On Sun, Mar 8, 2015 at 1:08 AM, Kenneth Graunke kenn...@whitecape.org wrote: Welcome to the rabbit trail. In order to fix Football Manager, I had to rework INTEL_DEBUG=shader_time in the FS backend. While doing that, I hit two assertion failures. After fixing that, I compared numbers. I noticed that VS again accounted for 0 cycles. Matt - could you take a look at brw_vec4_dead_code_elimination.cpp? Since 5df88c2096281f (the rewrite to use live intervals), it's again completely eating the atomic buffer offset initialization, resulting in us using garbage data in the first register (of an mlen 2 message). (Sounds like the same bug I fixed in d0575d98fc595dcc17706dc73d1eb4.) With these patches, and the new vec4 DCE pass reverted, 10.3 vs my branch produce basically the same numbers on an openarena timedemo. Patches 1-3 are Reviewed-by: Matt Turner matts...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] main: Cosmetic changes for Texture Buffers.
Oh, thanks! I didn't see the new line there when I read your review. I will remove it. On Mon, Mar 9, 2015 at 10:45 AM, Anuj Phogat anuj.pho...@gmail.com wrote: On Mon, Mar 9, 2015 at 9:43 AM, Laura Ekstrand la...@jlekstrand.net wrote: I'm confused which hunk you talking about. Can you be more specific? On Mon, Mar 9, 2015 at 8:47 AM, Anuj Phogat anuj.pho...@gmail.com wrote: On Wed, Mar 4, 2015 at 4:41 PM, Laura Ekstrand la...@jlekstrand.net wrote: Adds a useful comment and some whitespace. Fixes an error message. v2: Review from Anuj Phogat - Split rebase of Tex[ture]Buffer[Range] --- src/mesa/main/teximage.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 706c76b..22574bd 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -5354,6 +5354,14 @@ _mesa_TexBufferRange(GLenum target, GLenum internalFormat, GLuint buffer, buffer); return; } else { + + /* OpenGL 4.5 core spec (02.02.2015) says in Section 8.9 Buffer + * Textures (PDF page 254): + *If buffer is zero, then any buffer object attached to the buffer + *texture is detached, the values offset and size are ignored and + *the state for offset and size for the buffer texture are reset to + *zero. + */ offset = 0; size = 0; } @@ -5382,8 +5390,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj = NULL; /* Get the texture object by Name. */ - texObj = _mesa_lookup_texture_err(ctx, texture, - glTextureBuffer(texture)); + texObj = _mesa_lookup_texture_err(ctx, texture, glTextureBuffer); if (!texObj) return; @@ -5396,6 +5403,7 @@ _mesa_TextureBuffer(GLuint texture, GLenum internalFormat, GLuint buffer) bufObj, 0, buffer ? -1 : 0, glTextureBuffer); } + I meant this extra new line here. It's a nitpick. Up to you if you want to keep it. static GLboolean is_renderable_texture_format(struct gl_context *ctx, GLenum internalformat) { This hunk is unnecessary. -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r300g: Fix the ATI1N swizzle.
I'm going to push this shortly. Thanks. Marek On Mon, Mar 9, 2015 at 4:15 PM, Stefan Dösinger stefandoesin...@gmx.at wrote: This fixes the GL_COMPRESSED_RED_RGTC1 part of piglit's rgtc-teximage-01 test as well as the precision part of Wine's 3dc format test (fd.o bug 89156). The Z component seems to contain a lower precision version of the result, probably a temporary value from the decompression computation. The Y and W component contain different data that depends on the input values as well, but I could not make sense of them (Not that I tried very hard). GL_COMPRESSED_SIGNED_RED_RGTC1 still seems to have precision problems in piglit, and both formats are affected by a compiler bug if they're sampled by the shader with a swizzle other than .xyzw. Wine uses ., which returns random garbage. --- src/gallium/drivers/r300/r300_texture.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/r300/r300_texture.c b/src/gallium/drivers/r300/r300_texture.c index ffe8c00..340b8fc 100644 --- a/src/gallium/drivers/r300/r300_texture.c +++ b/src/gallium/drivers/r300/r300_texture.c @@ -176,7 +176,9 @@ uint32_t r300_translate_texformat(enum pipe_format format, format != PIPE_FORMAT_RGTC2_UNORM format != PIPE_FORMAT_RGTC2_SNORM format != PIPE_FORMAT_LATC2_UNORM -format != PIPE_FORMAT_LATC2_SNORM) { +format != PIPE_FORMAT_LATC2_SNORM +format != PIPE_FORMAT_RGTC1_UNORM +format != PIPE_FORMAT_LATC1_UNORM) { result |= r300_get_swizzle_combined(desc-swizzle, swizzle_view, TRUE); } else { -- 2.0.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] i965/fs: Make get_timestamp() pass back the MOV rather than emitting it.
On Sun, Mar 8, 2015 at 1:08 AM, Kenneth Graunke kenn...@whitecape.org wrote: This makes another part of the INTEL_DEBUG=shader_time code emittable at arbitrary locations, rather than just at the end of the instruction stream. v2: Don't lose smear! Caught by Topi Pohjolainen. v3: Don't set smear on the destination of the MOV. Thanks Topi! Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_fs.cpp | 19 +++ src/mesa/drivers/dri/i965/brw_fs.h | 2 +- 2 files changed, 16 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 682841b..8f11600 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -677,8 +677,14 @@ fs_visitor::type_size(const struct glsl_type *type) return 0; } +/** + * Create a MOV to read the timestamp register. + * + * The caller is responsible for emitting the MOV. The return value is + * the destination of the MOV, with extra parameters set. + */ fs_reg -fs_visitor::get_timestamp() +fs_visitor::get_timestamp(fs_inst **out_mov) { assert(brw-gen = 7); @@ -689,7 +695,7 @@ fs_visitor::get_timestamp() fs_reg dst = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 4); - fs_inst *mov = emit(MOV(dst, ts)); + fs_inst *mov = MOV(dst, ts); /* We want to read the 3 fields we care about even if it's not enabled in * the dispatch. */ @@ -707,6 +713,7 @@ fs_visitor::get_timestamp() */ dst.set_smear(0); + *out_mov = mov; return dst; } @@ -714,7 +721,9 @@ void fs_visitor::emit_shader_time_begin() { current_annotation = shader time start; - shader_start_time = get_timestamp(); + fs_inst *mov; + shader_start_time = get_timestamp(mov); + emit(mov); What do you think about returning the mov instruction from get_timestamp, and then just doing shader_start_time = mov-dst? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [v2] i965/skl: Disable partial resolve in VC
On Thu, Feb 26, 2015 at 6:14 PM, Ben Widawsky benjamin.widaw...@intel.com wrote: Recomendation [sic] is to set this field to 1 always. Programming it to default value of 0, may have -ve impact on performance for MSAA WLs. Another don't suck bit which needs to get set. Totally untested. v2: v1 was a mix of two patches. Since 0x7004 is masked, we only need to set it once at initialization and make sure the pma workaround doesn't set the mask bit (which it doesn't). Move LRI to init gpu state (Ken) Add a comment. ... still untested. Cc: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/brw_state_upload.c | 10 ++ src/mesa/drivers/dri/i965/intel_reg.h| 1 + 2 files changed, 11 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c index 1b84859..c90a34f 100644 --- a/src/mesa/drivers/dri/i965/brw_state_upload.c +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c @@ -337,6 +337,16 @@ brw_upload_initial_gpu_state(struct brw_context *brw) brw_upload_invariant_state(brw); + /* Recommended optimization for Victim Cache eviction in pixel backend. */ + if (brw-gen = 9) { + BEGIN_BATCH(3); + OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2)); + OUT_BATCH(GEN7_CACHE_MODE_1); + OUT_BATCH((GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC 16) | +GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC); + ADVANCE_BATCH(); + } + if (brw-gen = 8) { gen8_emit_3dstate_sample_pattern(brw); } diff --git a/src/mesa/drivers/dri/i965/intel_reg.h b/src/mesa/drivers/dri/i965/intel_reg.h index af1c1df..a4bcf3d 100644 --- a/src/mesa/drivers/dri/i965/intel_reg.h +++ b/src/mesa/drivers/dri/i965/intel_reg.h @@ -144,5 +144,6 @@ #define GEN7_CACHE_MODE_1 0x7004 # define GEN8_HIZ_NP_PMA_FIX_ENABLE(1 11) # define GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE (1 13) +# define GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC (1 1) # define GEN8_HIZ_PMA_MASK_BITS \ ((GEN8_HIZ_NP_PMA_FIX_ENABLE | GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE) 16) -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev Reviewed-by: Anuj Phogat anuj.pho...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles
https://bugs.freedesktop.org/show_bug.cgi?id=77449 Bug 77449 depends on bug 86747, which changed state. Bug 86747 Summary: Noise in Football Manager 2014 textures https://bugs.freedesktop.org/show_bug.cgi?id=86747 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/4] mesa: Separate PBO validation checks from buffer mapping, to allow reuse
On Thu, Mar 5, 2015 at 12:20 AM, Eduardo Lima Mitev el...@igalia.com wrote: Internal PBO functions such as _mesa_map_validate_pbo_source() and _mesa_validate_pbo_compressed_teximage() perform validation and buffer mapping within the same call. This patch takes out the validation into separate functions to allow reuse of functionality by other code (i.e, gl(Compressed)Tex(Sub)Image). --- src/mesa/main/pbo.c | 117 ++-- src/mesa/main/pbo.h | 14 +++ 2 files changed, 100 insertions(+), 31 deletions(-) diff --git a/src/mesa/main/pbo.c b/src/mesa/main/pbo.c index 259f763..5ae248c 100644 --- a/src/mesa/main/pbo.c +++ b/src/mesa/main/pbo.c @@ -164,23 +164,18 @@ _mesa_map_pbo_source(struct gl_context *ctx, return buf; } - /** - * Combine PBO-read validation and mapping. + * Perform PBO validation for read operations with uncompressed textures. * If any GL errors are detected, they'll be recorded and NULL returned. * \sa _mesa_validate_pbo_access - * \sa _mesa_map_pbo_source - * A call to this function should have a matching call to - * _mesa_unmap_pbo_source(). */ -const GLvoid * -_mesa_map_validate_pbo_source(struct gl_context *ctx, - GLuint dimensions, - const struct gl_pixelstore_attrib *unpack, - GLsizei width, GLsizei height, GLsizei depth, - GLenum format, GLenum type, - GLsizei clientMemSize, - const GLvoid *ptr, const char *where) +bool +_mesa_validate_pbo_source(struct gl_context *ctx, GLuint dimensions, + const struct gl_pixelstore_attrib *unpack, + GLsizei width, GLsizei height, GLsizei depth, + GLenum format, GLenum type, + GLsizei clientMemSize, + const GLvoid *ptr, const char *where) { assert(dimensions == 1 || dimensions == 2 || dimensions == 3); @@ -188,26 +183,87 @@ _mesa_map_validate_pbo_source(struct gl_context *ctx, format, type, clientMemSize, ptr)) { if (_mesa_is_bufferobj(unpack-BufferObj)) { _mesa_error(ctx, GL_INVALID_OPERATION, - %s(out of bounds PBO access), where); Changing the error to %s%uD, where, dimensions is not ideal because the other function that calls _mesa_map_validate_pbo_source is _mesa_PolygonStipple, and printing glPolygonStipple2D doesn't make sense because the name of the function is glPolygonStipple. + %s%uD(out of bounds PBO access), + where, dimensions); } else { _mesa_error(ctx, GL_INVALID_OPERATION, Why did you remove bufSize (%d) is too small? If we are going to remove that, maybe we should remove the if, then, else block because the error messages are pretty much the same. I recommend keeping the original error messages for both and moving the %sD up to the calling functions (for example, texture_error_check). - %s(out of bounds access: bufSize (%d) is too small), - where, clientMemSize); + %s%uD(out of bounds access), + where, dimensions); } - return NULL; + return false; } if (!_mesa_is_bufferobj(unpack-BufferObj)) { /* non-PBO access: no further validation to be done */ - return ptr; + return true; } if (_mesa_check_disallowed_mapping(unpack-BufferObj)) { /* buffer is already mapped - that's an error */ - _mesa_error(ctx, GL_INVALID_OPERATION, %s(PBO is mapped), where); + _mesa_error(ctx, GL_INVALID_OPERATION, %s%uD(PBO is mapped), + where, dimensions); + return false; + } + + return true; +} + +/** + * Perform PBO validation for read operations with compressed textures. + * If any GL errors are detected, they'll be recorded and NULL returned. + */ +bool +_mesa_validate_pbo_source_compressed(struct gl_context *ctx, GLuint dimensions, + const struct gl_pixelstore_attrib *unpack, + GLsizei imageSize, const GLvoid *pixels, + const char *where) +{ + if (!_mesa_is_bufferobj(unpack-BufferObj)) { + /* not using a PBO */ + return true; + } + + if ((const GLubyte *) pixels + imageSize + ((const GLubyte *) 0) + unpack-BufferObj-Size) { + /* out of bounds read! */ + _mesa_error(ctx, GL_INVALID_OPERATION, %s%uD(invalid PBO access), + where, dimensions); This should be return false. return NULL; } + if (_mesa_check_disallowed_mapping(unpack-BufferObj)) { + /* buffer is already mapped - that's an error */ +
Re: [Mesa-dev] [RFC] i965: Factor out descriptor building for indirect send messages
Pohjolainen, Topi topi.pohjolai...@intel.com writes: On Sat, Mar 07, 2015 at 04:15:08PM +0200, Francisco Jerez wrote: Topi Pohjolainen topi.pohjolai...@intel.com writes: The original patch from Curro was based on something that is not present in the master yet. This patch tries to mimick the logic on top master. I wanted to see if could separate the building of the descriptor instruction from building of the send instruction. This logic now allows the caller to construct any kind of sequence of instructions filling in the descriptor before giving it to the send instruction builder. This is only compile tested. Curro, how would you feel about this sort of approach? I apologise for muddying the waters again but I wasn't entirely comfortable with the logic and wanted to try to something else. I believe patch number five should go nicely on top of this as the descriptor instruction could be followed by (or preceeeded by) any additional instructions modifying the descriptor register before the actual send instruction. Topi, the goal I had in mind with PATCH 01 was to refactor a commonly recurring pattern. In terms of the functions defined in this patch my example from yesterday [1] would now look like: | if (index.file == BRW_IMMEDIATE_VALUE) { | | uint32_t surf_index = index.dw1.ud; | | brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND); | brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW)); | brw_set_src0(p, send, offset); | brw_set_sampler_message(p, send, | surf_index, | 0, /* LD message ignores sampler unit */ | GEN5_SAMPLER_MESSAGE_SAMPLE_LD, | rlen, | mlen, | false, /* no header */ | simd_mode, | 0); | | brw_mark_surface_used(prog_data, surf_index); | | } else { | | struct brw_reg addr = vec1(retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD)); | | brw_push_insn_state(p); | brw_set_default_mask_control(p, BRW_MASK_DISABLE); | brw_set_default_access_mode(p, BRW_ALIGN_1); | | /* a0.0 = surf_index 0xff */ | brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND); | brw_inst_set_exec_size(p-brw, insn_and, BRW_EXECUTE_1); | brw_set_dest(p, insn_and, addr); | brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD))); | brw_set_src1(p, insn_and, brw_imm_ud(0x0ff)); | | | /* a0.0 |= descriptor */ | brw_inst *descr_inst = brw_build_indirect_message_descr(p, addr, addr); | brw_set_sampler_message(p, descr_inst, | 0 /* surface */, | 0 /* sampler */, | GEN5_SAMPLER_MESSAGE_SAMPLE_LD, | rlen /* rlen */, | mlen /* mlen */, | false /* header */, | simd_mode, | 0); | | /* dst = send(offset, a0.0) */ | brw_send_indirect_message(p, BRW_SFID_SAMPLER, dst, offset, addr); | | brw_pop_insn_state(p); | | /* visitor knows more than we do about the surface limit required, | * so has already done marking. | */ | } Which I think could also be written as follows. Or am I missing something again? static brw_inst * brw_build_surface_index_descr(struct brw_compile *p, struct brw_reg dst, index) { brw_set_default_mask_control(p, BRW_MASK_DISABLE); brw_set_default_access_mode(p, BRW_ALIGN_1); /* a0.0 = surf_index 0xff */ brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND); brw_inst_set_exec_size(p-brw, insn_and, BRW_EXECUTE_1); brw_set_dest(p, insn_and, addr); brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD))); brw_set_src1(p, insn_and, brw_imm_ud(0x0ff)); /* a0.0 |= descriptor */ brw_inst *descr_inst = brw_build_indirect_message_descr(p, addr, addr); } ... brw_inst *descr_inst; if (index.file == BRW_IMMEDIATE_VALUE) { descr = brw_next_insn(p, BRW_OPCODE_SEND); brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW)); brw_set_src0(p, send, offset); brw_mark_surface_used(prog_data, surf_index); } else { struct brw_reg addr = vec1(retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD)); brw_push_insn_state(p); brw_build_surface_index_descr(p, addr, index); /* dst = send(offset, a0.0) */ descr_inst = brw_send_indirect_message(p, BRW_SFID_SAMPLER, dst, offset, addr); brw_pop_insn_state(p); } uint32_t surf_index = index.file == BRW_IMMEDIATE_VALUE ? index.dw1.ud : 0;
Re: [Mesa-dev] [PATCH] i965: Issue perf_debug messages for unsynchronized maps on !LLC systems.
On Tue, Feb 24, 2015 at 09:34:30PM -0800, Kenneth Graunke wrote: We haven't implemented proper unsynchronized map support on !LLC systems (pre-SNB, Atom). MapBufferRange with GL_MAP_UNSYNCHRONIZE_BIT will actually do a synchronized map, probably killing performance. Also warn on BufferSubData, when we should be doing an unsynchronized upload, but instead have to do a synchronous map. v2: Only complain if the buffer is actually busy - we use unsynchronized maps internally for vertex upload and such, but expect those to not be busy. Signed-off-by: Kenneth Graunke kenn...@whitecape.org Sorry I let this slip. I actually thought it was merged, and screwed up the patchwork state. This is both: Tested-by: Ben Widawsky b...@bwidawsk.net Reviewed-by: Ben Widawsky b...@bwidawsk.net My only nit is that I do already have the fix for this. Not sure if we want to pursue that instead. I'll defer to you. -- Ben Widawsky, Intel Open Source Technology Center ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/6] i965: Extract blit height max
The blit engine in GEN hardware has constraints. These constraints are a function of tile parameters as well as height. The current code is very dumb in terms of determine max blit parameters. Since we'll be expanding on it, having the abstraction makes things easier. Note that this doesn't change all places which check blitter requirements. I will be adding those as a separate patch(es) since the original series, which was well tested, did not include those. This was requested by Jordan and Jason. Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/intel_blit.c | 12 +--- src/mesa/drivers/dri/i965/intel_blit.h | 10 ++ 2 files changed, 15 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c index b93794b..c7f4cf3 100644 --- a/src/mesa/drivers/dri/i965/intel_blit.c +++ b/src/mesa/drivers/dri/i965/intel_blit.c @@ -240,14 +240,12 @@ intel_miptree_blit(struct brw_context *brw, dst_x += dst_image_x; dst_y += dst_image_y; - /* The blitter interprets the 16-bit destination x/y as a signed 16-bit -* value. The values we're working with are unsigned, so make sure we don't -* overflow. -*/ - if (src_x = INTEL_MAX_BLIT_PITCH || src_y = INTEL_MAX_BLIT_ROWS || - dst_x = INTEL_MAX_BLIT_PITCH || dst_y = INTEL_MAX_BLIT_ROWS) { + if (src_x = INTEL_MAX_BLIT_PITCH || dst_x = INTEL_MAX_BLIT_PITCH || + src_y = intel_blit_max_height() || + dst_y = intel_blit_max_height()) { perf_debug(Falling back due to =%dk offset [src(%d, %d) dst(%d, %d)]\n, - src_x, src_y, dst_x, dst_y, INTEL_MAX_BLIT_PITCH 20); + src_x, src_y, dst_x, dst_y, + MAX2(intel_blit_max_height(), INTEL_MAX_BLIT_PITCH) 20); return false; } diff --git a/src/mesa/drivers/dri/i965/intel_blit.h b/src/mesa/drivers/dri/i965/intel_blit.h index 531d329..52dd67c 100644 --- a/src/mesa/drivers/dri/i965/intel_blit.h +++ b/src/mesa/drivers/dri/i965/intel_blit.h @@ -78,4 +78,14 @@ void intel_emit_linear_blit(struct brw_context *brw, unsigned int src_offset, unsigned int size); +static inline uint32_t +intel_blit_max_height(void) +{ + /* The docs say that the blitter is capable of transferring 65536 scanlines +* per blit, however the commands we use only have a signed 16b value thus +* making the practical limit 15b. +*/ + return INTEL_MAX_BLIT_ROWS; +} + #endif -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/6] i965: Allow Y-tiled allocations for large surfaces
This patch will use a new calculation to determine if a surface can be blitted from or to. Previously, the total_height member was used. Total_height in the case of 2d, 3d, and cube map arrays is the height of each slice/layer/face. Since the GL map APIS only ever deal with a slice at a time however, the determining factor is really the height of one slice. This patch also has a side effect of not needing to set potentially large texture objects to the CPU domain, which implies we do not need to clflush the entire objects. (See references below for a kernel patch to achieve the same thing) With both the Y-tiled surfaces, and the removal of excessive clflushes, this improves the terrain benchmark on Cherryview (data collected by Jordan) Jordan was extremely helpful in creating this patch. Consider him co-author. v2: Remove assertion which didn't belong. This assert was only meant for the patch which renamed gtt mappings to really mean GTT mappings. This should fix around 150 piglit failures Some reworks to centralize blitability determination (Jason, Jordan) v3: Rebased with some non-trivial conflicts on Anuj's blitter fixes. References: http://patchwork.freedesktop.org/patch/38909/ Cc: Jordan Justen jordan.l.jus...@intel.com Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 55 +++ src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 21 ++ 2 files changed, 52 insertions(+), 24 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 16bd151..ee8fae4 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -86,7 +86,6 @@ compute_msaa_layout(struct brw_context *brw, mesa_format format, GLenum target) } } - /** * For single-sampled render targets (non-MSRT), the MCS buffer is a * scaled-down bitfield representation of the color buffer which is capable of @@ -437,6 +436,12 @@ intel_miptree_create_layout(struct brw_context *brw, return mt; } +static bool +miptree_exceeds_blit_height(struct intel_mipmap_tree *mt) +{ + return intel_miptree_blit_height(mt) = intel_blit_tile_height(mt-tiling); +} + /** * \brief Helper function for intel_miptree_create(). */ @@ -502,10 +507,22 @@ intel_miptree_choose_tiling(struct brw_context *brw, return I915_TILING_NONE; if (ALIGN(minimum_pitch, 512) = INTEL_MAX_BLIT_PITCH || - mt-total_width = INTEL_MAX_BLIT_PITCH || - mt-total_height = INTEL_MAX_BLIT_ROWS) { - perf_debug(%dx%d miptree too large to blit, falling back to untiled, - mt-total_width, mt-total_height); + miptree_exceeds_blit_height(mt)) { + if (mt-format == GL_TEXTURE_3D) { + perf_debug(Unsupported large 3D texture blit. +Falling back to untiled.\n); + } else { + /* qpitch should always be greater than or equal to the tile aligned + * maximum of lod0 height. That is sufficient to determine if we can + * blit, but the most optimal value is potentially less. + */ + if (mt-physical_height0 INTEL_MAX_BLIT_ROWS) { +perf_debug(Potentially skipped a blittable buffers %d\n, + mt-physical_height0); + } + perf_debug(%dx%d miptree too large to blit, falling back to untiled, +mt-total_width, mt-total_height); + } return I915_TILING_NONE; } @@ -647,12 +664,18 @@ intel_miptree_create(struct brw_context *brw, BO_ALLOC_FOR_RENDER : 0)); mt-pitch = pitch; + uint32_t size = + mt-tiling ? ALIGN(intel_miptree_blit_height(mt) * mt-pitch, + intel_blit_tile_height(mt-tiling)) : + mt-bo-size; + assert(size = mt-bo-size); + /* If the BO is too large to fit in the aperture, we need to use the * BLT engine to support it. Prior to Sandybridge, the BLT paths can't * handle Y-tiling, so we need to fall back to X. */ if (brw-gen 6 - mt-bo-size = brw-max_gtt_map_object_size + size = brw-max_gtt_map_object_size mt-tiling == I915_TILING_Y requested_tiling == INTEL_MIPTREE_TILING_ANY) { perf_debug(%dx%d miptree larger than aperture; falling back to X-tiled\n, @@ -2290,23 +2313,7 @@ intel_miptree_release_map(struct intel_mipmap_tree *mt, *map = NULL; } -static bool -can_blit_slice(struct intel_mipmap_tree *mt, - unsigned int level, unsigned int slice) -{ - uint32_t image_x; - uint32_t image_y; - intel_miptree_get_image_offset(mt, level, slice, image_x, image_y); - if (image_x = INTEL_MAX_BLIT_PITCH || image_y = INTEL_MAX_BLIT_ROWS) - return false; - /* See intel_miptree_blit() for details on the 32k pitch limit. */ - if (mt-pitch = INTEL_MAX_BLIT_PITCH) - return false; - - return true; -} - static bool
[Mesa-dev] [PATCH 1/6] i965: Kill y_or_x variable in miptree tiling selection
IMHO, intel_miptree_choose_tiling() is an unfortunate incarnation because it conflates what is permitted vs. what is desirable. This makes doing any sort of fallback operations after the fact somewhat kludgey. The original code basically says: if we requested x XOR y-tiled, and the region aperture; then try to x-tiled Better would be: if we *received* x OR y-tiled, and the region aperture; then try to x-tiled Optimally it is: if we can use either x OR y-tiled and got y-tiled, and the region aperture; then try x tiled This patch actually addresses two potential issues: 1. As far as I can tell, drm_intel_gem_bo_alloc_tiled() which is invariably called, can potentially change the tiling type. Therefore, we shouldn't be checking the requested tiling type, but rather the granted tiling 2. The existing code can fall back to X-tiled even if choose_tiling said X-tiling was not okay. Neither of these are probably actually an issue, but this simply makes the code correct. The changes behavior originally introduced here: commit cbe24ff7c8d69a6d378d9c2cce2bc117517f4302 Author: Kenneth Graunke kenn...@whitecape.org Date: Wed Apr 10 13:49:16 2013 -0700 intel: Fall back to X-tiling when larger than estimated aperture size. v2: This was rebased on a recent commit than Anuj pushed since I originally authored the patch. commit 524a729f68c15da3fc6c42b3158a13e0b84c2728 Author: Anuj Phogat anuj.pho...@gmail.com Date: Tue Feb 17 10:40:58 2015 -0800 i965: Fix condition to use Y tiling in blitter in intel_miptree_create() Cc: Kenneth Graunke kenn...@whitecape.org Cc: Chad Versace chad.vers...@linux.intel.com Cc: Anuj Phogat anuj.pho...@gmail.com Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 36c3b26..cc422d3 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -630,10 +630,8 @@ intel_miptree_create(struct brw_context *brw, uint32_t tiling = intel_miptree_choose_tiling(brw, format, width0, num_samples, requested_tiling, mt); - bool y_or_x = false; if (tiling == (I915_TILING_Y | I915_TILING_X)) { - y_or_x = true; mt-tiling = I915_TILING_Y; } else { mt-tiling = tiling; @@ -652,7 +650,10 @@ intel_miptree_create(struct brw_context *brw, * BLT engine to support it. Prior to Sandybridge, the BLT paths can't * handle Y-tiling, so we need to fall back to X. */ - if (brw-gen 6 y_or_x mt-bo-size = brw-max_gtt_map_object_size) { + if (brw-gen 6 + mt-bo-size = brw-max_gtt_map_object_size + mt-tiling == I915_TILING_Y + requested_tiling == INTEL_MIPTREE_TILING_ANY) { perf_debug(%dx%d miptree larger than aperture; falling back to X-tiled\n, mt-total_width, mt-total_height); -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/6] i965: Fix comments about blit constraints
The spec does say that the blitter is capable of transferring 64k scanlines in a single blit operation. Perhaps this was true, or is still true on some operations, but for all commands that we use, we are restricted to 16b signed: For example, from the XY_SRC_COPY_CHROMA_BLT definition: Destination Y2 Coordinate (Bottom) 16 bit signed number. Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/intel_blit.c | 8 ++-- src/mesa/drivers/dri/i965/intel_copy_image.c | 8 ++-- 2 files changed, 12 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c index 9500bd7..ee2a4ef 100644 --- a/src/mesa/drivers/dri/i965/intel_blit.c +++ b/src/mesa/drivers/dri/i965/intel_blit.c @@ -190,11 +190,15 @@ intel_miptree_blit(struct brw_context *brw, *The BLT engine is capable of transferring very large quantities of *graphics data. Any graphics data read from and written to the *destination is permitted to represent a number of pixels that -*occupies up to 65,536 scan lines and up to 32,768 bytes per scan line -*at the destination. The maximum number of pixels that may be +*occupies up to 65,536 [sic] scan lines and up to 32,768 bytes per scan +*line at the destination. The maximum number of pixels that may be *represented per scan line’s worth of graphics data depends on the *color depth. * +* XXX: The spec is likely incorrect. The number of scanlines is represented +* in the blit command as a 16b signed number, thus 32,767 as the max number +* of scanlines. +* * Furthermore, intelEmitCopyBlit (which is called below) uses a signed * 16-bit integer to represent buffer pitch, so it can only handle buffer * pitches 32k. diff --git a/src/mesa/drivers/dri/i965/intel_copy_image.c b/src/mesa/drivers/dri/i965/intel_copy_image.c index f4c7eff..bf6b5e7 100644 --- a/src/mesa/drivers/dri/i965/intel_copy_image.c +++ b/src/mesa/drivers/dri/i965/intel_copy_image.c @@ -53,11 +53,15 @@ copy_image_with_blitter(struct brw_context *brw, *The BLT engine is capable of transferring very large quantities of *graphics data. Any graphics data read from and written to the *destination is permitted to represent a number of pixels that -*occupies up to 65,536 scan lines and up to 32,768 bytes per scan line -*at the destination. The maximum number of pixels that may be +*occupies up to 65,536 [sic] scan lines and up to 32,768 bytes per scan +*line at the destination. The maximum number of pixels that may be *represented per scan line’s worth of graphics data depends on the *color depth. * +* XXX: The spec is likely incorrect. The number of scanlines is represented +* in the blit command as a 16b signed number, thus 32,767 as the max number +* of scanlines. +* * Furthermore, intelEmitCopyBlit (which is called below) uses a signed * 16-bit integer to represent buffer pitch, so it can only handle buffer * pitches 32k. -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/6] i965: Attempt to blit for larger textures
The blit engine is limited to 32Kx32K transfer. In cases where we have to fall back to the blitter, and when trying to blit a slice of a 2d texture array, or face of a cube map, we don't need to transfer the entire texture. I doubt this patch will get exercised at this point since we'll always allocate a linear BO for huge buffers. The next patch changes that. v2: Fix NDEBUG warning v3: Rebased with new blit computation function. Modify computation to account of tiling constraints (Jason, Jordan) Use the new computation function in y adjust function (Jason, Jordan) Dropped slice parameter from the y adjusting function (~Jason) Add assert that adjusted y offset is within bounds Renamed and moved the helper functions public in intel_blit.h v3.1: Fixed assertion fail from v3 (Jordan) Remove conditional y adjusted calculation, replace with comment (Jordan + Jason) Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/intel_blit.c | 101 +++-- src/mesa/drivers/dri/i965/intel_blit.h | 24 +++- 2 files changed, 118 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c index c7f4cf3..832dad1 100644 --- a/src/mesa/drivers/dri/i965/intel_blit.c +++ b/src/mesa/drivers/dri/i965/intel_blit.c @@ -130,6 +130,92 @@ set_blitter_tiling(struct brw_context *brw, ADVANCE_BATCH(); \ } while (0) +/* This function returns the offset to be used by the blit operation. It may + * modify the y if the texture would otherwise fail to be able to perform a + * blit. The x offset will not need to change based on the computations made by + * this function. + * + * By the time we get to this function, the miptree creation code should have + * already determined it's possible to blit the texture, so there should never + * be a case where this function fails. + */ +static GLuint +intel_miptree_get_adjusted_y_offset(struct intel_mipmap_tree *mt, uint32_t *y) +{ + GLuint offset = mt-offset; + + /* Convert an input number of rows: y into 2 values: an offset (page aligned +* in byte units), and the remaining rows of y. The resulting 2 values will +* be used as parameters for a blit operation [using the HW blit engine]. +* They will therefore conform to whatever restrictions are needed. +* +* XXX: This code assumes that LOD0 is always guaranteed to be properly +* aligned for the blit operation. The round down only mutates y if the LOD +* being adjusted isn't tile aligned. In other words, if input y is pointing +* to LOD0 of a slice, the adjusted y should always be 0. Similarly if input +* y is pointing to another LOD, and the offset happens to be tile aligned, y +* will again be 0. +* +* The following diagram shows how the blit parameters are modified. In the +* example, is is trying to blit with LOD1 from slice[x] as a surface, and +* LOD1 is not properly tile aligned. TA means tile aligned. The rectangle +* is the BO that contains the mipmaps. There may be an offset from the start +* of the BO to the first slice. +* +* INPUT OUTPUT +* 0+---+ +*| |+---+ +* offset | slice[0]...slice[x-2]| offset | +--+ | +*| || | lod0| slice[x]| +* TA | +--+ || | | | +*| | lod0| slice[x-1] || +--+ | +*| | | | y--- | +---+ +-+| +*| +--+ || | | +-+| +*| +---+ +-+|| +---+ * | +*| | | +-+|| | +*| +---+ * || slice[x+1]...| +*| |+---+ +*| // qpitch padding| +*| | +* TA | +--+ | +*| | lod0| slice[x]| +*| | | | +*| +--+ | +* y--- | +---+ +-+| +*| | | +-+| +*| +---+ * | +*| | +*| slice[x+1]...| +*+---+ +*/ + + /* The following calculation looks fancy. In the common case, slice == 0 +* and/or the full mipmap fits within blitter constraints, it should be +* equivalent to the simple: +* return offset; +*/ + const long TILE_MASK = + mt-tiling
[Mesa-dev] [PATCH 0/6] blitter improvement patches
With the direct PBO upload, I guess the main thing this series offers (other than cleanups), is we can now allocate large BOs as Y-tiled since the code permits them to be blitted. Originally, the patch series did enable the use of blitter more often, and it resulted in some huge perf gains. There are still gains though. I've lost the most recent performance data I had. It was around 10% on terrain, and nothing else outstanding. If perf data is a blocker for merge, I will try to get to it eventually, but it will be a while. This is the reworked version of that original series that boosted terrain performance. After the review feedback, which were essentially cleanup type things, I decided to take it a stuck further. Since the original work I had to resolve some conflicts with a patch series Anuj landed to rework some of the blitter code (since I also do that). It's possible I added bugs while doing that, but piglit does seem happy. It's been sitting idle for a long time for 2 reasons: 1. After the initial revision, I had a bug in the code which regressed performance. Ken helped me find that, and now it goes back to a performance win. 2. I was noticing intermittent GL_OUT_OF_MEMORY errors which I was too lazy to track down since I assumed it wasn't my fault. Well, I finally tracked it down and fixed it: commit 7aba4ab1f355ea1a5870b3deb4b295565132dfc5 Author: Ben Widawsky benjamin.widaw...@intel.com Date: Fri Mar 6 17:31:00 2015 -0800 meta: Plug memory leak Ben Widawsky (6): i965: Kill y_or_x variable in miptree tiling selection i965: Fix comments about blit constraints i965: Create and use #defines for blitter constraints i965: Extract blit height max i965: Attempt to blit for larger textures i965: Allow Y-tiled allocations for large surfaces src/mesa/drivers/dri/i965/intel_blit.c| 120 +++--- src/mesa/drivers/dri/i965/intel_blit.h| 33 +++ src/mesa/drivers/dri/i965/intel_copy_image.c | 15 ++-- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 61 +++-- src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 21 + 5 files changed, 206 insertions(+), 44 deletions(-) -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/6] i965: Create and use #defines for blitter constraints
Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/intel_blit.c| 11 ++- src/mesa/drivers/dri/i965/intel_blit.h| 3 +++ src/mesa/drivers/dri/i965/intel_copy_image.c | 7 --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 + 4 files changed, 18 insertions(+), 12 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c index ee2a4ef..b93794b 100644 --- a/src/mesa/drivers/dri/i965/intel_blit.c +++ b/src/mesa/drivers/dri/i965/intel_blit.c @@ -206,8 +206,8 @@ intel_miptree_blit(struct brw_context *brw, * As a result of these two limitations, we can only use the blitter to do * this copy when the miptree's pitch is less than 32k. */ - if (src_mt-pitch = 32768 || - dst_mt-pitch = 32768) { + if (src_mt-pitch = INTEL_MAX_BLIT_PITCH || + dst_mt-pitch = INTEL_MAX_BLIT_PITCH) { perf_debug(Falling back due to =32k pitch\n); return false; } @@ -244,9 +244,10 @@ intel_miptree_blit(struct brw_context *brw, * value. The values we're working with are unsigned, so make sure we don't * overflow. */ - if (src_x = 32768 || src_y = 32768 || dst_x = 32768 || dst_y = 32768) { - perf_debug(Falling back due to =32k offset [src(%d, %d) dst(%d, %d)]\n, - src_x, src_y, dst_x, dst_y); + if (src_x = INTEL_MAX_BLIT_PITCH || src_y = INTEL_MAX_BLIT_ROWS || + dst_x = INTEL_MAX_BLIT_PITCH || dst_y = INTEL_MAX_BLIT_ROWS) { + perf_debug(Falling back due to =%dk offset [src(%d, %d) dst(%d, %d)]\n, + src_x, src_y, dst_x, dst_y, INTEL_MAX_BLIT_PITCH 20); return false; } diff --git a/src/mesa/drivers/dri/i965/intel_blit.h b/src/mesa/drivers/dri/i965/intel_blit.h index f563939..531d329 100644 --- a/src/mesa/drivers/dri/i965/intel_blit.h +++ b/src/mesa/drivers/dri/i965/intel_blit.h @@ -30,6 +30,9 @@ #include brw_context.h +#define INTEL_MAX_BLIT_PITCH 32768 +#define INTEL_MAX_BLIT_ROWS 32768 + bool intelEmitCopyBlit(struct brw_context *brw, GLuint cpp, diff --git a/src/mesa/drivers/dri/i965/intel_copy_image.c b/src/mesa/drivers/dri/i965/intel_copy_image.c index bf6b5e7..689e9c7 100644 --- a/src/mesa/drivers/dri/i965/intel_copy_image.c +++ b/src/mesa/drivers/dri/i965/intel_copy_image.c @@ -69,9 +69,10 @@ copy_image_with_blitter(struct brw_context *brw, * As a result of these two limitations, we can only use the blitter to do * this copy when the miptree's pitch is less than 32k. */ - if (src_mt-pitch = 32768 || - dst_mt-pitch = 32768) { - perf_debug(Falling back due to =32k pitch\n); + if (src_mt-pitch = INTEL_MAX_BLIT_PITCH || + dst_mt-pitch = INTEL_MAX_BLIT_PITCH) { + perf_debug(Falling back due to =%dk pitch\n, + INTEL_MAX_BLIT_PITCH 20); return false; } diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index cc422d3..16bd151 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -501,8 +501,9 @@ intel_miptree_choose_tiling(struct brw_context *brw, if (minimum_pitch 64) return I915_TILING_NONE; - if (ALIGN(minimum_pitch, 512) = 32768 || - mt-total_width = 32768 || mt-total_height = 32768) { + if (ALIGN(minimum_pitch, 512) = INTEL_MAX_BLIT_PITCH || + mt-total_width = INTEL_MAX_BLIT_PITCH || + mt-total_height = INTEL_MAX_BLIT_ROWS) { perf_debug(%dx%d miptree too large to blit, falling back to untiled, mt-total_width, mt-total_height); return I915_TILING_NONE; @@ -2296,11 +2297,11 @@ can_blit_slice(struct intel_mipmap_tree *mt, uint32_t image_x; uint32_t image_y; intel_miptree_get_image_offset(mt, level, slice, image_x, image_y); - if (image_x = 32768 || image_y = 32768) + if (image_x = INTEL_MAX_BLIT_PITCH || image_y = INTEL_MAX_BLIT_ROWS) return false; /* See intel_miptree_blit() for details on the 32k pitch limit. */ - if (mt-pitch = 32768) + if (mt-pitch = INTEL_MAX_BLIT_PITCH) return false; return true; -- 2.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/6] i965/skl: Don't use ALL_SLICES_AT_EACH_LOD
On Fri, Feb 20, 2015 at 10:31:08PM +, Neil Roberts wrote: The render surface state command for Skylake doesn't have the surface array spacing bit so I don't think it's possible to select this layout. This avoids a kernel panic when running the piglit test below: Kernel panic!? Please, go on. We cannot cause kernel panics from userspace. It's a kernel bug if we do. ext_framebuffer_multisample-no-color 8 stencil single However the test still fails so there may be something else wrong as well. The test was not causing a kernel panic before the patch to fix the qpitch. I think it's also not possible to select this layout on Gen8 so it may need to be changed to only be used on Gen7. I don't think this is the right answer. The array spacing bit goes away because we can manually specify the qpitch (I think). We should probably dig into this a bit more. I can help if you'd like. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 994670a..018e16b 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -371,19 +371,25 @@ intel_miptree_create_layout(struct brw_context *brw, } } - /* Set array_layout to ALL_SLICES_AT_EACH_LOD when gen7+ array_spacing_lod0 -* can be used. array_spacing_lod0 is only used for non-IMS MSAA surfaces. + /* Set array_layout to ALL_SLICES_AT_EACH_LOD when array_spacing_lod0 +* can be used. array_spacing_lod0 is only used for non-IMS MSAA surfaces +* on Gen 7 and 8. * TODO: can we use it elsewhere? +* TODO: does this actually work on Gen 8? */ - switch (mt-msaa_layout) { - case INTEL_MSAA_LAYOUT_NONE: - case INTEL_MSAA_LAYOUT_IMS: + if (brw-gen = 9) { mt-array_layout = ALL_LOD_IN_EACH_SLICE; - break; - case INTEL_MSAA_LAYOUT_UMS: - case INTEL_MSAA_LAYOUT_CMS: - mt-array_layout = ALL_SLICES_AT_EACH_LOD; - break; + } else { + switch (mt-msaa_layout) { + case INTEL_MSAA_LAYOUT_NONE: + case INTEL_MSAA_LAYOUT_IMS: + mt-array_layout = ALL_LOD_IN_EACH_SLICE; + break; + case INTEL_MSAA_LAYOUT_UMS: + case INTEL_MSAA_LAYOUT_CMS: + mt-array_layout = ALL_SLICES_AT_EACH_LOD; + break; + } } if (target == GL_TEXTURE_CUBE_MAP) { -- Ben Widawsky, Intel Open Source Technology Center ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] mesa: Check for valid PBO access in gl(Compressed)Tex(Sub)Image calls
Looks good to me. Reviewed-by: Laura Ekstrand la...@jlekstrand.net On Thu, Mar 5, 2015 at 12:20 AM, Eduardo Lima Mitev el...@igalia.com wrote: This patch adds two types of checks to the gl(Compressed)Tex(Sub)Imgage family of functions when a pixel buffer object is bound to GL_PIXEL_UNPACK_BUFFER: - That the buffer is not mapped. - The total data size is within the boundaries of the buffer size. It does so by calling auxiliary validations functions from PBO API: _mesa_validate_pbo_source() for non-compressed texture calls, and _mesa_validate_pbo_source_compressed() for compressed texture calls. The first check is defined in Section 6.3.2 'Effects of Mapping Buffers on Other GL Commands' of the GLES 3.1 spec, page 57: Any GL command which attempts to read from, write to, or change the state of a buffer object may generate an INVALID_OPERATION error if all or part of the buffer object is mapped. However, only commands which explicitly describe this error are required to do so. If an error is not generated, using such commands to perform invalid reads, writes, or state changes will have undefined results and may result in GL interruption or termination. Similar wording exists in GL 4.5 spec, page 76. In the case of gl(Compressed)Tex(Sub)Image(2,3)D, the specification doesn't force implemtations to throw an error. However since Mesa don't currently implement checks to determine when it is safe to read/write from/to a mapped PBO, we should always return the error if all or parts of it are mapped. The 2nd check is defined in Section 8.5 'Texture Image Specification' of the OpenGL 4.5 spec, page 203: An INVALID_OPERATION error is generated if a pixel unpack buffer object is bound and storing texture data would access memory beyond the end of the pixel unpack buffer. Fixes 4 dEQP tests: * dEQP-GLES3.functional.negative_api.texture.compressedteximage2d_invalid_buffer_target * dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage2d_invalid_buffer_target * dEQP-GLES3.functional.negative_api.texture.compressedteximage3d_invalid_buffer_target * dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d_invalid_buffer_target --- src/mesa/main/teximage.c | 51 +++- 1 file changed, 42 insertions(+), 9 deletions(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index abcafde..f239e39 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -53,6 +53,7 @@ #include mtypes.h #include glformats.h #include texstore.h +#include pbo.h /** @@ -2110,7 +2111,8 @@ texture_error_check( struct gl_context *ctx, GLint level, GLint internalFormat, GLenum format, GLenum type, GLint width, GLint height, - GLint depth, GLint border ) + GLint depth, GLint border, + const GLvoid *pixels ) { GLenum err; @@ -2195,6 +2197,13 @@ texture_error_check( struct gl_context *ctx, return GL_TRUE; } + /* validate the bound PBO, if any */ + if (!_mesa_validate_pbo_source(ctx, dimensions, ctx-Unpack, + width, height, depth, format, type, + INT_MAX, pixels, glTexImage)) { + return GL_TRUE; + } + /* make sure internal format and format basically agree */ if (!texture_formats_agree(internalFormat, format)) { _mesa_error(ctx, GL_INVALID_OPERATION, @@ -2291,7 +2300,7 @@ compressed_texture_error_check(struct gl_context *ctx, GLint dimensions, GLenum target, GLint level, GLenum internalFormat, GLsizei width, GLsizei height, GLsizei depth, GLint border, - GLsizei imageSize) + GLsizei imageSize, const GLvoid *data) { const GLint maxLevels = _mesa_max_texture_levels(ctx, target); GLint expectedSize; @@ -2319,6 +2328,12 @@ compressed_texture_error_check(struct gl_context *ctx, GLint dimensions, return GL_TRUE; } + /* validate the bound PBO, if any */ + if (!_mesa_validate_pbo_source_compressed(ctx, dimensions, ctx-Unpack, + imageSize, data, glCompressedTexImage)) { + return GL_TRUE; + } + switch (internalFormat) { case GL_PALETTE4_RGB8_OES: case GL_PALETTE4_RGBA8_OES: @@ -2451,7 +2466,8 @@ texsubimage_error_check(struct gl_context *ctx, GLuint dimensions, GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLint width, GLint height, GLint depth, -GLenum format, GLenum type, bool dsa) +GLenum
Re: [Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
Reviewed-by: Connor Abbott cwabbo...@gmail.com On Mon, Mar 9, 2015 at 9:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field),\ * __next = \ exec_node_data(__type, (__node)-__field.next, __field);\ -__next != NULL;\ +__next-__field != NULL; \ __node = __next, __next = \ exec_node_data(__type, (__next)-__field.next, __field)) @@ -693,7 +693,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-tail_pred, __field), \ * __prev = \ exec_node_data(__type, (__node)-__field.prev, __field);\ -__prev != NULL;\ +__prev-__field != NULL; \ __node = __prev, __prev = \ exec_node_data(__type, (__prev)-__field.prev, __field)) -- 2.2.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] nir: Fix non-determinism in nir_lower_vars_to_ssa().
Push it! On Mar 9, 2015 7:03 PM, Connor Abbott cwabbo...@gmail.com wrote: Reviewed-by: Connor Abbott cwabbo...@gmail.com I was in the middle of rewriting this pass for making derefs instructions, which hasn't been going nearly as nicely as I would like (ugh...), so if it pans out then I'll have to think about it a little more to make sure the new version is deterministic too. On Mon, Mar 9, 2015 at 9:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: Previously, we stored derefs in a hash table, using the malloc'd pointer as the key. Then, we walked through the hash table and generated code, based on the order of the hash table's elements. Memory addresses returned by malloc are pretty much random, which meant that the hash was random, and the hash table's elements would be walked in some random order. This led to successive compiles of the same shader using different variable names and slightly different orderings of phi-nodes. Code could not be diff'd, and the final assembly would sometimes change slightly too. It turns out the only point of the hash table was to avoid inserting the same node multiple times for different dereferences. We never actually searched the hash table! This patch uses an intrusive linked list instead. Since exec_list uses head and tail sentinels, checking prev or next against NULL will tell us whether the node is already in the list. Pair programming with Jason Ekstrand. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/nir/nir_lower_vars_to_ssa.c | 123 --- 1 file changed, 26 insertions(+), 97 deletions(-) diff --git a/src/glsl/nir/nir_lower_vars_to_ssa.c b/src/glsl/nir/nir_lower_vars_to_ssa.c index 9e9a418..86e6ab4 100644 --- a/src/glsl/nir/nir_lower_vars_to_ssa.c +++ b/src/glsl/nir/nir_lower_vars_to_ssa.c @@ -35,6 +35,13 @@ struct deref_node { bool lower_to_ssa; + /* Only valid for things that end up in the direct list. +* Note that multiple nir_deref_vars may correspond to this node, but they +* will all be equivalent, so any is as good as the other. +*/ + nir_deref_var *deref; + struct exec_node direct_derefs_link; + struct set *loads; struct set *stores; struct set *copies; @@ -69,7 +76,7 @@ struct lower_variables_state { * wildcards and no indirects, these are precisely the derefs that we * can actually consider lowering. */ - struct hash_table *direct_deref_nodes; + struct exec_list direct_deref_nodes; /* Controls whether get_deref_node will add variables to the * direct_deref_nodes table. This is turned on when we are initially @@ -83,88 +90,6 @@ struct lower_variables_state { struct hash_table *phi_table; }; -/* The following two functions implement a hash and equality check for - * variable dreferences. When the hash or equality function encounters an - * array, all indirects are treated as equal and are never equal to a - * direct dereference or a wildcard. - */ -static uint32_t -hash_deref(const void *void_deref) -{ - uint32_t hash = _mesa_fnv32_1a_offset_bias; - - const nir_deref_var *deref_var = void_deref; - hash = _mesa_fnv32_1a_accumulate(hash, deref_var-var); - - for (const nir_deref *deref = deref_var-deref.child; -deref; deref = deref-child) { - switch (deref-deref_type) { - case nir_deref_type_array: { - nir_deref_array *deref_array = nir_deref_as_array(deref); - - hash = _mesa_fnv32_1a_accumulate(hash, deref_array-deref_array_type); - - if (deref_array-deref_array_type == nir_deref_array_type_direct) -hash = _mesa_fnv32_1a_accumulate(hash, deref_array-base_offset); - break; - } - case nir_deref_type_struct: { - nir_deref_struct *deref_struct = nir_deref_as_struct(deref); - hash = _mesa_fnv32_1a_accumulate(hash, deref_struct-index); - break; - } - default: - assert(Invalid deref chain); - } - } - - return hash; -} - -static bool -derefs_equal(const void *void_a, const void *void_b) -{ - const nir_deref_var *a_var = void_a; - const nir_deref_var *b_var = void_b; - - if (a_var-var != b_var-var) - return false; - - for (const nir_deref *a = a_var-deref.child, *b = b_var-deref.child; -a != NULL; a = a-child, b = b-child) { - if (a-deref_type != b-deref_type) - return false; - - switch (a-deref_type) { - case nir_deref_type_array: { - nir_deref_array *a_arr = nir_deref_as_array(a); - nir_deref_array *b_arr = nir_deref_as_array(b); - - if (a_arr-deref_array_type != b_arr-deref_array_type) -return false; - - if
Re: [Mesa-dev] [PATCH 2/2] nir: Fix non-determinism in nir_lower_vars_to_ssa().
On Mon, Mar 9, 2015 at 7:28 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 7:04 PM, Jason Ekstrand ja...@jlekstrand.net wrote: Push it! Our policy is to wait a day for most things. Sure. Not really arguing for early pushing. Mostly just surprised that Connor picked up on it that fast. --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
On Mon, Mar 9, 2015 at 7:24 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 6:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field),\ * __next = \ exec_node_data(__type, (__node)-__field.next, __field);\ -__next != NULL;\ +__next-__field != NULL; \ I'm not understanding now the address of __next-__field can ever be NULL. __next is something with an embedded struct exec_node, so don't we want __next-__field != NULL without the address-of operator? Sorry, that should have been exec_node_is_tail_sentinel(__next-__field) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
On Mon, Mar 9, 2015 at 7:58 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 7:32 PM, Jason Ekstrand ja...@jlekstrand.net wrote: On Mon, Mar 9, 2015 at 7:24 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 6:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field), \ * __next = \ exec_node_data(__type, (__node)-__field.next, __field); \ -__next != NULL; \ +__next-__field != NULL; \ I'm not understanding now the address of __next-__field can ever be NULL. __next is something with an embedded struct exec_node, so don't we want __next-__field != NULL without the address-of operator? No, __field is the name of the exec_node field embedded in the __type struct. So if I have struct foo { /* stuff */ struct exec_node bar; } as my type, __type is struct foo, __next and __node are both of type __type *, and __field is bar. So, in order to get form __next to an exec_node, you have to do __next-__field because we need the actual exec_node back. More concretely, for something like: foreach_list_typed_safe (bblock_link, successor, link, predecessor-block-children) I think this macro expands to (after this patch) for (bblock_link * successor = exec_node_data(bblock_link, (predecessor-block-children)-head, link), * __next = exec_node_data(bblock_link, (successor)-link.next, link); __next-link != NULL; successor = __next, __next = exec_node_data(bblock_link, (__next)-link.next, link)) How can the address of a field in a struct (e.g., __next-link) be NULL? If the address of the struct is (void *)-8 and the link field is 8 bytes into the struct. In this case, assuming unsigned overflow (I think that's reasonably safe), the address of the link will be (void *)0 --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] nir: Fix non-determinism in nir_lower_vars_to_ssa().
Previously, we stored derefs in a hash table, using the malloc'd pointer as the key. Then, we walked through the hash table and generated code, based on the order of the hash table's elements. Memory addresses returned by malloc are pretty much random, which meant that the hash was random, and the hash table's elements would be walked in some random order. This led to successive compiles of the same shader using different variable names and slightly different orderings of phi-nodes. Code could not be diff'd, and the final assembly would sometimes change slightly too. It turns out the only point of the hash table was to avoid inserting the same node multiple times for different dereferences. We never actually searched the hash table! This patch uses an intrusive linked list instead. Since exec_list uses head and tail sentinels, checking prev or next against NULL will tell us whether the node is already in the list. Pair programming with Jason Ekstrand. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/nir/nir_lower_vars_to_ssa.c | 123 --- 1 file changed, 26 insertions(+), 97 deletions(-) diff --git a/src/glsl/nir/nir_lower_vars_to_ssa.c b/src/glsl/nir/nir_lower_vars_to_ssa.c index 9e9a418..86e6ab4 100644 --- a/src/glsl/nir/nir_lower_vars_to_ssa.c +++ b/src/glsl/nir/nir_lower_vars_to_ssa.c @@ -35,6 +35,13 @@ struct deref_node { bool lower_to_ssa; + /* Only valid for things that end up in the direct list. +* Note that multiple nir_deref_vars may correspond to this node, but they +* will all be equivalent, so any is as good as the other. +*/ + nir_deref_var *deref; + struct exec_node direct_derefs_link; + struct set *loads; struct set *stores; struct set *copies; @@ -69,7 +76,7 @@ struct lower_variables_state { * wildcards and no indirects, these are precisely the derefs that we * can actually consider lowering. */ - struct hash_table *direct_deref_nodes; + struct exec_list direct_deref_nodes; /* Controls whether get_deref_node will add variables to the * direct_deref_nodes table. This is turned on when we are initially @@ -83,88 +90,6 @@ struct lower_variables_state { struct hash_table *phi_table; }; -/* The following two functions implement a hash and equality check for - * variable dreferences. When the hash or equality function encounters an - * array, all indirects are treated as equal and are never equal to a - * direct dereference or a wildcard. - */ -static uint32_t -hash_deref(const void *void_deref) -{ - uint32_t hash = _mesa_fnv32_1a_offset_bias; - - const nir_deref_var *deref_var = void_deref; - hash = _mesa_fnv32_1a_accumulate(hash, deref_var-var); - - for (const nir_deref *deref = deref_var-deref.child; -deref; deref = deref-child) { - switch (deref-deref_type) { - case nir_deref_type_array: { - nir_deref_array *deref_array = nir_deref_as_array(deref); - - hash = _mesa_fnv32_1a_accumulate(hash, deref_array-deref_array_type); - - if (deref_array-deref_array_type == nir_deref_array_type_direct) -hash = _mesa_fnv32_1a_accumulate(hash, deref_array-base_offset); - break; - } - case nir_deref_type_struct: { - nir_deref_struct *deref_struct = nir_deref_as_struct(deref); - hash = _mesa_fnv32_1a_accumulate(hash, deref_struct-index); - break; - } - default: - assert(Invalid deref chain); - } - } - - return hash; -} - -static bool -derefs_equal(const void *void_a, const void *void_b) -{ - const nir_deref_var *a_var = void_a; - const nir_deref_var *b_var = void_b; - - if (a_var-var != b_var-var) - return false; - - for (const nir_deref *a = a_var-deref.child, *b = b_var-deref.child; -a != NULL; a = a-child, b = b-child) { - if (a-deref_type != b-deref_type) - return false; - - switch (a-deref_type) { - case nir_deref_type_array: { - nir_deref_array *a_arr = nir_deref_as_array(a); - nir_deref_array *b_arr = nir_deref_as_array(b); - - if (a_arr-deref_array_type != b_arr-deref_array_type) -return false; - - if (a_arr-deref_array_type == nir_deref_array_type_direct - a_arr-base_offset != b_arr-base_offset) -return false; - break; - } - case nir_deref_type_struct: - if (nir_deref_as_struct(a)-index != nir_deref_as_struct(b)-index) -return false; - break; - default: - assert(Invalid deref chain); - return false; - } - - assert((a-child == NULL) == (b-child == NULL)); - if((a-child == NULL) != (b-child == NULL)) - return false; - } - - return true; -} - static int type_get_length(const struct glsl_type *type) { @@ -195,6 +120,8 @@ deref_node_create(struct deref_node *parent, struct
[Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field),\ * __next = \ exec_node_data(__type, (__node)-__field.next, __field);\ -__next != NULL;\ +__next-__field != NULL; \ __node = __next, __next = \ exec_node_data(__type, (__next)-__field.next, __field)) @@ -693,7 +693,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-tail_pred, __field), \ * __prev = \ exec_node_data(__type, (__node)-__field.prev, __field);\ -__prev != NULL;\ +__prev-__field != NULL; \ __node = __prev, __prev = \ exec_node_data(__type, (__prev)-__field.prev, __field)) -- 2.2.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
On Mon, Mar 9, 2015 at 6:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field),\ * __next = \ exec_node_data(__type, (__node)-__field.next, __field);\ -__next != NULL;\ +__next-__field != NULL; \ I'm not understanding now the address of __next-__field can ever be NULL. __next is something with an embedded struct exec_node, so don't we want __next-__field != NULL without the address-of operator? __node = __next, __next = \ exec_node_data(__type, (__next)-__field.next, __field)) @@ -693,7 +693,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-tail_pred, __field), \ * __prev = \ exec_node_data(__type, (__node)-__field.prev, __field);\ -__prev != NULL;\ +__prev-__field != NULL; \ __node = __prev, __prev = \ exec_node_data(__type, (__prev)-__field.prev, __field)) -- 2.2.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
On Mon, Mar 9, 2015 at 7:24 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 6:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field), \ * __next = \ exec_node_data(__type, (__node)-__field.next, __field); \ -__next != NULL; \ +__next-__field != NULL; \ I'm not understanding now the address of __next-__field can ever be NULL. __next is something with an embedded struct exec_node, so don't we want __next-__field != NULL without the address-of operator? No, __field is the name of the exec_node field embedded in the __type struct. So if I have struct foo { /* stuff */ struct exec_node bar; } as my type, __type is struct foo, __next and __node are both of type __type *, and __field is bar. So, in order to get form __next to an exec_node, you have to do __next-__field because we need the actual exec_node back. Put differently, __next-field undoes the pointer arithmatic that exec_node_data(__type, ptr, __field) does. Ideallly, we would like __next to be a pointer of type struct exec_node and do the conversion to __type * later. Unfortunately, C doesn't let us do that inside the for loop. So we settle for extra pointer arithmetic. The other option, of course, would be to use a struct but then we couldn't make a variable named __node on behalf of the caller. --Jason __node = __next, __next = \ exec_node_data(__type, (__next)-__field.next, __field)) @@ -693,7 +693,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-tail_pred, __field), \ * __prev = \ exec_node_data(__type, (__node)-__field.prev, __field); \ -__prev != NULL; \ +__prev-__field != NULL; \ __node = __prev, __prev = \ exec_node_data(__type, (__prev)-__field.prev, __field)) -- 2.2.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
On Mon, Mar 9, 2015 at 7:48 PM, Connor Abbott cwabbo...@gmail.com wrote: On Mon, Mar 9, 2015 at 10:35 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 7:24 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 6:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field), \ * __next = \ exec_node_data(__type, (__node)-__field.next, __field); \ -__next != NULL; \ +__next-__field != NULL; \ I'm not understanding now the address of __next-__field can ever be NULL. __next is something with an embedded struct exec_node, so don't we want __next-__field != NULL without the address-of operator? Sorry, that should have been exec_node_is_tail_sentinel(__next-__field) No, that won't work. We want to bail out if the *current* node is the tail sentinel, not the next one. If the next node is the tail sentinel, then the current one is the last element, so we have to go through the loop once more. We could use exec_node_is_tail_sentinel() on the current node, No, we can't. The whole point of the *_safe versions is that we never touch the current node after we've let the client execute code. We stash off the next one so that, even if the delete the current one, we still have something to give them next iteration. --Jason but we've already dereferenced the pointer earlier when getting the next node so it would be less efficient to dereference it again. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
On Mon, Mar 9, 2015 at 11:02 PM, Jason Ekstrand ja...@jlekstrand.net wrote: On Mon, Mar 9, 2015 at 7:59 PM, Connor Abbott cwabbo...@gmail.com wrote: On Mon, Mar 9, 2015 at 10:54 PM, Jason Ekstrand ja...@jlekstrand.net wrote: On Mon, Mar 9, 2015 at 7:48 PM, Connor Abbott cwabbo...@gmail.com wrote: On Mon, Mar 9, 2015 at 10:35 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 7:24 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 6:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field), \ * __next = \ exec_node_data(__type, (__node)-__field.next, __field); \ -__next != NULL; \ +__next-__field != NULL; \ I'm not understanding now the address of __next-__field can ever be NULL. __next is something with an embedded struct exec_node, so don't we want __next-__field != NULL without the address-of operator? Sorry, that should have been exec_node_is_tail_sentinel(__next-__field) No, that won't work. We want to bail out if the *current* node is the tail sentinel, not the next one. If the next node is the tail sentinel, then the current one is the last element, so we have to go through the loop once more. We could use exec_node_is_tail_sentinel() on the current node, No, we can't. The whole point of the *_safe versions is that we never touch the current node after we've let the client execute code. We stash off the next one so that, even if the delete the current one, we still have something to give them next iteration. --Jason Ah, right. My only concern is that doing pointer arithmetic on NULL isn't defined, and given that what we're doing involves underflowing the pointer so it wraps around to a large address (when subtracting in exec_node_get_data()) and then overflowing it back to 0 (when doing __next-field), it's likely that some compiler might come along and optimize this. We could cast everything through uintptr_t but that's gonna get messy... Yeah... currently what compilers do is equivalent to casting to uintptr_t, but my concern is that eventually, some compiler writer says hey, pointer arithmetic never overflows! and implements tricks similar to what compilers do today with signed integer overflow. I'm not sure how likely that is; I'd guess it's not too likely though. So given that there's not much we can do that's not going to be very messy, I guess it's ok to leave it how it is now as long as we don't forget about this in case it does happen eventually. but we've already dereferenced the pointer earlier when getting the next node so it would be less efficient to dereference it again. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] nir: Fix non-determinism in nir_lower_vars_to_ssa().
Reviewed-by: Connor Abbott cwabbo...@gmail.com I was in the middle of rewriting this pass for making derefs instructions, which hasn't been going nearly as nicely as I would like (ugh...), so if it pans out then I'll have to think about it a little more to make sure the new version is deterministic too. On Mon, Mar 9, 2015 at 9:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: Previously, we stored derefs in a hash table, using the malloc'd pointer as the key. Then, we walked through the hash table and generated code, based on the order of the hash table's elements. Memory addresses returned by malloc are pretty much random, which meant that the hash was random, and the hash table's elements would be walked in some random order. This led to successive compiles of the same shader using different variable names and slightly different orderings of phi-nodes. Code could not be diff'd, and the final assembly would sometimes change slightly too. It turns out the only point of the hash table was to avoid inserting the same node multiple times for different dereferences. We never actually searched the hash table! This patch uses an intrusive linked list instead. Since exec_list uses head and tail sentinels, checking prev or next against NULL will tell us whether the node is already in the list. Pair programming with Jason Ekstrand. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/nir/nir_lower_vars_to_ssa.c | 123 --- 1 file changed, 26 insertions(+), 97 deletions(-) diff --git a/src/glsl/nir/nir_lower_vars_to_ssa.c b/src/glsl/nir/nir_lower_vars_to_ssa.c index 9e9a418..86e6ab4 100644 --- a/src/glsl/nir/nir_lower_vars_to_ssa.c +++ b/src/glsl/nir/nir_lower_vars_to_ssa.c @@ -35,6 +35,13 @@ struct deref_node { bool lower_to_ssa; + /* Only valid for things that end up in the direct list. +* Note that multiple nir_deref_vars may correspond to this node, but they +* will all be equivalent, so any is as good as the other. +*/ + nir_deref_var *deref; + struct exec_node direct_derefs_link; + struct set *loads; struct set *stores; struct set *copies; @@ -69,7 +76,7 @@ struct lower_variables_state { * wildcards and no indirects, these are precisely the derefs that we * can actually consider lowering. */ - struct hash_table *direct_deref_nodes; + struct exec_list direct_deref_nodes; /* Controls whether get_deref_node will add variables to the * direct_deref_nodes table. This is turned on when we are initially @@ -83,88 +90,6 @@ struct lower_variables_state { struct hash_table *phi_table; }; -/* The following two functions implement a hash and equality check for - * variable dreferences. When the hash or equality function encounters an - * array, all indirects are treated as equal and are never equal to a - * direct dereference or a wildcard. - */ -static uint32_t -hash_deref(const void *void_deref) -{ - uint32_t hash = _mesa_fnv32_1a_offset_bias; - - const nir_deref_var *deref_var = void_deref; - hash = _mesa_fnv32_1a_accumulate(hash, deref_var-var); - - for (const nir_deref *deref = deref_var-deref.child; -deref; deref = deref-child) { - switch (deref-deref_type) { - case nir_deref_type_array: { - nir_deref_array *deref_array = nir_deref_as_array(deref); - - hash = _mesa_fnv32_1a_accumulate(hash, deref_array-deref_array_type); - - if (deref_array-deref_array_type == nir_deref_array_type_direct) -hash = _mesa_fnv32_1a_accumulate(hash, deref_array-base_offset); - break; - } - case nir_deref_type_struct: { - nir_deref_struct *deref_struct = nir_deref_as_struct(deref); - hash = _mesa_fnv32_1a_accumulate(hash, deref_struct-index); - break; - } - default: - assert(Invalid deref chain); - } - } - - return hash; -} - -static bool -derefs_equal(const void *void_a, const void *void_b) -{ - const nir_deref_var *a_var = void_a; - const nir_deref_var *b_var = void_b; - - if (a_var-var != b_var-var) - return false; - - for (const nir_deref *a = a_var-deref.child, *b = b_var-deref.child; -a != NULL; a = a-child, b = b-child) { - if (a-deref_type != b-deref_type) - return false; - - switch (a-deref_type) { - case nir_deref_type_array: { - nir_deref_array *a_arr = nir_deref_as_array(a); - nir_deref_array *b_arr = nir_deref_as_array(b); - - if (a_arr-deref_array_type != b_arr-deref_array_type) -return false; - - if (a_arr-deref_array_type == nir_deref_array_type_direct - a_arr-base_offset != b_arr-base_offset) -return false; - break; - } - case nir_deref_type_struct: -
Re: [Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
On Mon, Mar 9, 2015 at 10:35 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 7:24 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 6:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field), \ * __next = \ exec_node_data(__type, (__node)-__field.next, __field); \ -__next != NULL; \ +__next-__field != NULL; \ I'm not understanding now the address of __next-__field can ever be NULL. __next is something with an embedded struct exec_node, so don't we want __next-__field != NULL without the address-of operator? Sorry, that should have been exec_node_is_tail_sentinel(__next-__field) No, that won't work. We want to bail out if the *current* node is the tail sentinel, not the next one. If the next node is the tail sentinel, then the current one is the last element, so we have to go through the loop once more. We could use exec_node_is_tail_sentinel() on the current node, but we've already dereferenced the pointer earlier when getting the next node so it would be less efficient to dereference it again. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/6] i965/skl: Align compressed textures to four times the block size
On Fri, Feb 20, 2015 at 10:31:07PM +, Neil Roberts wrote: On Skylake it is possible to choose your own alignment values for compressed textures but they are expressed as a multiple of the block size. The minimum alignment value we can use is 4 so we effectively have to align to 4 times the block size. This patch makes it initially set mt-align_[wh] to the large alignment value and then later divides it by the block size so that it can be uploaded as part of the surface state. --- src/mesa/drivers/dri/i965/brw_tex_layout.c | 31 ++ 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c b/src/mesa/drivers/dri/i965/brw_tex_layout.c index d89a04e..1fe2c26 100644 --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c @@ -73,7 +73,16 @@ intel_horizontal_texture_alignment_unit(struct brw_context *brw, */ unsigned int i, j; _mesa_get_format_block_size(format, i, j); - return i; + + /* On Gen9+ we can pick our own alignment for compressed textures but it + * has to be a multiple of the block size. The minimum alignment we can + * pick is 4 so we effectively have to align to 4 times the block + * size + */ + if (brw-gen = 9) + return i * 4; + else + return i; Sorry for the delay, but I put this off initially because I wasn't sure which part of the docs this was addressing. I see that the Surface Horizontal Alignment field is now used for compressed formats. Assuming that's what this is referring to (if not, please correct me)... The only thing that appears to be missing is the handling of the MCS case (must always be 16) which may or may not be relevant. AFAICT, it's a possible scenario. Also, doesn't this make FXT1 have an alignment of 32? } if (format == MESA_FORMAT_S_UINT8) @@ -113,7 +122,8 @@ intel_vertical_texture_alignment_unit(struct brw_context *brw, * the SURFACE_STATE Surface Vertical Alignment field. */ if (_mesa_is_format_compressed(format)) - return 4; + /* See comment above for the horizontal alignment */ + return brw-gen = 9 ? 16 : 4; This seems okay since the max height we support is 4. if (format == MESA_FORMAT_S_UINT8) return brw-gen = 7 ? 8 : 4; @@ -195,6 +205,9 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt) unsigned width = mt-physical_width0; unsigned height = mt-physical_height0; unsigned depth = mt-physical_depth0; /* number of array layers. */ + unsigned int bw, bh; + + _mesa_get_format_block_size(mt-format, bw, bh); mt-total_width = mt-physical_width0; @@ -212,7 +225,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt) if (mt-compressed) { mip1_width = ALIGN(minify(mt-physical_width0, 1), mt-align_w) + - ALIGN(minify(mt-physical_width0, 2), mt-align_w); + ALIGN(minify(mt-physical_width0, 2), bw); } else { mip1_width = ALIGN(minify(mt-physical_width0, 1), mt-align_w) + minify(mt-physical_width0, 2); @@ -232,7 +245,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt) img_height = ALIGN(height, mt-align_h); if (mt-compressed) - img_height /= mt-align_h; + img_height /= bh; if (mt-array_layout == ALL_SLICES_AT_EACH_LOD) { /* Compact arrays with separated miplevels */ @@ -481,5 +494,15 @@ brw_miptree_layout(struct brw_context *brw, struct intel_mipmap_tree *mt) } DBG(%s: %dx%dx%d\n, __FUNCTION__, mt-total_width, mt-total_height, mt-cpp); + + /* On Gen9+ the alignment values are expressed in multiples of the block +* size +*/ + if (brw-gen = 9) { + unsigned int i, j; + _mesa_get_format_block_size(mt-format, i, j); + mt-align_w /= i; + mt-align_h /= j; + } } These hunks look okay to me. Any particular reason not to update brw_miptree_layout_texture_array as well? I am pretty certain we don't use ALL_SLICES_AT_EACH_LOD in Gen9 today, but presumably we could, right? /me shrugs -- Ben Widawsky, Intel Open Source Technology Center ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] GL/GLSL tests for GL 4.0 and newer
*I am interested on write tests for OpenGL 4.0 /GLSL 4.00 .* *But can you be more specify what areas you are expecting to be tested, Thank you!* ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH ] vbo: improve the code style by adjust the preprocessing c code directives.
From: Marius Predut marius.pre...@intel.com Brain Paul review suggestion: there's more macro use here than necessary. Removed and redefine some #define preprocessing directives. Removed the directive input parameter 'T' . No functional changes. Signed-off-by: Marius Predut marius.pre...@intel.com --- src/mesa/vbo/vbo_attrib_tmp.h | 74 ++--- src/mesa/vbo/vbo_exec_api.c |2 +- 2 files changed, 34 insertions(+), 42 deletions(-) diff --git a/src/mesa/vbo/vbo_attrib_tmp.h b/src/mesa/vbo/vbo_attrib_tmp.h index 80e8aaf..348dd77 100644 --- a/src/mesa/vbo/vbo_attrib_tmp.h +++ b/src/mesa/vbo/vbo_attrib_tmp.h @@ -30,35 +30,30 @@ USE OR OTHER DEALINGS IN THE SOFTWARE. /* ATTR */ -#define ATTR( A, N, T, V0, V1, V2, V3 ) \ -ATTR_##T((A), (N), (T), (V0), (V1), (V2), (V3)) - -#define ATTR_GL_UNSIGNED_INT( A, N, T, V0, V1, V2, V3 ) \ -ATTR_UNION(A, N, T, UINT_AS_UNION(V0), UINT_AS_UNION(V1), \ -UINT_AS_UNION(V2), UINT_AS_UNION(V3)) -#define ATTR_GL_INT( A, N, T, V0, V1, V2, V3 ) \ -ATTR_UNION(A, N, T, INT_AS_UNION(V0), INT_AS_UNION(V1), \ +#define ATTRI( A, N, V0, V1, V2, V3 ) \ +ATTR_UNION(A, N, GL_INT, INT_AS_UNION(V0), INT_AS_UNION(V1), \ INT_AS_UNION(V2), INT_AS_UNION(V3)) -#define ATTR_GL_FLOAT( A, N, T, V0, V1, V2, V3 ) \ -ATTR_UNION(A, N, T, FLOAT_AS_UNION(V0), FLOAT_AS_UNION(V1),\ +#define ATTRUI( A, N, V0, V1, V2, V3 ) \ +ATTR_UNION(A, N, GL_UNSIGNED_INT, UINT_AS_UNION(V0), UINT_AS_UNION(V1), \ +UINT_AS_UNION(V2), UINT_AS_UNION(V3)) +#define ATTRF( A, N, V0, V1, V2, V3 ) \ +ATTR_UNION(A, N, GL_FLOAT, FLOAT_AS_UNION(V0), FLOAT_AS_UNION(V1),\ FLOAT_AS_UNION(V2), FLOAT_AS_UNION(V3)) /* float */ -#define ATTR1FV( A, V ) ATTR( A, 1, GL_FLOAT, (V)[0], 0, 0, 1 ) -#define ATTR2FV( A, V ) ATTR( A, 2, GL_FLOAT, (V)[0], (V)[1], 0, 1 ) -#define ATTR3FV( A, V ) ATTR( A, 3, GL_FLOAT, (V)[0], (V)[1], (V)[2], 1 ) -#define ATTR4FV( A, V ) ATTR( A, 4, GL_FLOAT, (V)[0], (V)[1], (V)[2], (V)[3] ) +#define ATTR1FV( A, V ) ATTRF( A, 1, (V)[0], 0, 0, 1 ) +#define ATTR2FV( A, V ) ATTRF( A, 2, (V)[0], (V)[1], 0, 1 ) +#define ATTR3FV( A, V ) ATTRF( A, 3, (V)[0], (V)[1], (V)[2], 1 ) +#define ATTR4FV( A, V ) ATTRF( A, 4, (V)[0], (V)[1], (V)[2], (V)[3] ) -#define ATTR1F( A, X ) ATTR( A, 1, GL_FLOAT, X, 0, 0, 1 ) -#define ATTR2F( A, X, Y ) ATTR( A, 2, GL_FLOAT, X, Y, 0, 1 ) -#define ATTR3F( A, X, Y, Z )ATTR( A, 3, GL_FLOAT, X, Y, Z, 1 ) -#define ATTR4F( A, X, Y, Z, W ) ATTR( A, 4, GL_FLOAT, X, Y, Z, W ) +#define ATTR1F( A, X ) ATTRF( A, 1, X, 0, 0, 1 ) +#define ATTR2F( A, X, Y ) ATTRF( A, 2, X, Y, 0, 1 ) +#define ATTR3F( A, X, Y, Z )ATTRF( A, 3, X, Y, Z, 1 ) +#define ATTR4F( A, X, Y, Z, W ) ATTRF( A, 4, X, Y, Z, W ) -/* int */ -#define ATTRI( A, N, X, Y, Z, W) ATTR( A, N, GL_INT, \ - X, Y, Z, W ) +/* int */ #define ATTR2IV( A, V ) ATTRI( A, 2, (V)[0], (V)[1], 0, 1 ) #define ATTR3IV( A, V ) ATTRI( A, 3, (V)[0], (V)[1], (V)[2], 1 ) #define ATTR4IV( A, V ) ATTRI( A, 4, (V)[0], (V)[1], (V)[2], (V)[3] ) @@ -70,9 +65,6 @@ USE OR OTHER DEALINGS IN THE SOFTWARE. /* uint */ -#define ATTRUI( A, N, X, Y, Z, W) ATTR( A, N, GL_UNSIGNED_INT, \ -X, Y, Z, W ) - #define ATTR2UIV( A, V ) ATTRUI( A, 2, (V)[0], (V)[1], 0, 1 ) #define ATTR3UIV( A, V ) ATTRUI( A, 3, (V)[0], (V)[1], (V)[2], 1 ) #define ATTR4UIV( A, V ) ATTRUI( A, 4, (V)[0], (V)[1], (V)[2], (V)[3] ) @@ -82,7 +74,7 @@ USE OR OTHER DEALINGS IN THE SOFTWARE. #define ATTR3UI( A, X, Y, Z )ATTRUI( A, 3, X, Y, Z, 1 ) #define ATTR4UI( A, X, Y, Z, W ) ATTRUI( A, 4, X, Y, Z, W ) -#define MAT_ATTR( A, N, V ) ATTR( A, N, GL_FLOAT, (V)[0], (V)[1], (V)[2], (V)[3] ) +#define MAT_ATTR( A, N, V ) ATTRF( A, N, (V)[0], (V)[1], (V)[2], (V)[3] ) static inline float conv_ui10_to_norm_float(unsigned ui10) { @@ -94,20 +86,20 @@ static inline float conv_ui2_to_norm_float(unsigned ui2) return ui2 / 3.0f; } -#define ATTRUI10_1( A, UI ) ATTR( A, 1, GL_FLOAT, (UI) 0x3ff, 0, 0, 1 ) -#define ATTRUI10_2( A, UI ) ATTR( A, 2, GL_FLOAT, (UI) 0x3ff, ((UI) 10) 0x3ff, 0, 1 ) -#define ATTRUI10_3( A, UI ) ATTR( A, 3, GL_FLOAT, (UI) 0x3ff, ((UI) 10) 0x3ff, ((UI) 20) 0x3ff, 1 ) -#define ATTRUI10_4( A, UI ) ATTR( A, 4, GL_FLOAT, (UI) 0x3ff, ((UI) 10) 0x3ff, ((UI) 20) 0x3ff, ((UI) 30) 0x3 ) +#define ATTRUI10_1( A, UI ) ATTRF( A, 1, (UI) 0x3ff, 0, 0, 1 ) +#define ATTRUI10_2( A, UI ) ATTRF( A, 2, (UI) 0x3ff, ((UI) 10) 0x3ff, 0, 1 ) +#define ATTRUI10_3( A, UI ) ATTRF( A, 3, (UI) 0x3ff, ((UI) 10) 0x3ff, ((UI) 20) 0x3ff, 1 ) +#define ATTRUI10_4( A, UI ) ATTRF( A, 4, (UI) 0x3ff, ((UI) 10) 0x3ff, ((UI) 20) 0x3ff, ((UI) 30) 0x3 ) -#define ATTRUI10N_1( A, UI ) ATTR( A, 1, GL_FLOAT, conv_ui10_to_norm_float((UI) 0x3ff), 0, 0, 1 ) -#define ATTRUI10N_2( A, UI ) ATTR( A, 2, GL_FLOAT, \ +#define ATTRUI10N_1( A, UI ) ATTRF( A, 1,
Re: [Mesa-dev] [PATCH 2/2] nir: Fix non-determinism in nir_lower_vars_to_ssa().
On Mon, Mar 9, 2015 at 7:04 PM, Jason Ekstrand ja...@jlekstrand.net wrote: Push it! Our policy is to wait a day for most things. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] Whitespace cleanup
On 09.03.2015 18:06, Giuseppe Bilotta wrote: On Mon, Mar 9, 2015 at 5:01 AM, Michel Dänzer mic...@daenzer.net wrote: The shortlog of patch 4 should be prefixed by gallium: as well. Duh, I forgot the prefix everywhere. And the signoff line. Specifically about the last patch, the one that actually touches clover, is there a criteria for when to use gallium: and when to use clover: as prefix? I've seen either being used when looking at the log, sometimes even both at the same time. A change which touches only clover code should have clover: or st/clover: as the prefix. Maybe some of the changes you saw were more general changes involving other parts of Gallium as well. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
On Mon, Mar 9, 2015 at 7:32 PM, Jason Ekstrand ja...@jlekstrand.net wrote: On Mon, Mar 9, 2015 at 7:24 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 6:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field), \ * __next = \ exec_node_data(__type, (__node)-__field.next, __field); \ -__next != NULL; \ +__next-__field != NULL; \ I'm not understanding now the address of __next-__field can ever be NULL. __next is something with an embedded struct exec_node, so don't we want __next-__field != NULL without the address-of operator? No, __field is the name of the exec_node field embedded in the __type struct. So if I have struct foo { /* stuff */ struct exec_node bar; } as my type, __type is struct foo, __next and __node are both of type __type *, and __field is bar. So, in order to get form __next to an exec_node, you have to do __next-__field because we need the actual exec_node back. More concretely, for something like: foreach_list_typed_safe (bblock_link, successor, link, predecessor-block-children) I think this macro expands to (after this patch) for (bblock_link * successor = exec_node_data(bblock_link, (predecessor-block-children)-head, link), * __next = exec_node_data(bblock_link, (successor)-link.next, link); __next-link != NULL; successor = __next, __next = exec_node_data(bblock_link, (__next)-link.next, link)) How can the address of a field in a struct (e.g., __next-link) be NULL? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
On Mon, Mar 9, 2015 at 10:54 PM, Jason Ekstrand ja...@jlekstrand.net wrote: On Mon, Mar 9, 2015 at 7:48 PM, Connor Abbott cwabbo...@gmail.com wrote: On Mon, Mar 9, 2015 at 10:35 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 7:24 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 6:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field), \ * __next = \ exec_node_data(__type, (__node)-__field.next, __field); \ -__next != NULL; \ +__next-__field != NULL; \ I'm not understanding now the address of __next-__field can ever be NULL. __next is something with an embedded struct exec_node, so don't we want __next-__field != NULL without the address-of operator? Sorry, that should have been exec_node_is_tail_sentinel(__next-__field) No, that won't work. We want to bail out if the *current* node is the tail sentinel, not the next one. If the next node is the tail sentinel, then the current one is the last element, so we have to go through the loop once more. We could use exec_node_is_tail_sentinel() on the current node, No, we can't. The whole point of the *_safe versions is that we never touch the current node after we've let the client execute code. We stash off the next one so that, even if the delete the current one, we still have something to give them next iteration. --Jason Ah, right. My only concern is that doing pointer arithmetic on NULL isn't defined, and given that what we're doing involves underflowing the pointer so it wraps around to a large address (when subtracting in exec_node_get_data()) and then overflowing it back to 0 (when doing __next-field), it's likely that some compiler might come along and optimize this. but we've already dereferenced the pointer earlier when getting the next node so it would be less efficient to dereference it again. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
On Mon, Mar 9, 2015 at 7:59 PM, Connor Abbott cwabbo...@gmail.com wrote: On Mon, Mar 9, 2015 at 10:54 PM, Jason Ekstrand ja...@jlekstrand.net wrote: On Mon, Mar 9, 2015 at 7:48 PM, Connor Abbott cwabbo...@gmail.com wrote: On Mon, Mar 9, 2015 at 10:35 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 7:24 PM, Matt Turner matts...@gmail.com wrote: On Mon, Mar 9, 2015 at 6:36 PM, Kenneth Graunke kenn...@whitecape.org wrote: From: Jason Ekstrand jason.ekstr...@intel.com __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/list.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index ddb98f7..680e963 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before) exec_node_data(__type, (__list)-head, __field), \ * __next = \ exec_node_data(__type, (__node)-__field.next, __field); \ -__next != NULL; \ +__next-__field != NULL; \ I'm not understanding now the address of __next-__field can ever be NULL. __next is something with an embedded struct exec_node, so don't we want __next-__field != NULL without the address-of operator? Sorry, that should have been exec_node_is_tail_sentinel(__next-__field) No, that won't work. We want to bail out if the *current* node is the tail sentinel, not the next one. If the next node is the tail sentinel, then the current one is the last element, so we have to go through the loop once more. We could use exec_node_is_tail_sentinel() on the current node, No, we can't. The whole point of the *_safe versions is that we never touch the current node after we've let the client execute code. We stash off the next one so that, even if the delete the current one, we still have something to give them next iteration. --Jason Ah, right. My only concern is that doing pointer arithmetic on NULL isn't defined, and given that what we're doing involves underflowing the pointer so it wraps around to a large address (when subtracting in exec_node_get_data()) and then overflowing it back to 0 (when doing __next-field), it's likely that some compiler might come along and optimize this. We could cast everything through uintptr_t but that's gonna get messy... but we've already dereferenced the pointer earlier when getting the next node so it would be less efficient to dereference it again. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] Whitespace cleanup
On Mon, Mar 9, 2015 at 5:01 AM, Michel Dänzer mic...@daenzer.net wrote: The shortlog of patch 4 should be prefixed by gallium: as well. Duh, I forgot the prefix everywhere. And the signoff line. Specifically about the last patch, the one that actually touches clover, is there a criteria for when to use gallium: and when to use clover: as prefix? I've seen either being used when looking at the log, sometimes even both at the same time. (Just so that I can try and get it right on the first submission --hahah-- for the next patchset). -- Giuseppe Oblomov Bilotta ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/9] i965/fs: Refactor fs_visitor::nir_setup_inputs().
No functional change. In preparation for supporting vertex shaders, this adds a switch statement on shader stage (since vertex attributes and fragment shader varyings will need different handling). It also renames varying to input, to be more general. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 29 +++-- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index d700523..3baafc4 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -199,18 +199,27 @@ fs_visitor::nir_setup_inputs(nir_shader *shader) struct hash_entry *entry; hash_table_foreach(shader-inputs, entry) { nir_variable *var = (nir_variable *) entry-data; - fs_reg varying = offset(nir_inputs, var-data.driver_location); + fs_reg input = offset(nir_inputs, var-data.driver_location); fs_reg reg; - if (var-data.location == VARYING_SLOT_POS) { - reg = *emit_fragcoord_interpolation(var-data.pixel_center_integer, - var-data.origin_upper_left); - emit_percomp(MOV(varying, reg), 0xF); - } else { - emit_general_interpolation(varying, var-name, var-type, -(glsl_interp_qualifier) var-data.interpolation, -var-data.location, var-data.centroid, -var-data.sample); + switch (stage) { + case MESA_SHADER_VERTEX: + case MESA_SHADER_GEOMETRY: + case MESA_SHADER_COMPUTE: + unreachable(fs_visitor not used for these stages yet.); + break; + case MESA_SHADER_FRAGMENT: + if (var-data.location == VARYING_SLOT_POS) { +reg = *emit_fragcoord_interpolation(var-data.pixel_center_integer, +var-data.origin_upper_left); +emit_percomp(MOV(input, reg), 0xF); + } else { +emit_general_interpolation(input, var-name, var-type, + (glsl_interp_qualifier) var-data.interpolation, + var-data.location, var-data.centroid, + var-data.sample); + } + break; } } } -- 2.2.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 8/9] i965/fs: Add VS output support to nir_setup_outputs().
Adapted from fs_visitor::visit(ir_variable *). Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 1734d03..d03a337 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -255,7 +255,17 @@ fs_visitor::nir_setup_outputs(nir_shader *shader) nir_variable *var = (nir_variable *) entry-data; fs_reg reg = offset(nir_outputs, var-data.driver_location); - if (var-data.index 0) { + int vector_elements = + var-type-is_array() ? var-type-fields.array-vector_elements + : var-type-vector_elements; + + if (stage == MESA_SHADER_VERTEX) { + for (int i = 0; i ALIGN(type_size(var-type), 4) / 4; i++) { +int output = var-data.location + i; +this-outputs[output] = offset(reg, 4 * i); +this-output_components[output] = vector_elements; + } + } else if (var-data.index 0) { assert(var-data.location == FRAG_RESULT_DATA0); assert(var-data.index == 1); this-dual_src_output = reg; @@ -275,10 +285,6 @@ fs_visitor::nir_setup_outputs(nir_shader *shader) assert(var-data.location = FRAG_RESULT_DATA0 var-data.location FRAG_RESULT_DATA0 + BRW_MAX_DRAW_BUFFERS); - int vector_elements = -var-type-is_array() ? var-type-fields.array-vector_elements - : var-type-vector_elements; - /* General color output. */ for (unsigned int i = 0; i MAX2(1, var-type-length); i++) { int output = var-data.location - FRAG_RESULT_DATA0 + i; -- 2.2.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/9] i965/nir: Optimize after nir_lower_var_copies().
Array variable copy splitting generates a bunch of stuff we want to clean up before proceeding. Signed-off-by: Kenneth Graunke kenn...@whitecape.org Cc: Jason Ekstrand ja...@jlekstrand.net --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 249e59c..fbdfc22 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -102,6 +102,9 @@ fs_visitor::emit_nir_code() nir_lower_var_copies(nir); nir_validate_shader(nir); + /* Get rid of split copies */ + nir_optimize(nir); + nir_lower_io(nir); nir_validate_shader(nir); -- 2.2.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 9/9] i965: Use NIR for scalar VS when INTEL_USE_NIR is set.
Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_fs.cpp | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 428234f..ee5bc4a 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3818,12 +3818,17 @@ fs_visitor::run_vs() if (INTEL_DEBUG DEBUG_SHADER_TIME) emit_shader_time_begin(); - foreach_in_list(ir_instruction, ir, shader-base.ir) { - base_ir = ir; - this-result = reg_undef; - ir-accept(this); + if (getenv(INTEL_USE_NIR) != NULL) { + emit_nir_code(); + } else { + foreach_in_list(ir_instruction, ir, shader-base.ir) { + base_ir = ir; + this-result = reg_undef; + ir-accept(this); + } + base_ir = NULL; } - base_ir = NULL; + if (failed) return false; -- 2.2.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev