[Mesa-dev] [PATCH] nv50/ir: process texture offset sources as regular sources
With ARB_gpu_shader5, texture offsets can be any source, including TEMPs
and IN's. Make sure to process them as regular sources so that we pick up
masks, etc.

This should fix some CTS tests that feed offsets directly to
textureGatherOffset, and we were not picking up the input use, thus not
advertising it in the shader header.

Signed-off-by: Ilia Mirkin
Cc: mesa-sta...@lists.freedesktop.org
---
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 146 +
 1 file changed, 93 insertions(+), 53 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index fe71f58..05076e1 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -182,6 +182,7 @@ public:
 
    // mask of used components of source s
    unsigned int srcMask(unsigned int s) const;
+   unsigned int texOffsetMask() const;
 
    SrcRegister getSrc(unsigned int s) const
    {
@@ -234,6 +235,34 @@ private:
    const struct tgsi_full_instruction *insn;
 };
 
+unsigned int Instruction::texOffsetMask() const
+{
+   const struct tgsi_instruction_texture *tex = &insn->Texture;
+   assert(insn->Instruction.Texture);
+
+   switch (tex->Texture) {
+   case TGSI_TEXTURE_BUFFER:
+   case TGSI_TEXTURE_1D:
+   case TGSI_TEXTURE_1D_ARRAY:
+   case TGSI_TEXTURE_SHADOW1D_ARRAY:
+      return 0x1;
+   case TGSI_TEXTURE_2D:
+   case TGSI_TEXTURE_SHADOW2D:
+   case TGSI_TEXTURE_2D_ARRAY:
+   case TGSI_TEXTURE_SHADOW2D_ARRAY:
+   case TGSI_TEXTURE_RECT:
+   case TGSI_TEXTURE_SHADOWRECT:
+   case TGSI_TEXTURE_2D_MSAA:
+   case TGSI_TEXTURE_2D_ARRAY_MSAA:
+      return 0x3;
+   case TGSI_TEXTURE_3D:
+      return 0x7;
+   default:
+      assert(!"Unexpected texture target");
+      return 0xf;
+   }
+}
+
 unsigned int Instruction::srcMask(unsigned int s) const
 {
    unsigned int mask = insn->Dst[0].Register.WriteMask;
@@ -955,6 +984,9 @@ private:
    int inferSysValDirection(unsigned sn) const;
    bool scanDeclaration(const struct tgsi_full_declaration *);
    bool scanInstruction(const struct tgsi_full_instruction *);
+   void scanInstructionSrc(const Instruction& insn,
+                           const Instruction::SrcRegister& src,
+                           unsigned mask);
    void scanProperty(const struct tgsi_full_property *);
    void scanImmediate(const struct tgsi_full_immediate *);
 
@@ -1364,6 +1396,61 @@ inline bool Source::isEdgeFlagPassthrough(const Instruction& insn) const
       insn.getSrc(0).getFile() == TGSI_FILE_INPUT;
 }
 
+void Source::scanInstructionSrc(const Instruction& insn,
+                                const Instruction::SrcRegister& src,
+                                unsigned mask)
+{
+   if (src.getFile() == TGSI_FILE_TEMPORARY) {
+      if (src.isIndirect(0))
+         indirectTempArrays.insert(src.getArrayId());
+   } else
+   if (src.getFile() == TGSI_FILE_BUFFER ||
+       src.getFile() == TGSI_FILE_IMAGE ||
+       (src.getFile() == TGSI_FILE_MEMORY &&
+        memoryFiles[src.getIndex(0)].mem_type == TGSI_MEMORY_TYPE_GLOBAL)) {
+      info->io.globalAccess |= (insn.getOpcode() == TGSI_OPCODE_LOAD) ?
+         0x1 : 0x2;
+   } else
+   if (src.getFile() == TGSI_FILE_OUTPUT) {
+      if (src.isIndirect(0)) {
+         // We don't know which one is accessed, just mark everything for
+         // reading. This is an extremely unlikely occurrence.
+         for (unsigned i = 0; i < info->numOutputs; ++i)
+            info->out[i].oread = 1;
+      } else {
+         info->out[src.getIndex(0)].oread = 1;
+      }
+   }
+   if (src.getFile() != TGSI_FILE_INPUT)
+      return;
+
+   if (src.isIndirect(0)) {
+      for (unsigned i = 0; i < info->numInputs; ++i)
+         info->in[i].mask = 0xf;
+   } else {
+      const int i = src.getIndex(0);
+      for (unsigned c = 0; c < 4; ++c) {
+         if (!(mask & (1 << c)))
+            continue;
+         int k = src.getSwizzle(c);
+         if (k <= TGSI_SWIZZLE_W)
+            info->in[i].mask |= 1 << k;
+      }
+      switch (info->in[i].sn) {
+      case TGSI_SEMANTIC_PSIZE:
+      case TGSI_SEMANTIC_PRIMID:
+      case TGSI_SEMANTIC_FOG:
+         info->in[i].mask &= 0x1;
+         break;
+      case TGSI_SEMANTIC_PCOORD:
+         info->in[i].mask &= 0x3;
+         break;
+      default:
+         break;
+      }
+   }
+}
+
 bool Source::scanInstruction(const struct tgsi_full_instruction *inst)
 {
    Instruction insn(inst);
@@ -1396,66 +1483,19 @@ bool Source::scanInstruction(const struct tgsi_full_instruction *inst)
             indirectTempArrays.insert(dst.getArrayId());
       } else
       if (dst.getFile() == TGSI_FILE_BUFFER ||
-          dst.getFile() == TGSI_FILE_IMAGE ||
+          dst.getFile() == TGSI_FILE_IMAGE ||
           (dst.getFile() == TGSI_FILE_MEMORY &&
            memoryFiles[dst.getIndex(0)].mem_type == TGSI_MEMORY_TYPE_GLOBAL)) {
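As a rough illustration of the mask bookkeeping that scanInstructionSrc()
performs for inputs, the standalone sketch below combines a per-source
component mask with a swizzle to get the set of input components actually
read. The names usedInputComponents and Swizzle are hypothetical stand-ins
and are not part of the patch or of the TGSI headers.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a TGSI swizzle: for each of the four
// swizzled slots (x, y, z, w), the index (0..3) of the source
// component that is selected.
using Swizzle = std::array<unsigned, 4>;

// Returns the mask of source components that are really read, given
// `mask` (which swizzled slots the instruction consumes) and `swz`.
// This mirrors the loop in the patch: skip slots that are not in the
// mask, then set the bit of the component the swizzle points at.
inline unsigned usedInputComponents(unsigned mask, const Swizzle& swz)
{
   assert(mask <= 0xfu); // only four components exist

   unsigned used = 0;
   for (unsigned c = 0; c < 4; ++c) {
      if (!(mask & (1u << c)))
         continue;
      const unsigned k = swz[c];
      if (k <= 3) // ignore anything past W, as the patch does
         used |= 1u << k;
   }
   return used;
}
```

For a 2D texture offset (mask 0x3, as texOffsetMask() returns for 2D
targets) read through a .zwxx swizzle, usedInputComponents(0x3, {2, 3, 0, 0})
gives 0xc, so only z and w of the input are marked as used.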
Re: [Mesa-dev] [PATCH] anv: drop unused zero macro.
rb

On Tue, Oct 18, 2016 at 8:36 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> I can't see this being used anywhere.
>
> Signed-off-by: Dave Airlie
> ---
>  src/intel/vulkan/anv_private.h | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index 0e25827..3fe9d7d 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -163,8 +163,6 @@ anv_clear_mask(uint32_t *inout_mask, uint32_t clear_mask)
>        memcpy((dest), (src), (count) * sizeof(*(src))); \
>     })
>
> -#define zero(x) (memset(&(x), 0, sizeof(x)))
> -
>  /* Define no kernel as 1, since that's an illegal offset for a kernel */
>  #define NO_KERNEL 1
>
> --
> 2.5.5
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] glapi: Move PrimitiveBoundingBox and BlendBarrier definitions into ES3.2 category.
Ilia Mirkinwrites: > Why does it care where those functions are defined? I thought it was > all one big happy namespace, with the categories just there for > general amusement. Could you shed some light on what the actual > situation is? > Heh, I won't pretend to understand the dispatch generation mess, but apparently the gl_procs.py treats the ES (and GL_OES) categories specially and emits forward declarations for them before the actual table -- Possibly to hack around build failures with GLES entry points not defined in desktop GL headers. > On Tue, Oct 18, 2016 at 11:48 PM, Francisco Jerez > wrote: >> These two GLES 3.2 entry points were being defined in the category of >> the ARB_ES3_2_compatibility and KHR_blend_equation_advanced extensions >> respectively instead of in the ES3.2 category. Defining them in the >> ES3.2 category makes sure that the gl_procs.py generator emits >> declarations in the glprocs.h header file for the unsuffixed GLES-only >> entry points that PrimitiveBoundingBoxARB and BlendBarrierKHR >> respectively alias. This should avoid a compilation failure during >> scons builds in combination with "mapi: export all GLES 3.2 functions >> in libGLESv2.so". >> --- >> src/mapi/glapi/gen/gl_API.xml | 30 +- >> 1 file changed, 17 insertions(+), 13 deletions(-) >> >> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml >> index 5998ccf..00c9bb7 100644 >> --- a/src/mapi/glapi/gen/gl_API.xml >> +++ b/src/mapi/glapi/gen/gl_API.xml >> @@ -8296,6 +8296,23 @@ >> >> > xmlns:xi="http://www.w3.org/2001/XInclude"/> >> >> + >> + >> + >> + >> + >> + >> + >> + >> + >> + >> + >> + >> + >> + >> + >> + >> >> >> >> @@ -8316,7 +8333,6 @@ >> >> >> >> - >> >> >> >> @@ -8332,18 +8348,6 @@ >> >> >> >> - >> - >> - >> - >> - >> - >> - >> - >> - >> - >> - >> >> >> >> -- >> 2.9.0 >> signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nv50, nvc0: avoid reading out of bounds when getting bogus so info
The state tracker tries to attach the info to the wrong shader. This is
easy enough to protect against.

Signed-off-by: Ilia Mirkin
---
 src/gallium/drivers/nouveau/nv50/nv50_program.c | 3 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 7 +++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c b/src/gallium/drivers/nouveau/nv50/nv50_program.c
index 1e39427..9081cd8 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_program.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c
@@ -308,6 +308,9 @@ nv50_program_create_strmout_state(const struct nv50_ir_prog_info *info,
       const unsigned r = pso->output[i].register_index;
       b = pso->output[i].output_buffer;
 
+      if (r >= info->numOutputs)
+         continue;
+
       for (c = 0; c < pso->output[i].num_components; ++c)
          so->map[base[b] + p + c] = info->out[r].slot[s + c];
    }
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
index 867d84a..50f8083 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
@@ -509,11 +509,14 @@ nvc0_program_create_tfb_state(const struct nv50_ir_prog_info *info,
    for (i = 0; i < pso->num_outputs; ++i) {
       unsigned s = pso->output[i].start_component;
       unsigned p = pso->output[i].dst_offset;
+      const unsigned r = pso->output[i].register_index;
       b = pso->output[i].output_buffer;
 
+      if (r >= info->numOutputs)
+         continue;
+
       for (c = 0; c < pso->output[i].num_components; ++c)
-         tfb->varying_index[b][p++] =
-            info->out[pso->output[i].register_index].slot[s + c];
+         tfb->varying_index[b][p++] = info->out[r].slot[s + c];
 
       tfb->varying_count[b] = MAX2(tfb->varying_count[b], p);
       tfb->stream[b] = pso->output[i].stream;
--
2.7.3
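The guard added in both hunks can be shown in isolation. This is a minimal
sketch with made-up types (OutputInfo, SoOutput, buildVaryingIndex), not
the nouveau structures; it only illustrates skipping stream-output entries
whose register index does not exist in the shader's output table.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for the driver structures.
struct OutputInfo {
   unsigned slot[4]; // hardware slot of each component
};

struct SoOutput {
   unsigned register_index;
   unsigned start_component;
   unsigned num_components;
};

// Collects the hardware slots for each stream-output entry, skipping
// entries that reference an output the shader does not have.
inline std::vector<unsigned>
buildVaryingIndex(const std::vector<OutputInfo>& outs,
                  const std::vector<SoOutput>& so)
{
   std::vector<unsigned> map;
   for (const SoOutput& o : so) {
      const unsigned r = o.register_index;
      if (r >= outs.size())
         continue; // bogus entry: would read past the end of outs

      // Assumption of this sketch: the component range itself is sane.
      assert(o.start_component + o.num_components <= 4);

      for (unsigned c = 0; c < o.num_components; ++c)
         map.push_back(outs[r].slot[o.start_component + c]);
   }
   return map;
}
```

Skipping the entry, rather than asserting, matches the commit message's
intent of tolerating bogus info handed down by the state tracker.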
Re: [Mesa-dev] [PATCH 1/2] glapi: Move PrimitiveBoundingBox and BlendBarrier definitions into ES3.2 category.
Why does it care where those functions are defined? I thought it was all one big happy namespace, with the categories just there for general amusement. Could you shed some light on what the actual situation is? On Tue, Oct 18, 2016 at 11:48 PM, Francisco Jerezwrote: > These two GLES 3.2 entry points were being defined in the category of > the ARB_ES3_2_compatibility and KHR_blend_equation_advanced extensions > respectively instead of in the ES3.2 category. Defining them in the > ES3.2 category makes sure that the gl_procs.py generator emits > declarations in the glprocs.h header file for the unsuffixed GLES-only > entry points that PrimitiveBoundingBoxARB and BlendBarrierKHR > respectively alias. This should avoid a compilation failure during > scons builds in combination with "mapi: export all GLES 3.2 functions > in libGLESv2.so". > --- > src/mapi/glapi/gen/gl_API.xml | 30 +- > 1 file changed, 17 insertions(+), 13 deletions(-) > > diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml > index 5998ccf..00c9bb7 100644 > --- a/src/mapi/glapi/gen/gl_API.xml > +++ b/src/mapi/glapi/gen/gl_API.xml > @@ -8296,6 +8296,23 @@ > > xmlns:xi="http://www.w3.org/2001/XInclude"/> > > + > + > + > + > + > + > + > + > + > + > + > + > + > + > + > + > > > > @@ -8316,7 +8333,6 @@ > > > > - > > > > @@ -8332,18 +8348,6 @@ > > > > - > - > - > - > - > - > - > - > - > - > - > > > > -- > 2.9.0 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 98172] Concurrent call to glClientWaitSync results in segfault in one of the waiters.
https://bugs.freedesktop.org/show_bug.cgi?id=98172

--- Comment #28 from Suzuki, Shinji ---
Yes. I agree with you that we can do without a per-sync-object mutex if we
allow all waiters to enter fence_finish() freely. With that said, a
per-sync-object mutex has another benefit: it potentially reduces lock
contention among waiters on differing sync objects, and with other Mesa
components that deal with shared resources. To be fair, I also have to
mention that ctx->Shared.Mutex is touched in so many places that trying to
optimize in this particular context alone may not make much sense, and
adding a mutex certainly has associated overhead. Overall, I vote +1 on
your strategy if free execution of fence_finish() is to be allowed.

On Wed, Oct 19, 2016 at 10:21 AM, wrote:
> Comment # 27 on bug 98172 from Michel Dänzer
>
> Note that if we allow concurrent fence_finish calls, I don't think we need a
> per-sync-object mutex.
>
> You are receiving this mail because:
>
> You reported the bug.
> You are on the CC list for the bug.

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
[Mesa-dev] [PATCH 2/2] Revert "Revert "mapi: export all GLES 3.2 functions in libGLESv2.so""
This reverts commit 85e9bbc14d93fa7166c9ae075ee7ae29a8313e3f. The previous
commit should help with the scons build failure caused by the original
commit.
---
 src/mapi/glapi/gen/static_data.py | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/src/mapi/glapi/gen/static_data.py b/src/mapi/glapi/gen/static_data.py
index 2f403e9..25e78bf 100644
--- a/src/mapi/glapi/gen/static_data.py
+++ b/src/mapi/glapi/gen/static_data.py
@@ -484,17 +484,22 @@ functions = [
     "BindVertexBuffer",
     "BindVertexBuffers",
     "Bitmap",
+    "BlendBarrier",
     "BlendColor",
     "BlendColorEXT",
     "BlendEquation",
     "BlendEquationEXT",
+    "BlendEquationi",
     "BlendEquationiARB",
     "BlendEquationSeparate",
+    "BlendEquationSeparatei",
     "BlendEquationSeparateiARB",
     "BlendFunc",
+    "BlendFunci",
     "BlendFunciARB",
     "BlendFuncSeparate",
     "BlendFuncSeparateEXT",
+    "BlendFuncSeparatei",
     "BlendFuncSeparateiARB",
     "BlitFramebuffer",
     "BufferData",
@@ -825,6 +830,7 @@ functions = [
     "GetFramebufferAttachmentParameteriv",
     "GetFramebufferAttachmentParameterivEXT",
     "GetFramebufferParameteriv",
+    "GetGraphicsResetStatus",
     "GetGraphicsResetStatusARB",
     "GetHandleARB",
     "GetHistogram",
@@ -864,8 +870,11 @@ functions = [
     "GetnSeparableFilterARB",
     "GetnTexImageARB",
     "GetnUniformdvARB",
+    "GetnUniformfv",
     "GetnUniformfvARB",
+    "GetnUniformiv",
     "GetnUniformivARB",
+    "GetnUniformuiv",
     "GetnUniformuivARB",
     "GetObjectLabel",
     "GetObjectParameterfvARB",
@@ -1160,6 +1169,7 @@ functions = [
     "Orthof",
     "Orthox",
     "PassThrough",
+    "PatchParameteri",
     "PauseTransformFeedback",
     "PixelMapfv",
     "PixelMapuiv",
@@ -1191,6 +1201,7 @@ functions = [
     "PopDebugGroup",
     "PopMatrix",
     "PopName",
+    "PrimitiveBoundingBox",
     "PrimitiveRestartIndex",
     "PrimitiveRestartIndexNV",
     "PrimitiveRestartNV",
@@ -1273,6 +1284,7 @@ functions = [
     "RasterPos4s",
     "RasterPos4sv",
     "ReadBuffer",
+    "ReadnPixels",
     "ReadnPixelsARB",
     "ReadPixels",
     "Rectd",
--
2.9.0
[Mesa-dev] [PATCH 1/2] glapi: Move PrimitiveBoundingBox and BlendBarrier definitions into ES3.2 category.
These two GLES 3.2 entry points were being defined in the category of the ARB_ES3_2_compatibility and KHR_blend_equation_advanced extensions respectively instead of in the ES3.2 category. Defining them in the ES3.2 category makes sure that the gl_procs.py generator emits declarations in the glprocs.h header file for the unsuffixed GLES-only entry points that PrimitiveBoundingBoxARB and BlendBarrierKHR respectively alias. This should avoid a compilation failure during scons builds in combination with "mapi: export all GLES 3.2 functions in libGLESv2.so". --- src/mapi/glapi/gen/gl_API.xml | 30 +- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 5998ccf..00c9bb7 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -8296,6 +8296,23 @@ http://www.w3.org/2001/XInclude"/> + + + + + + + + + + + + + + + + @@ -8316,7 +8333,6 @@ - @@ -8332,18 +8348,6 @@ - - - - - - - - - - - -- 2.9.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] anv: drop unused zero macro.
From: Dave Airlie

I can't see this being used anywhere.

Signed-off-by: Dave Airlie
---
 src/intel/vulkan/anv_private.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 0e25827..3fe9d7d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -163,8 +163,6 @@ anv_clear_mask(uint32_t *inout_mask, uint32_t clear_mask)
       memcpy((dest), (src), (count) * sizeof(*(src))); \
    })
 
-#define zero(x) (memset(&(x), 0, sizeof(x)))
-
 /* Define no kernel as 1, since that's an illegal offset for a kernel */
 #define NO_KERNEL 1
 
--
2.5.5
Re: [Mesa-dev] [PATCH 11/25] mesa/i965/i915/r200: eliminate gl_vertex_program
On Tue, 2016-10-18 at 12:07 -0700, Ian Romanick wrote:
> I'd like to see two tiny changes:
>
> 1. A comment for the IsPositionInvariant field that it can only be true
>    for vertex programs.

The comment already said that this is only used for assembly-style vertex
programs. I've reworded it to say it can only be true for them :)

> 2. An assertion or two like
>
>    assert(p->Target == GL_VERTEX_PROGRAM_ARB ||
>           !p->IsPositionInvariant);

I'm not sure how useful this is.

> in reasonable places. I'm thinking:
>
>  - Where it's assigned in src/mesa/program/arbprogparse.c

It's assigned in mesa_parse_arb_vertex_program(), and there is already an
assert(target == GL_VERTEX_PROGRAM_ARB);

>  - Where it's used in src/mesa/state_tracker/st_program.c,

Again, this is in st_translate_vertex_program(), so if it's not a vp we
already have problems.

>    src/mesa/drivers/dri/i965/brw_program.c, and

It's used inside case GL_VERTEX_PROGRAM_ARB: I've added the assert to the
top of the function, but it seems kind of pointless.

>    src/mesa/tnl/t_vb_program.c (both places).

In both of these the program always comes from
ctx->VertexProgram._Current, so it doesn't seem very useful here either.

> I'd also support a follow-up patch that converts IsPositionInvariant
> from GLboolean to bool. :)
>
> On 10/17/2016 11:12 PM, Timothy Arceri wrote:
> > Here we move the only field in gl_vertex_program to the
> > ARB program fields in gl_program.
> > ---
> >  src/mesa/drivers/common/meta.c                   | 10 +--
> >  src/mesa/drivers/common/meta.h                   |  2 +-
> >  src/mesa/drivers/dri/i915/i915_fragprog.c        |  4 +-
> >  src/mesa/drivers/dri/i965/brw_context.h          |  8 +--
> >  src/mesa/drivers/dri/i965/brw_curbe.c            |  2 +-
> >  src/mesa/drivers/dri/i965/brw_draw.c             |  4 +-
> >  src/mesa/drivers/dri/i965/brw_program.c          |  5 +-
> >  src/mesa/drivers/dri/i965/brw_vs.c               | 41 ++--
> >  src/mesa/drivers/dri/i965/brw_vs_surface_state.c |  2 +-
> >  src/mesa/drivers/dri/i965/gen6_vs_state.c        |  4 +-
> >  src/mesa/drivers/dri/r200/r200_context.h         |  2 +-
> >  src/mesa/drivers/dri/r200/r200_state_init.c      |  4 +-
> >  src/mesa/drivers/dri/r200/r200_tcl.c             |  2 +-
> >  src/mesa/drivers/dri/r200/r200_vertprog.c        | 82
> >  src/mesa/main/arbprogram.c                       | 19 +++---
> >  src/mesa/main/context.c                          |  8 +--
> >  src/mesa/main/ff_fragment_shader.cpp             |  2 +-
> >  src/mesa/main/ffvertex_prog.c                    | 72 ++---
> >  src/mesa/main/ffvertex_prog.h                    |  2 +-
> >  src/mesa/main/mtypes.h                           | 17 ++---
> >  src/mesa/main/shared.c                           |  5 +-
> >  src/mesa/main/state.c                            | 26
> >  src/mesa/main/state.h                            |  2 +-
> >  src/mesa/program/arbprogparse.c                  | 46 ++---
> >  src/mesa/program/arbprogparse.h                  |  2 +-
> >  src/mesa/program/prog_statevars.c                |  8 +--
> >  src/mesa/program/program.c                       | 15 ++---
> >  src/mesa/program/program.h                       | 26
> >  src/mesa/program/programopt.c                    | 42 ++--
> >  src/mesa/program/programopt.h                    |  2 +-
> >  src/mesa/state_tracker/st_atom.c                 |  4 +-
> >  src/mesa/state_tracker/st_atom_constbuf.c        |  2 +-
> >  src/mesa/state_tracker/st_atom_rasterizer.c      |  8 +--
> >  src/mesa/state_tracker/st_atom_sampler.c         |  2 +-
> >  src/mesa/state_tracker/st_atom_shader.c          |  4 +-
> >  src/mesa/state_tracker/st_atom_texture.c         |  2 +-
> >  src/mesa/state_tracker/st_cb_feedback.c          |  2 +-
> >  src/mesa/state_tracker/st_cb_program.c           |  2 +-
> >  src/mesa/state_tracker/st_debug.c                |  4 +-
> >  src/mesa/state_tracker/st_program.c              | 35 +-
> >  src/mesa/state_tracker/st_program.h              |  4 +-
> >  src/mesa/tnl/t_context.c                         |  4 +-
> >  src/mesa/tnl/t_vb_program.c                      | 24 +++---
> >  src/mesa/tnl/t_vp_build.c                        |  4 +-
> >  src/mesa/vbo/vbo_exec_draw.c                     |  4 +-
> >  src/mesa/vbo/vbo_save_draw.c                     |  4 +-
> >  46 files changed, 264 insertions(+), 311 deletions(-)
> >
> > diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> > index 890e98a..ab81eed 100644
> > --- a/src/mesa/drivers/common/meta.c
> > +++ b/src/mesa/drivers/common/meta.c
> > @@ -566,8 +566,8 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
> >
> >        if (ctx->Extensions.ARB_vertex_program) {
> >           save->VertexProgramEnabled = ctx->VertexProgram.Enabled;
> > -         _mesa_reference_vertprog(ctx, &save->VertexProgram,
> > -
[Mesa-dev] [Bug 98172] Concurrent call to glClientWaitSync results in segfault in one of the waiters.
https://bugs.freedesktop.org/show_bug.cgi?id=98172

--- Comment #27 from Michel Dänzer ---
Note that if we allow concurrent fence_finish calls, I don't think we need a
per-sync-object mutex.
[Mesa-dev] [Bug 98172] Concurrent call to glClientWaitSync results in segfault in one of the waiters.
https://bugs.freedesktop.org/show_bug.cgi?id=98172

--- Comment #26 from Michel Dänzer ---
(In reply to Marek Olšák from comment #24)
> Hm. Probably none.

Actually, I think there are: e.g. consider one thread calling
glClientWaitSync with a non-0 timeout, blocking for some time with the
mutex locked. If another thread calls glClientWaitSync with a 0 timeout
(or whichever API call ends up in st_check_sync) during that time, it'll
block until the first thread unlocks the mutex.
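One way to avoid the blocking described above is to hold the mutex only
long enough to take a reference to the fence, and to wait outside the
lock. The sketch below illustrates that idea with standard C++ types. It
is not the state tracker's code: Fence, SyncObject and their members are
hypothetical, and the polling finish() merely stands in for a driver's
fence_finish hook.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical fence: signaled once by whoever completes the work.
struct Fence {
   std::atomic<bool> signaled{false};

   // Polls until signaled or until the timeout expires. A zero
   // timeout performs a single check and never sleeps.
   bool finish(std::chrono::milliseconds timeout) const
   {
      const auto deadline = std::chrono::steady_clock::now() + timeout;
      for (;;) {
         if (signaled.load())
            return true;
         if (std::chrono::steady_clock::now() >= deadline)
            return false;
         std::this_thread::sleep_for(std::chrono::milliseconds(1));
      }
   }
};

class SyncObject {
public:
   explicit SyncObject(std::shared_ptr<Fence> f) : fence_(std::move(f))
   {
      assert(fence_ != nullptr);
   }

   // Returns true once the fence has signaled. The mutex protects only
   // the fence pointer, so a long wait in one thread cannot delay a
   // zero-timeout poll in another.
   bool wait(std::chrono::milliseconds timeout)
   {
      std::shared_ptr<Fence> local;
      {
         std::lock_guard<std::mutex> lock(mutex_);
         local = fence_; // take our own reference, then drop the lock
      }
      if (!local)
         return true; // an earlier waiter already saw it signaled

      if (!local->finish(timeout))
         return false;

      std::lock_guard<std::mutex> lock(mutex_);
      if (fence_ == local)
         fence_.reset(); // release the shared reference exactly once
      return true;
   }

private:
   std::mutex mutex_;
   std::shared_ptr<Fence> fence_;
};
```

Because every waiter holds its own reference while it waits, releasing
the shared reference in one thread cannot pull the fence out from under
another waiter.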
Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code
Hi Jan, On 18.10.2016 00:07, Jan Ziak wrote: This patch replaces the ir_variable_refcount_entry's linked-list with an array-list. The array-list has local storage which does not require ANY additional allocations if the list has small number of elements. The size of this storage is configurable for each variable. Benchmark results for "./run -1 shaders" from shader-db[1]: - The total number of executed instructions goes down from 64.184 to 63.797 giga-instructions when Mesa is compiled with "gcc -O0 ..." - In the call tree starting at function do_dead_code(): - the number of calls to malloc() is reduced by about 10% - the number of calls to free() is reduced by about 30% [1] git://anongit.freedesktop.org/mesa/shader-db Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0...@gmail.com> --- src/compiler/glsl/ir_variable_refcount.cpp | 14 +-- src/compiler/glsl/ir_variable_refcount.h | 8 +- src/compiler/glsl/opt_dead_code.cpp| 19 ++-- src/util/fast_list.h | 167 + 4 files changed, 176 insertions(+), 32 deletions(-) diff --git a/src/compiler/glsl/ir_variable_refcount.cpp b/src/compiler/glsl/ir_variable_refcount.cpp index 8306be1..94d6edc 100644 --- a/src/compiler/glsl/ir_variable_refcount.cpp +++ b/src/compiler/glsl/ir_variable_refcount.cpp @@ -46,15 +46,6 @@ static void free_entry(struct hash_entry *entry) { ir_variable_refcount_entry *ivre = (ir_variable_refcount_entry *) entry->data; - - /* Free assignment list */ - exec_node *n; - while ((n = ivre->assign_list.pop_head()) != NULL) { - struct assignment_entry *assignment_entry = - exec_node_data(struct assignment_entry, n, link); - free(assignment_entry); - } - delete ivre; } @@ -142,10 +133,7 @@ ir_variable_refcount_visitor::visit_leave(ir_assignment *ir) */ assert(entry->referenced_count >= entry->assigned_count); if (entry->referenced_count == entry->assigned_count) { - struct assignment_entry *assignment_entry = -(struct assignment_entry *)calloc(1, sizeof(*assignment_entry)); - assignment_entry->assign = 
ir; - entry->assign_list.push_head(_entry->link); + entry->assign_list.add(ir); } } diff --git a/src/compiler/glsl/ir_variable_refcount.h b/src/compiler/glsl/ir_variable_refcount.h index 08a11c0..c3ec5fe 100644 --- a/src/compiler/glsl/ir_variable_refcount.h +++ b/src/compiler/glsl/ir_variable_refcount.h @@ -32,11 +32,7 @@ #include "ir.h" #include "ir_visitor.h" #include "compiler/glsl_types.h" - -struct assignment_entry { - exec_node link; - ir_assignment *assign; -}; +#include "util/fast_list.h" class ir_variable_refcount_entry { @@ -50,7 +46,7 @@ public: * This is intended to be used for dead code optimisation and may * not be a complete list. */ - exec_list assign_list; + arraylistassign_list; /** Number of times the variable is referenced, including assignments. */ unsigned referenced_count; diff --git a/src/compiler/glsl/opt_dead_code.cpp b/src/compiler/glsl/opt_dead_code.cpp index 75e668a..06e8c3d 100644 --- a/src/compiler/glsl/opt_dead_code.cpp +++ b/src/compiler/glsl/opt_dead_code.cpp @@ -52,7 +52,7 @@ do_dead_code(exec_list *instructions, bool uniform_locations_assigned) struct hash_entry *e; hash_table_foreach(v.ht, e) { - ir_variable_refcount_entry *entry = (ir_variable_refcount_entry *)e->data; + ir_variable_refcount_entry *const entry = (ir_variable_refcount_entry *)e->data; /* Since each assignment is a reference, the refereneced count must be * greater than or equal to the assignment count. If they are equal, @@ -89,7 +89,7 @@ do_dead_code(exec_list *instructions, bool uniform_locations_assigned) if (entry->var->data.always_active_io) continue; - if (!entry->assign_list.is_empty()) { + if (!entry->assign_list.empty()) { /* Remove all the dead assignments to the variable we found. * Don't do so if it's a shader or function output, though. 
*/ @@ -98,26 +98,19 @@ do_dead_code(exec_list *instructions, bool uniform_locations_assigned) entry->var->data.mode != ir_var_shader_out && entry->var->data.mode != ir_var_shader_storage) { -while (!entry->assign_list.is_empty()) { - struct assignment_entry *assignment_entry = - exec_node_data(struct assignment_entry, - entry->assign_list.get_head_raw(), link); - - assignment_entry->assign->remove(); - +for(ir_assignment *assign : entry->assign_list) { The original code separates control flow instructions as for or while with a space before the brace, aka "for (...". This applies for all the code. + assign->remove(); if (debug)
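The contents of util/fast_list.h are not shown in this message, so the
following is only a minimal sketch of the general technique the commit
message describes: an array-backed list with inline storage that needs no
heap allocation while it holds at most N elements. SmallList and
usesLocalStorage() are hypothetical names; add() and empty() mirror the
calls visible in the diff, and the range-for support mirrors the loop in
do_dead_code().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

// Array-backed list with N elements of inline storage. Restricted to
// trivially copyable types (such as raw pointers) so that growing can
// use memcpy.
template <typename T, std::size_t N>
class SmallList {
   static_assert(N > 0, "inline capacity must be non-zero");
   static_assert(std::is_trivially_copyable<T>::value,
                 "sketch only supports trivially copyable elements");

public:
   SmallList() : data_(local_), size_(0), capacity_(N) {}
   ~SmallList()
   {
      if (data_ != local_)
         std::free(data_);
   }
   SmallList(const SmallList&) = delete;
   SmallList& operator=(const SmallList&) = delete;

   void add(const T& value)
   {
      if (size_ == capacity_)
         grow();
      data_[size_++] = value;
   }

   bool empty() const { return size_ == 0; }
   std::size_t size() const { return size_; }
   bool usesLocalStorage() const { return data_ == local_; }

   // Plain pointers are enough for range-based for loops.
   T* begin() { return data_; }
   T* end() { return data_ + size_; }

private:
   // Doubles the capacity, moving the elements to the heap.
   void grow()
   {
      const std::size_t newCapacity = capacity_ * 2;
      T* p = static_cast<T*>(std::malloc(newCapacity * sizeof(T)));
      if (!p)
         std::abort();
      std::memcpy(p, data_, size_ * sizeof(T));
      if (data_ != local_)
         std::free(data_);
      data_ = p;
      capacity_ = newCapacity;
      assert(size_ < capacity_);
   }

   T local_[N];
   T* data_;
   std::size_t size_;
   std::size_t capacity_;
};
```

With a list like this, a variable with only a few dead assignments never
touches malloc() or free(), which fits the reduction in allocator calls
the commit message reports.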
Re: [Mesa-dev] glsl: optimize list handling in opt_dead_code
On wtorek, 18 października 2016 00:07:18 CEST Jan Ziak wrote: > This patch replaces the ir_variable_refcount_entry's linked-list > with an array-list. > > The array-list has local storage which does not require ANY additional > allocations if the list has small number of elements. The size of this > storage is configurable for each variable. > > Benchmark results for "./run -1 shaders" from shader-db[1]: > > - The total number of executed instructions goes down from 64.184 to 63.797 > giga-instructions when Mesa is compiled with "gcc -O0 ..." Hi, A total number of instructions in -O0 is not a good indicator of whether this change is beneficial from performance POV. You should check it with -O2 or whatever is the default in mesa release builds. > - In the call tree starting at function do_dead_code(): > - the number of calls to malloc() is reduced by about 10% > - the number of calls to free() is reduced by about 30% These are certainly a win. > > [1] git://anongit.freedesktop.org/mesa/shader-db > > Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0...@gmail.com > > --- > src/compiler/glsl/ir_variable_refcount.cpp | 14 +-- > src/compiler/glsl/ir_variable_refcount.h | 8 +- > src/compiler/glsl/opt_dead_code.cpp| 19 ++-- > src/util/fast_list.h | 167 + > 4 files changed, 176 insertions(+), 32 deletions(-) > > diff --git a/src/compiler/glsl/ir_variable_refcount.cpp b/src/compiler/glsl/ir_variable_refcount.cpp > index 8306be1..94d6edc 100644 > --- a/src/compiler/glsl/ir_variable_refcount.cpp > +++ b/src/compiler/glsl/ir_variable_refcount.cpp > @@ -46,15 +46,6 @@ static void > free_entry(struct hash_entry *entry) > { > ir_variable_refcount_entry *ivre = (ir_variable_refcount_entry *) entry->data; > - > - /* Free assignment list */ > - exec_node *n; > - while ((n = ivre->assign_list.pop_head()) != NULL) { > - struct assignment_entry *assignment_entry = > - exec_node_data(struct assignment_entry, n, link); > - free(assignment_entry); > - } > - > delete ivre; > } > 
> @@ -142,10 +133,7 @@ ir_variable_refcount_visitor::visit_leave(ir_assignment *ir) > */ >assert(entry->referenced_count >= entry->assigned_count); >if (entry->referenced_count == entry->assigned_count) { > - struct assignment_entry *assignment_entry = > -(struct assignment_entry *)calloc(1, sizeof(*assignment_entry)); > - assignment_entry->assign = ir; > - entry->assign_list.push_head(_entry->link); > + entry->assign_list.add(ir); >} > } > > diff --git a/src/compiler/glsl/ir_variable_refcount.h b/src/compiler/glsl/ir_variable_refcount.h > index 08a11c0..c3ec5fe 100644 > --- a/src/compiler/glsl/ir_variable_refcount.h > +++ b/src/compiler/glsl/ir_variable_refcount.h > @@ -32,11 +32,7 @@ > #include "ir.h" > #include "ir_visitor.h" > #include "compiler/glsl_types.h" > - > -struct assignment_entry { > - exec_node link; > - ir_assignment *assign; > -}; > +#include "util/fast_list.h" > > class ir_variable_refcount_entry > { > @@ -50,7 +46,7 @@ public: > * This is intended to be used for dead code optimisation and may > * not be a complete list. > */ > - exec_list assign_list; > + arraylistassign_list; > > /** Number of times the variable is referenced, including assignments. */ > unsigned referenced_count; > diff --git a/src/compiler/glsl/opt_dead_code.cpp b/src/compiler/glsl/opt_dead_code.cpp > index 75e668a..06e8c3d 100644 > --- a/src/compiler/glsl/opt_dead_code.cpp > +++ b/src/compiler/glsl/opt_dead_code.cpp > @@ -52,7 +52,7 @@ do_dead_code(exec_list *instructions, bool uniform_locations_assigned) > > struct hash_entry *e; > hash_table_foreach(v.ht, e) { > - ir_variable_refcount_entry *entry = (ir_variable_refcount_entry *)e->data; > + ir_variable_refcount_entry *const entry = (ir_variable_refcount_entry *)e->data; > >/* Since each assignment is a reference, the refereneced count must be > * greater than or equal to the assignment count. 
If they are equal, > @@ -89,7 +89,7 @@ do_dead_code(exec_list *instructions, bool uniform_locations_assigned) >if (entry->var->data.always_active_io) > continue; > > - if (!entry->assign_list.is_empty()) { > + if (!entry->assign_list.empty()) { > /* Remove all the dead assignments to the variable we found. >* Don't do so if it's a shader or function output, though. >*/ > @@ -98,26 +98,19 @@ do_dead_code(exec_list *instructions, bool uniform_locations_assigned) > entry->var->data.mode != ir_var_shader_out && > entry->var->data.mode != ir_var_shader_storage) { > > -while (!entry->assign_list.is_empty()) { > - struct assignment_entry *assignment_entry = > - exec_node_data(struct assignment_entry, > -
Re: [Mesa-dev] [PATCH v2 103/103] i965/gen7: expose OpenGL 4.0 on Haswell
On Tuesday, October 18, 2016 5:12:27 PM PDT Ian Romanick wrote: > On 10/11/2016 02:02 AM, Iago Toral Quiroga wrote: > > ARB_gpu_shader_fp64 was the last piece missing. Notice that some > > hardware and kernel combinations do not support pipelined register > > writes, which are required for some OpenGL 4.0 features, in which > > case the driver won't expose 4.0. > > --- > > src/mesa/drivers/dri/i965/intel_extensions.c | 2 ++ > > src/mesa/drivers/dri/i965/intel_screen.c | 2 +- > > 2 files changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c > > b/src/mesa/drivers/dri/i965/intel_extensions.c > > index 0491145..a291cd5 100644 > > --- a/src/mesa/drivers/dri/i965/intel_extensions.c > > +++ b/src/mesa/drivers/dri/i965/intel_extensions.c > > @@ -272,6 +272,8 @@ intelInitExtensions(struct gl_context *ctx) > > > > if (brw->gen >= 8) > >ctx->Const.GLSLVersion = 440; > > + else if (brw->is_haswell) > > + ctx->Const.GLSLVersion = 400; > > else if (brw->gen >= 6) > >ctx->Const.GLSLVersion = 330; > > else > > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c > > b/src/mesa/drivers/dri/i965/intel_screen.c > > index 9b23bac..1af7fe6 100644 > > --- a/src/mesa/drivers/dri/i965/intel_screen.c > > +++ b/src/mesa/drivers/dri/i965/intel_screen.c > > @@ -1445,7 +1445,7 @@ set_max_gl_versions(struct intel_screen *screen) > >dri_screen->max_gl_es2_version = has_astc ? 32 : 31; > >break; > > case 7: > > - dri_screen->max_gl_core_version = 33; > > + dri_screen->max_gl_core_version = screen->devinfo.is_haswell ? 40 : > > 33; > > I *think* this needs to take the pipelined register writes into > consideration. My understanding is if you say 40 here, then > glXCreateContextAttribs will allow creation of an OpenGL 4.0 context... > but the context may only be 3.3. Good catch, Ian. 
Checking brw->can_do_pipelined_register_writes here would be right...but it's awkward, since it's stored in the context, and doesn't get populated until we actually make a context and run things on the GPU. That's probably not too feasible here in screen init time, where we're trying to decide what kind of contexts to even support. To make life easier, I might just do: dri_screen->max_gl_core_version = screen->has_mi_math_and_lrr ? 40 : 33; which is the check we use for ARB_query_buffer_object. On Haswell, it implies a high enough command parser version that we can do everything we need. (We could actually get away with an older kernel version, but I'm not sure I care...as we move toward 4.1/4.2/4.3 we'd need to bump it higher anyway...) The one gotcha is that has_mi_math_and_lrr / cmd_parser_version get initialized after set_max_gl_versions() is called, so you'll need to reorder those in the caller. Should be straightforward. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
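The ordering gotcha described above can be sketched as follows. This is only a minimal model, not the driver code: the `Screen` struct, `probe_kernel_features()`, `set_max_gl_versions_hsw()` and `advertised_version()` are hypothetical names, and the real driver derives the feature bit from the kernel's command parser version, which the sketch does not model.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the screen state discussed above.
struct Screen {
    bool has_mi_math_and_lrr = false;  // unknown until the kernel is probed
    int max_gl_core_version = 0;
};

// Stand-in for the probing step (assumption: reduced to a single bool).
void probe_kernel_features(Screen &screen, bool kernel_supports_mi_math_and_lrr)
{
    screen.has_mi_math_and_lrr = kernel_supports_mi_math_and_lrr;
}

// Haswell branch only: advertise 4.0 when the feature bit is set, else 3.3.
void set_max_gl_versions_hsw(Screen &screen)
{
    screen.max_gl_core_version = screen.has_mi_math_and_lrr ? 40 : 33;
}

// Runs both steps in the requested order and returns the advertised version.
int advertised_version(bool kernel_ok, bool probe_first)
{
    Screen screen;
    if (probe_first) {
        probe_kernel_features(screen, kernel_ok);
        set_max_gl_versions_hsw(screen);
    } else {
        // Wrong order: the version is chosen before the bit is populated.
        set_max_gl_versions_hsw(screen);
        probe_kernel_features(screen, kernel_ok);
    }
    assert(screen.max_gl_core_version == 33 || screen.max_gl_core_version == 40);
    return screen.max_gl_core_version;
}
```

With a capable kernel, probing first yields 40, while the reversed order silently falls back to 33, which is why the two calls would need to be reordered in the caller.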
[Mesa-dev] [Bug 98271] [radeonsi]Playing videos with vdpau or vaapi hardware acceleration crashes my pc
https://bugs.freedesktop.org/show_bug.cgi?id=98271 --- Comment #20 from John --- > Installing an older kernel, see if that works with 12.0 mesa. > If yes we have narrowed it down to the kernel, if not we > need to stick a bit more into mesa. I've tried with a 3.18 kernel and still got the issue, so the issue is not in the kernel. I used the firmware files from that date as well to eliminate that possibility. > Another possibility which came to my mind is that this might not > be an issue with UVD decoding, but rather presenting it. > E.g. install both VDPAU and OpenGL from a certain Mesa version > *AND* make sure that you restart X after that so that the > X acceleration uses the new library versions as well. Now this is interesting, as my reboots so far only happened after a freeze, never to test a certain Mesa version. I've rolled back to 11 and restarted the computer and will try. Since you mentioned presenting, could it be the DDX? New information: I don't need to have the video on screen for the issue to happen. I can alt-tab or switch to another virtual desktop while the script runs and it still freezes.
Re: [Mesa-dev] [PATCH v2 021/103] i965/vec4: implement double unpacking
This patch is Reviewed-by: Ian Romanick On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: > --- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 12 > 1 file changed, 12 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > index 04f70ef..2631bf3 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > @@ -1538,6 +1538,18 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) >break; > } > > + case nir_op_unpack_double_2x32_split_x: > + case nir_op_unpack_double_2x32_split_y: { > + enum opcode oper = (instr->op == nir_op_unpack_double_2x32_split_x) ? > + VEC4_OPCODE_PICK_LOW_32BIT : VEC4_OPCODE_PICK_HIGH_32BIT; > + dst_reg tmp = dst_reg(this, glsl_type::dvec4_type); > + emit(MOV(tmp, op[0])); > + dst_reg tmp2 = dst_reg(this, glsl_type::uvec4_type); > + emit(oper, tmp2, src_reg(tmp)); > + emit(MOV(dst, src_reg(tmp2))); > + break; > + } > + > case nir_op_unpack_half_2x16: >/* As NIR does not guarantee that we have a correct swizzle outside the > * boundaries of a vector, and the implementation of > emit_unpack_half_2x16 >
Re: [Mesa-dev] [PATCH v2 020/103] i965/vec4: don't copy propagate vector opcodes that operate in align1 mode
This patch is Reviewed-by: Ian Romanick On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: > Basically, ALIGN1 mode will ignore swizzles on the input vectors so we don't > want the copy propagation pass to mess with them. > --- > .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 24 > ++ > 1 file changed, 24 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp > index 545f4c7..d0045a7 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp > @@ -283,6 +283,22 @@ try_constant_propagate(const struct gen_device_info > *devinfo, > } > > static bool > +is_align1_opcode(unsigned opcode) > +{ > + switch (opcode) { > + case VEC4_OPCODE_DOUBLE_TO_FLOAT: > + case VEC4_OPCODE_FLOAT_TO_DOUBLE: > + case VEC4_OPCODE_PICK_LOW_32BIT: > + case VEC4_OPCODE_PICK_HIGH_32BIT: > + case VEC4_OPCODE_SET_LOW_32BIT: > + case VEC4_OPCODE_SET_HIGH_32BIT: > + return true; > + default: > + return false; > + } > +} > + > +static bool > try_copy_propagate(const struct gen_device_info *devinfo, > vec4_instruction *inst, int arg, > const copy_entry *entry, int attributes_per_reg) > @@ -326,6 +342,14 @@ try_copy_propagate(const struct gen_device_info *devinfo, > > unsigned composed_swizzle = brw_compose_swizzle(inst->src[arg].swizzle, > value.swizzle); > + > + /* Instructions that operate on vectors in ALIGN1 mode will ignore > swizzles > +* so copy-propagation won't be safe if the composed swizzle is anything > +* other than the identity. > +*/ > + if (is_align1_opcode(inst->opcode) && composed_swizzle != > BRW_SWIZZLE_XYZW) > + return false; > + > if (inst->is_3src(devinfo) && > (value.file == UNIFORM || > (value.file == ATTR && attributes_per_reg != 1)) && >
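To make the identity-swizzle check in the patch above concrete, here is a small standalone model. It is a sketch under assumptions: swizzles are encoded as four 2-bit channel selectors, and `make_swizzle()`, `get_channel()`, `compose()` and `can_copy_propagate()` are hypothetical helpers written for this illustration, not the driver's `brw_compose_swizzle()`.

```cpp
#include <cassert>
#include <cstdint>

// Assumed encoding: two bits per channel, X in the lowest bits.
constexpr uint8_t make_swizzle(unsigned x, unsigned y, unsigned z, unsigned w)
{
    return static_cast<uint8_t>(x | (y << 2) | (z << 4) | (w << 6));
}

constexpr unsigned get_channel(uint8_t swz, unsigned idx)
{
    return (swz >> (idx * 2)) & 3u;
}

// Channel i of the result reads inner[outer[i]]: the use's swizzle (outer)
// is applied on top of the swizzle of the value being propagated (inner).
constexpr uint8_t compose(uint8_t outer, uint8_t inner)
{
    return make_swizzle(get_channel(inner, get_channel(outer, 0)),
                        get_channel(inner, get_channel(outer, 1)),
                        get_channel(inner, get_channel(outer, 2)),
                        get_channel(inner, get_channel(outer, 3)));
}

constexpr uint8_t kSwizzleXYZW = make_swizzle(0, 1, 2, 3);

// ALIGN1 opcodes ignore swizzles, so propagation is only allowed when the
// composed swizzle is the identity; other opcodes are unaffected here.
constexpr bool can_copy_propagate(bool is_align1_opcode,
                                  uint8_t use_swizzle, uint8_t value_swizzle)
{
    return !is_align1_opcode ||
           compose(use_swizzle, value_swizzle) == kSwizzleXYZW;
}
```

Note that two non-identity swizzles can still compose to the identity (YXZW applied twice, for example), which is why the check looks at the composed swizzle rather than at each operand separately.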
Re: [Mesa-dev] [PATCH v2 019/103] i965/vec4: Fix DCE for VEC4_OPCODE_SET_{LOW, HIGH}_32BIT
This patch is Reviewed-by: Ian Romanick On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: > These align1 opcodes do partial writes of 64-bit data. The problem is that we > want to use them to write on the same register to implement packDouble2x32 and > from the point of view of DCE, since both opcodes write to the same register, > only the last one stands and decides to eliminate the first, which is > not correct, so prevent this from happening. > > v2: Make a helper in vec4_instruction to know if the instruction is an > align1 partial write. This will come in handy when we implement a > simd splitting pass in a later patch. > --- > src/mesa/drivers/dri/i965/brw_ir_vec4.h| 6 ++ > src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp | 3 ++- > 2 files changed, 8 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h > b/src/mesa/drivers/dri/i965/brw_ir_vec4.h > index a8e5f4a..7451f44 100644 > --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h > +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h > @@ -232,6 +232,12 @@ public: > bool can_change_types() const; > bool has_source_and_destination_hazard() const; > > + bool is_align1_partial_write() > + { > + return opcode == VEC4_OPCODE_SET_LOW_32BIT || > + opcode == VEC4_OPCODE_SET_HIGH_32BIT; > + } > + > bool reads_flag() > { >return predicate || opcode == VS_OPCODE_UNPACK_FLAGS_SIMD4X2; > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp > index 50706a9..950c6c8 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp > @@ -109,7 +109,8 @@ vec4_visitor::dead_code_eliminate() > } > } > > - if (inst->dst.file == VGRF && !inst->predicate && > + !inst->is_align1_partial_write()) { > for (unsigned i = 0; i < regs_written(inst); i++) { > for (int c = 0; c < 4; c++) { >if 
(inst->dst.writemask & (1 << c)) { >
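The DCE problem described in the commit message above can be reproduced with a toy backward liveness pass. This is a deliberately simplified sketch: the `Inst` struct and `find_dead()` are hypothetical, registers are tracked as whole units rather than per channel, and the `honor_partial_writes` flag stands in for the `is_align1_partial_write()` check added by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction: writes register dst, reads register src.
struct Inst {
    int dst;
    int src;
    bool align1_partial_write;  // models SET_LOW_32BIT / SET_HIGH_32BIT
};

// Walks the program backwards and marks instructions whose result is never
// read. A full write makes earlier writes to the same register dead; a
// partial write must not, because the untouched half is still needed.
std::vector<bool> find_dead(const std::vector<Inst> &prog, int live_out_reg,
                            int num_regs, bool honor_partial_writes)
{
    assert(live_out_reg >= 0 && live_out_reg < num_regs);
    std::vector<bool> live(num_regs, false);
    std::vector<bool> dead(prog.size(), false);
    live[live_out_reg] = true;

    for (std::size_t i = prog.size(); i-- > 0;) {
        const Inst &inst = prog[i];
        if (!live[inst.dst]) {
            dead[i] = true;
            continue;
        }
        const bool partial = honor_partial_writes && inst.align1_partial_write;
        if (!partial)
            live[inst.dst] = false;  // full overwrite: older values are dead
        live[inst.src] = true;
    }
    return dead;
}
```

For a two-instruction program that sets the low half and then the high half of the same register, the pass wrongly reports the first instruction as dead unless partial writes are honored.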
Re: [Mesa-dev] [PATCH v2 017/103] i965/vec4: add VEC4_OPCODE_PICK_{LOW, HIGH}_32BIT opcodes
This patch is Reviewed-by: Ian RomanickWe may be able to eliminate some of this after I do int64 support. It might be cleaner to do unpackInt2x32(doubleBitsToInt64(x)) at a higher level of the compiler instead. On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: > These opcodes will pick the low/high 32-bit in each 64-bit data element > using Align1 mode. We will use this, for example, to do things like > unpackDouble2x32. > > We use Align1 mode because in order to implement this in Align16 mode > we would need to use 32-bit logical swizzles (XZ for low, YW for high), > but the IR works in terms of 64-bit logical swizzles for DF operands > all the way up to codegen. > > v2: > - use suboffset() instead of get_element_ud() > - no need to set the width on the dst > --- > src/mesa/drivers/dri/i965/brw_defines.h | 2 ++ > src/mesa/drivers/dri/i965/brw_shader.cpp | 4 > src/mesa/drivers/dri/i965/brw_vec4.cpp | 4 > src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 25 > > 4 files changed, 35 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h > index 79b96a4..8ffb50c 100644 > --- a/src/mesa/drivers/dri/i965/brw_defines.h > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > @@ -1100,6 +1100,8 @@ enum opcode { > VEC4_OPCODE_UNPACK_UNIFORM, > VEC4_OPCODE_DOUBLE_TO_FLOAT, > VEC4_OPCODE_FLOAT_TO_DOUBLE, > + VEC4_OPCODE_PICK_LOW_32BIT, > + VEC4_OPCODE_PICK_HIGH_32BIT, > > FS_OPCODE_DDX_COARSE, > FS_OPCODE_DDX_FINE, > diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp > b/src/mesa/drivers/dri/i965/brw_shader.cpp > index b063f77..b2f3a56 100644 > --- a/src/mesa/drivers/dri/i965/brw_shader.cpp > +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp > @@ -321,6 +321,10 @@ brw_instruction_name(const struct gen_device_info > *devinfo, enum opcode op) >return "double_to_float"; > case VEC4_OPCODE_FLOAT_TO_DOUBLE: >return "float_to_double"; > + case VEC4_OPCODE_PICK_LOW_32BIT: > + return "pick_low_32bit"; > + case 
VEC4_OPCODE_PICK_HIGH_32BIT: > + return "pick_high_32bit"; > > case FS_OPCODE_DDX_COARSE: >return "ddx_coarse"; > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index 40f8702..4fd04f1 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -255,6 +255,8 @@ vec4_instruction::can_do_writemask(const struct > gen_device_info *devinfo) > case SHADER_OPCODE_GEN4_SCRATCH_READ: > case VEC4_OPCODE_DOUBLE_TO_FLOAT: > case VEC4_OPCODE_FLOAT_TO_DOUBLE: > + case VEC4_OPCODE_PICK_LOW_32BIT: > + case VEC4_OPCODE_PICK_HIGH_32BIT: > case VS_OPCODE_PULL_CONSTANT_LOAD: > case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7: > case VS_OPCODE_SET_SIMD4X2_HEADER_GEN9: > @@ -510,6 +512,8 @@ vec4_visitor::opt_reduce_swizzle() > >case VEC4_OPCODE_FLOAT_TO_DOUBLE: >case VEC4_OPCODE_DOUBLE_TO_FLOAT: > + case VEC4_OPCODE_PICK_LOW_32BIT: > + case VEC4_OPCODE_PICK_HIGH_32BIT: > swizzle = brw_swizzle_for_size(4); > break; > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > index 6f4c438..b8778c4 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > @@ -1940,6 +1940,31 @@ generate_code(struct brw_codegen *p, > break; >} > > + case VEC4_OPCODE_PICK_LOW_32BIT: > + case VEC4_OPCODE_PICK_HIGH_32BIT: { > + /* Stores the low/high 32-bit of each 64-bit element in src[0] into > + * dst using ALIGN1 mode and a <8,4,2>:UD region on the source. 
> + */ > + assert(type_sz(src[0].type) == 8); > + assert(type_sz(dst.type) == 4); > + > + brw_set_default_access_mode(p, BRW_ALIGN_1); > + > + dst = retype(dst, BRW_REGISTER_TYPE_UD); > + dst.hstride = BRW_HORIZONTAL_STRIDE_1; > + > + src[0] = retype(src[0], BRW_REGISTER_TYPE_UD); > + if (inst->opcode == VEC4_OPCODE_PICK_HIGH_32BIT) > +src[0] = suboffset(src[0], 1); > + src[0].vstride = BRW_VERTICAL_STRIDE_8; > + src[0].width = BRW_WIDTH_4; > + src[0].hstride = BRW_HORIZONTAL_STRIDE_2; > + brw_MOV(p, dst, src[0]); > + > + brw_set_default_access_mode(p, BRW_ALIGN_16); > + break; > + } > + >case VEC4_OPCODE_PACK_BYTES: { > /* Is effectively: >* > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
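As a rough picture of what the PICK_LOW/PICK_HIGH opcodes above compute, the following sketch models the source register as a flat array of dwords and reads every second dword, starting at dword 0 for the low half or dword 1 for the high half (the role `suboffset(src[0], 1)` plays in the patch). `pick_32bit()` is a hypothetical helper for this illustration and it assumes a little-endian layout of each 64-bit element; it does not model execution size or the exact `<8,4,2>` region semantics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the low or high 32 bits of every 64-bit element of src.
std::vector<uint32_t> pick_32bit(const std::vector<uint64_t> &src, bool high)
{
    // Lay the 64-bit elements out as consecutive dwords, low half first.
    std::vector<uint32_t> dwords;
    for (uint64_t value : src) {
        dwords.push_back(static_cast<uint32_t>(value));
        dwords.push_back(static_cast<uint32_t>(value >> 32));
    }
    assert(dwords.size() == 2 * src.size());

    // Stride-2 read over the dwords, offset by one dword for the high half.
    std::vector<uint32_t> dst;
    for (std::size_t i = high ? 1 : 0; i < dwords.size(); i += 2)
        dst.push_back(dwords[i]);
    return dst;
}
```

The destination is written with a stride of one dword, so the picked halves end up packed next to each other.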
Re: [Mesa-dev] [PATCH v2 018/103] i965/vec4: add VEC4_OPCODE_SET_{LOW, HIGH}_32BIT opcodes
This patch is Reviewed-by: Ian RomanickOn 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: > These opcodes will set the low/high 32-bit in each 64-bit data element > using Align1 mode. We will use this to implement packDouble2x32. > > We use Align1 mode because in order to implement this in Align16 mode > we would need to use 32-bit logical swizzles (XZ for low, YW for high), > but the IR works in terms of 64-bit logical swizzles for DF operands > all the way up to codegen. > > v2: > - use suboffset() instead of get_element_ud() > - no need to set the width on the dst > --- > src/mesa/drivers/dri/i965/brw_defines.h | 2 ++ > src/mesa/drivers/dri/i965/brw_shader.cpp | 4 > src/mesa/drivers/dri/i965/brw_vec4.cpp | 4 > src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 25 > > 4 files changed, 35 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h > index 8ffb50c..35d638c 100644 > --- a/src/mesa/drivers/dri/i965/brw_defines.h > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > @@ -1102,6 +1102,8 @@ enum opcode { > VEC4_OPCODE_FLOAT_TO_DOUBLE, > VEC4_OPCODE_PICK_LOW_32BIT, > VEC4_OPCODE_PICK_HIGH_32BIT, > + VEC4_OPCODE_SET_LOW_32BIT, > + VEC4_OPCODE_SET_HIGH_32BIT, > > FS_OPCODE_DDX_COARSE, > FS_OPCODE_DDX_FINE, > diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp > b/src/mesa/drivers/dri/i965/brw_shader.cpp > index b2f3a56..153bd43 100644 > --- a/src/mesa/drivers/dri/i965/brw_shader.cpp > +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp > @@ -325,6 +325,10 @@ brw_instruction_name(const struct gen_device_info > *devinfo, enum opcode op) >return "pick_low_32bit"; > case VEC4_OPCODE_PICK_HIGH_32BIT: >return "pick_high_32bit"; > + case VEC4_OPCODE_SET_LOW_32BIT: > + return "set_low_32bit"; > + case VEC4_OPCODE_SET_HIGH_32BIT: > + return "set_high_32bit"; > > case FS_OPCODE_DDX_COARSE: >return "ddx_coarse"; > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index 
4fd04f1..06fa38f 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -257,6 +257,8 @@ vec4_instruction::can_do_writemask(const struct > gen_device_info *devinfo) > case VEC4_OPCODE_FLOAT_TO_DOUBLE: > case VEC4_OPCODE_PICK_LOW_32BIT: > case VEC4_OPCODE_PICK_HIGH_32BIT: > + case VEC4_OPCODE_SET_LOW_32BIT: > + case VEC4_OPCODE_SET_HIGH_32BIT: > case VS_OPCODE_PULL_CONSTANT_LOAD: > case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7: > case VS_OPCODE_SET_SIMD4X2_HEADER_GEN9: > @@ -514,6 +516,8 @@ vec4_visitor::opt_reduce_swizzle() >case VEC4_OPCODE_DOUBLE_TO_FLOAT: >case VEC4_OPCODE_PICK_LOW_32BIT: >case VEC4_OPCODE_PICK_HIGH_32BIT: > + case VEC4_OPCODE_SET_LOW_32BIT: > + case VEC4_OPCODE_SET_HIGH_32BIT: > swizzle = brw_swizzle_for_size(4); > break; > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > index b8778c4..120797b 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > @@ -1965,6 +1965,31 @@ generate_code(struct brw_codegen *p, > break; >} > > + case VEC4_OPCODE_SET_LOW_32BIT: > + case VEC4_OPCODE_SET_HIGH_32BIT: { > + /* Reads consecutive 32-bit elements from src[0] and writes > + * them to the low/high 32-bit of each 64-bit element in dst. 
> + */ > + assert(type_sz(src[0].type) == 4); > + assert(type_sz(dst.type) == 8); > + > + brw_set_default_access_mode(p, BRW_ALIGN_1); > + > + dst = retype(dst, BRW_REGISTER_TYPE_UD); > + if (inst->opcode == VEC4_OPCODE_SET_HIGH_32BIT) > +dst = suboffset(dst, 1); > + dst.hstride = BRW_HORIZONTAL_STRIDE_2; > + > + src[0] = retype(src[0], BRW_REGISTER_TYPE_UD); > + src[0].vstride = BRW_VERTICAL_STRIDE_4; > + src[0].width = BRW_WIDTH_4; > + src[0].hstride = BRW_HORIZONTAL_STRIDE_1; > + brw_MOV(p, dst, src[0]); > + > + brw_set_default_access_mode(p, BRW_ALIGN_16); > + break; > + } > + >case VEC4_OPCODE_PACK_BYTES: { > /* Is effectively: >* > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
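The SET_LOW/SET_HIGH opcodes above are the mirror image: consecutive 32-bit source elements are written into one half of each 64-bit destination element while the other half is left alone. A minimal model, assuming `set_32bit()` as a hypothetical helper that works on plain integers instead of register regions:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Writes src[i] into the low or high 32 bits of dst[i] and returns the
// updated elements; the other half of each element is preserved.
std::vector<uint64_t> set_32bit(std::vector<uint64_t> dst,
                                const std::vector<uint32_t> &src, bool high)
{
    assert(src.size() >= dst.size());
    for (std::size_t i = 0; i < dst.size(); i++) {
        if (high) {
            dst[i] = (dst[i] & 0x00000000FFFFFFFFULL) |
                     (static_cast<uint64_t>(src[i]) << 32);
        } else {
            dst[i] = (dst[i] & 0xFFFFFFFF00000000ULL) | src[i];
        }
    }
    return dst;
}
```

Applying the low variant and then the high variant to the same destination builds each 64-bit element from two 32-bit halves, which is the packDouble2x32 use case the commit message mentions, and also why both writes have to survive dead code elimination.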
Re: [Mesa-dev] [PATCH v2 013/103] i965/vec4: set correct register regions for 32-bit and 64-bit
On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: > For 32-bit instructions we want to use <4,4,1> regions for VGRF > sources so we should really set a width of 4 (we were setting 8). > > For 64-bit instructions we want to use a width of 2 because the > hardware uses 32-bit swizzles, meaning that we can only address 2 > consecutive 64-bit components in a row. Also, Curro suggested that > the hardware is probably fixing the width to 2 for 64-bit instructions > anyway, so just go with that and use <2,2,1>. > > v2: > - No need to explicitly set the vertical stride of 64-bit regions to 2, >brw_vecn_grf with a width of 2 will do that for us. > - No need to adjust the width of dst registers. > > Signed-off-by: Connor Abbott > --- > src/mesa/drivers/dri/i965/brw_vec4.cpp | 13 + > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index 32c04b2..40f8702 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -1873,20 +1873,24 @@ vec4_visitor::convert_to_hw_regs() > struct src_reg &src = inst->src[i]; > struct brw_reg reg; > switch (src.file) { > - case VGRF: > -reg = byte_offset(brw_vec8_grf(src.nr, 0), src.offset); > + case VGRF: { > +unsigned type_size = type_sz(src.type); > +unsigned width = REG_SIZE / 2 / MAX2(4, type_size); constify these > +reg = byte_offset(brw_vecn_grf(width, src.nr, 0), src.offset); > reg.type = src.type; > reg.swizzle = src.swizzle; > reg.abs = src.abs; > reg.negate = src.negate; > break; > + } > > - case UNIFORM: > + case UNIFORM: { > +unsigned width = REG_SIZE / 2 / MAX2(4, type_sz(src.type)); constify this one too, and this patch is Reviewed-by: Ian Romanick > reg = stride(byte_offset(brw_vec4_grf( > > prog_data->base.dispatch_grf_start_reg + > src.nr / 2, src.nr % 2 * 4), > src.offset), > - 0, 4, 1); > + 0, width, 1); > reg.type = src.type; > reg.swizzle = src.swizzle; > reg.abs = src.abs; > @@ -1895,6 
+1899,7 @@ vec4_visitor::convert_to_hw_regs() > /* This should have been moved to pull constants. */ > assert(!src.reladdr); > break; > + } > > case ARF: > case FIXED_GRF: >
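The width expression in the patch above is easy to check by hand. The sketch below assumes a 32-byte register (`kRegSize` is a stand-in for the driver's `REG_SIZE`, and the value 32 is an assumption of this example) and uses a hypothetical `region_width()` helper with `const` locals, in the spirit of the "constify" review comments.

```cpp
#include <cassert>

// Assumption: one GRF register holds 32 bytes.
constexpr unsigned kRegSize = 32;

// Mirrors REG_SIZE / 2 / MAX2(4, type_size): half a register per vec4
// thread, divided by the element size, with sub-dword types treated as
// 4 bytes wide.
unsigned region_width(unsigned type_size)
{
    assert(type_size == 1 || type_size == 2 || type_size == 4 || type_size == 8);
    const unsigned element_size = type_size > 4 ? type_size : 4;
    return kRegSize / 2 / element_size;
}
```

Working it through: a 32-bit type gives 32 / 2 / 4 = 4, matching the <4,4,1> region, and a 64-bit type gives 32 / 2 / 8 = 2, matching <2,2,1>.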
Re: [Mesa-dev] [PATCH v2 007/103] i965/vec4/nir: fix emitting 64-bit immediates
On 10/18/2016 05:26 PM, Matt Turner wrote: > On Tue, Oct 18, 2016 at 5:20 PM, Ian Romanick wrote: >> On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: >>> --- >>> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 22 ++ >>> 1 file changed, 18 insertions(+), 4 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >>> index 05e7f29..ce95c8d 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >>> @@ -352,8 +352,15 @@ vec4_visitor::get_indirect_offset(nir_intrinsic_instr >>> *instr) >>> void >>> vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr) >>> { >>> - dst_reg reg = dst_reg(VGRF, alloc.allocate(1)); >>> - reg.type = BRW_REGISTER_TYPE_D; >>> + dst_reg reg; >>> + >>> + if (instr->def.bit_size == 64) { >>> + reg = dst_reg(VGRF, alloc.allocate(2)); >>> + reg.type = BRW_REGISTER_TYPE_DF; >> >> For 32-bits we use an integer type (D). Should we also use an integer >> type (Q) here? I'm worried that I'll have problems with this when I add >> int64 support. > > Q only exists on Broadwell and newer, so I don't think it's usable > here (at least for HSW/IVB). Ah yes. Good call. This patch is Reviewed-by: Ian Romanick
Re: [Mesa-dev] [PATCH v2 011/103] i965: fix subnr overflow in suboffset()
Reviewed-by: Ian Romanick In the interest of reducing the number of patches in flight, I think this could land ahead of the others. On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: > --- > src/mesa/drivers/dri/i965/brw_reg.h | 13 + > 1 file changed, 5 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_reg.h > b/src/mesa/drivers/dri/i965/brw_reg.h > index 3b46d27..8907c9c 100644 > --- a/src/mesa/drivers/dri/i965/brw_reg.h > +++ b/src/mesa/drivers/dri/i965/brw_reg.h > @@ -520,14 +520,6 @@ sechalf(struct brw_reg reg) > } > > static inline struct brw_reg > -suboffset(struct brw_reg reg, unsigned delta) > -{ > - reg.subnr += delta * type_sz(reg.type); > - return reg; > -} > - > - > -static inline struct brw_reg > offset(struct brw_reg reg, unsigned delta) > { > reg.nr += delta; > @@ -544,6 +536,11 @@ byte_offset(struct brw_reg reg, unsigned bytes) > return reg; > } > > +static inline struct brw_reg > +suboffset(struct brw_reg reg, unsigned delta) > +{ > + return byte_offset(reg, delta * type_sz(reg.type)); > +} > > /** Construct unsigned word[16] register */ > static inline struct brw_reg >
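The overflow this patch fixes can be shown with a small model. The sketch makes several assumptions that are not stated in the message: the sub-register field is modeled as a 5-bit bitfield, registers are 32 bytes, and the `Reg` struct plus `old_suboffset()`, `byte_offset()` and `new_suboffset()` are simplified stand-ins that take the type size as a parameter instead of reading it from the register.

```cpp
#include <cassert>

// Assumption: 32-byte registers and a 5-bit byte offset within a register.
constexpr unsigned kRegSize = 32;

struct Reg {
    unsigned nr;         // register number
    unsigned subnr : 5;  // byte offset inside the register
};

// Old behaviour: only subnr is bumped, so an offset of 32 bytes or more
// wraps around inside the bitfield and nr is never advanced.
Reg old_suboffset(Reg reg, unsigned delta, unsigned type_size)
{
    reg.subnr += delta * type_size;
    return reg;
}

// Carries whole registers into nr and keeps only the remainder in subnr.
Reg byte_offset(Reg reg, unsigned bytes)
{
    const unsigned total = reg.nr * kRegSize + reg.subnr + bytes;
    reg.nr = total / kRegSize;
    reg.subnr = total % kRegSize;
    assert(total == reg.nr * kRegSize + reg.subnr);
    return reg;
}

// New behaviour: suboffset() is expressed in terms of byte_offset().
Reg new_suboffset(Reg reg, unsigned delta, unsigned type_size)
{
    return byte_offset(reg, delta * type_size);
}
```

Stepping nine dwords (36 bytes) from the start of register 10 should land four bytes into register 11. The old form stays in register 10 with a wrapped offset of 4, while the new form advances the register number.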
Re: [Mesa-dev] [PATCH v2 007/103] i965/vec4/nir: fix emitting 64-bit immediates
On Tue, Oct 18, 2016 at 5:20 PM, Ian Romanick wrote: > On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: >> --- >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 22 ++ >> 1 file changed, 18 insertions(+), 4 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> index 05e7f29..ce95c8d 100644 >> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> @@ -352,8 +352,15 @@ vec4_visitor::get_indirect_offset(nir_intrinsic_instr >> *instr) >> void >> vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr) >> { >> - dst_reg reg = dst_reg(VGRF, alloc.allocate(1)); >> - reg.type = BRW_REGISTER_TYPE_D; >> + dst_reg reg; >> + >> + if (instr->def.bit_size == 64) { >> + reg = dst_reg(VGRF, alloc.allocate(2)); >> + reg.type = BRW_REGISTER_TYPE_DF; > > For 32-bits we use an integer type (D). Should we also use an integer > type (Q) here? I'm worried that I'll have problems with this when I add > int64 support. Q only exists on Broadwell and newer, so I don't think it's usable here (at least for HSW/IVB).
Re: [Mesa-dev] [PATCH v2 010/103] i965/vec4: translate d2f/f2d
Reviewed-by: Ian Romanick On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: > --- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 24 > 1 file changed, 24 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > index ce95c8d..b75337c 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > @@ -,6 +,30 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) >inst = emit(MOV(dst, op[0])); >break; > > + case nir_op_d2f: { > + dst_reg temp = dst_reg(this, glsl_type::dvec4_type); > + emit(MOV(temp, op[0])); > + > + dst_reg temp2 = dst_reg(this, glsl_type::dvec4_type); > + temp2 = retype(temp2, BRW_REGISTER_TYPE_F); > + emit(VEC4_OPCODE_DOUBLE_TO_FLOAT, temp2, src_reg(temp)) > + ->size_written = 2 * REG_SIZE; > + > + vec4_instruction *inst = emit(MOV(dst, src_reg(temp2))); > + inst->saturate = instr->dest.saturate; > + break; > + } > + > + case nir_op_f2d: { > + dst_reg tmp_dst = dst_reg(src_reg(this, glsl_type::dvec4_type)); > + src_reg tmp_src = src_reg(this, glsl_type::vec4_type); > + emit(MOV(dst_reg(tmp_src), retype(op[0], BRW_REGISTER_TYPE_F))); > + emit(VEC4_OPCODE_FLOAT_TO_DOUBLE, tmp_dst, tmp_src); > + vec4_instruction *inst = emit(MOV(dst, src_reg(tmp_dst))); > + inst->saturate = instr->dest.saturate; > + break; > + } > + > case nir_op_fadd: >/* fall through */ > case nir_op_iadd: >
Re: [Mesa-dev] [PATCH v2 009/103] i965/vec4: add double/float conversion pseudo-opcodes
Based on my (fairly weak) understanding of vstrides, this patch is Reviewed-by: Ian RomanickOn 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: > These need to be emitted as align1 MOV's, since they need to have a > stride of 2 on the float register (whether src or dest) so that data > from another thread doesn't cross the middle of a SIMD8 register. > > v2 (Iago): > - The float-to-double needs to align 32-bit data to 64-bit before doing the > conversion. This was doable in align16 when we tried to use an execsize > of 4, but with an execsize of 8 we would need another align1 opcode to do > that (since we need data to cross the middle of a SIMD register). Just > making the opcode handle this internally seems more practical that adding > another opcode just for this purpose and having the caller know about this > before converting. > - The double-to-float conversion produces 32-bit elements aligned to 64-bit > so we make the opcode re-pack the result to 32-bit and fit in one register, > as expected by SIMD4x2 operation. This still requires that callers reserve > two registers for the float data destination because we need to produce > 64-bit aligned data first, and repack it later on the same destination > register, but it saves the need for a re-pack opcode only to achieve this > making the operation complete in a single opcode. Hopefully that is worth > the weirdness of the double register allocation... 
> > Signed-off-by: Connor Abbott > Signed-off-by: Iago Toral Quiroga > --- > src/mesa/drivers/dri/i965/brw_defines.h | 2 ++ > src/mesa/drivers/dri/i965/brw_shader.cpp | 4 +++ > src/mesa/drivers/dri/i965/brw_vec4.cpp | 8 + > src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 44 > > 4 files changed, 58 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h > index c4e0f27..79b96a4 100644 > --- a/src/mesa/drivers/dri/i965/brw_defines.h > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > @@ -1098,6 +1098,8 @@ enum opcode { > VEC4_OPCODE_MOV_BYTES, > VEC4_OPCODE_PACK_BYTES, > VEC4_OPCODE_UNPACK_UNIFORM, > + VEC4_OPCODE_DOUBLE_TO_FLOAT, > + VEC4_OPCODE_FLOAT_TO_DOUBLE, > > FS_OPCODE_DDX_COARSE, > FS_OPCODE_DDX_FINE, > diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp > b/src/mesa/drivers/dri/i965/brw_shader.cpp > index ed81563..b063f77 100644 > --- a/src/mesa/drivers/dri/i965/brw_shader.cpp > +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp > @@ -317,6 +317,10 @@ brw_instruction_name(const struct gen_device_info > *devinfo, enum opcode op) >return "pack_bytes"; > case VEC4_OPCODE_UNPACK_UNIFORM: >return "unpack_uniform"; > + case VEC4_OPCODE_DOUBLE_TO_FLOAT: > + return "double_to_float"; > + case VEC4_OPCODE_FLOAT_TO_DOUBLE: > + return "float_to_double"; > > case FS_OPCODE_DDX_COARSE: >return "ddx_coarse"; > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index c29cfb5..32c04b2 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -253,6 +253,8 @@ vec4_instruction::can_do_writemask(const struct > gen_device_info *devinfo) > { > switch (opcode) { > case SHADER_OPCODE_GEN4_SCRATCH_READ: > + case VEC4_OPCODE_DOUBLE_TO_FLOAT: > + case VEC4_OPCODE_FLOAT_TO_DOUBLE: > case VS_OPCODE_PULL_CONSTANT_LOAD: > case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7: > case VS_OPCODE_SET_SIMD4X2_HEADER_GEN9: > @@ -505,6 +507,12 @@ 
vec4_visitor::opt_reduce_swizzle() >case BRW_OPCODE_DP2: > swizzle = brw_swizzle_for_size(2); > break; > + > + case VEC4_OPCODE_FLOAT_TO_DOUBLE: > + case VEC4_OPCODE_DOUBLE_TO_FLOAT: > + swizzle = brw_swizzle_for_size(4); > + break; > + >default: > swizzle = brw_swizzle_for_mask(inst->dst.writemask); > break; > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > index 163cf9d..6f4c438 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > @@ -1896,6 +1896,50 @@ generate_code(struct brw_codegen *p, > break; >} > > + case VEC4_OPCODE_DOUBLE_TO_FLOAT: { > + assert(src[0].type == BRW_REGISTER_TYPE_DF); > + assert(dst.type == BRW_REGISTER_TYPE_F); > + > + brw_set_default_access_mode(p, BRW_ALIGN_1); > + > + dst.hstride = BRW_HORIZONTAL_STRIDE_2; > + dst.width = BRW_WIDTH_4; > + src[0].vstride = BRW_VERTICAL_STRIDE_4; > + src[0].width = BRW_WIDTH_4; > + brw_MOV(p, dst, src[0]); > + > + struct brw_reg dst_as_src = dst; > + dst.hstride = BRW_HORIZONTAL_STRIDE_1; > + dst.width = BRW_WIDTH_8; > +
Re: [Mesa-dev] Mesa (master): glsl: Immediately inline built-ins rather than generating calls.
On 10/18/2016 05:50 PM, Kenneth Graunke wrote:
> On Tuesday, October 18, 2016 4:38:17 PM PDT Brian Paul wrote:
>> Hi Ken,
>>
>> I found that this patch causes a regression. There's a Windows medical app which fails to link some shaders since this change. Basically, when the gl_Position VS input is declared as invariant the linker fails with:
>>
>> error: declarations for uniform `gl_ModelViewProjectionMatrix' have mismatching invariant qualifiers
>>
>> I haven't investigated how to fix this. I'm hoping you can see a simple fix. The attached piglit shader_runner script demonstrates the issue. Passes w/ NVIDIA.
>>
>> Thanks!
>>
>> -Brian
>
> Oh, sorry about that! Here are two possible fixes:
>
> https://cgit.freedesktop.org/~kwg/mesa/commit/?h=invariant-fix
> https://cgit.freedesktop.org/~kwg/mesa/commit/?h=invariant-fix-2
>
> They're both kind of hacks...but the whole invariant propagation pass is kind of a hack, and we've got some other hacks in place already. So...maybe best to pile another one on. Not sure which though. Maybe Jason or Curro will have an opinion...

Thanks! Either patch is OK with me (though, I'd suggest putting a comment on the first one to explain what's happening). It'd be great if we can commit one or the other in the next day or so.

-Brian
Re: [Mesa-dev] [PATCH v2 007/103] i965/vec4/nir: fix emitting 64-bit immediates
On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote: > --- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 22 ++ > 1 file changed, 18 insertions(+), 4 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > index 05e7f29..ce95c8d 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > @@ -352,8 +352,15 @@ vec4_visitor::get_indirect_offset(nir_intrinsic_instr > *instr) > void > vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr) > { > - dst_reg reg = dst_reg(VGRF, alloc.allocate(1)); > - reg.type = BRW_REGISTER_TYPE_D; > + dst_reg reg; > + > + if (instr->def.bit_size == 64) { > + reg = dst_reg(VGRF, alloc.allocate(2)); > + reg.type = BRW_REGISTER_TYPE_DF; For 32-bits we use an integer type (D). Should we also use an integer type (Q) here? I'm worried that I'll have problems with this when I add int64 support. > + } else { > + reg = dst_reg(VGRF, alloc.allocate(1)); > + reg.type = BRW_REGISTER_TYPE_D; > + } > > unsigned remaining = brw_writemask_for_size(instr->def.num_components); > > @@ -368,13 +375,20 @@ vec4_visitor::nir_emit_load_const(nir_load_const_instr > *instr) > continue; > >for (unsigned j = i; j < instr->def.num_components; j++) { > - if (instr->value.u32[i] == instr->value.u32[j]) { > + if ((instr->def.bit_size == 32 && > + instr->value.u32[i] == instr->value.u32[j]) || > + (instr->def.bit_size == 64 && > + instr->value.f64[i] == instr->value.f64[j])) { > writemask |= 1 << j; > } >} > >reg.writemask = writemask; > - emit(MOV(reg, brw_imm_d(instr->value.i32[i]))); > + if (instr->def.bit_size == 64) { > + emit(MOV(reg, brw_imm_df(instr->value.f64[i]))); > + } else { > + emit(MOV(reg, brw_imm_d(instr->value.i32[i]))); > + } > >remaining &= ~writemask; > } >
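The loop in the patch above groups constant components that share a value so that each distinct value needs only one MOV with a combined writemask. A standalone sketch of that grouping for the 32-bit case follows; `ImmediateMov` and `group_components()` are hypothetical names for this illustration, and the 64-bit path (which compares `f64` values instead) is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One emitted immediate move: which channels it writes and with what value.
struct ImmediateMov {
    unsigned writemask;
    uint32_t value;
};

// For each component not yet covered, collect every later component with
// the same value into one writemask, then mark those channels as done.
std::vector<ImmediateMov> group_components(const std::vector<uint32_t> &values)
{
    assert(values.size() <= 4);
    unsigned remaining = (1u << values.size()) - 1;
    std::vector<ImmediateMov> movs;

    for (std::size_t i = 0; i < values.size(); i++) {
        if (!(remaining & (1u << i)))
            continue;

        unsigned writemask = 0;
        for (std::size_t j = i; j < values.size(); j++) {
            if (values[i] == values[j])
                writemask |= 1u << j;
        }

        movs.push_back(ImmediateMov{writemask, values[i]});
        remaining &= ~writemask;
    }
    return movs;
}
```

A vec4 constant such as (5, 7, 5, 7) therefore needs two moves, one writing channels X and Z and one writing Y and W, and a splat constant needs a single move with all four channels enabled.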
Re: [Mesa-dev] [PATCH v2 004/103] i965/vec4/nir: Add bit-size information to types
On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> Reviewed-by: Francisco Jerez
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index af76730..5048c4e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -325,7 +325,7 @@ src_reg
>  vec4_visitor::get_nir_src(const nir_src &src, unsigned num_components)
>  {
>     /* if type is not specified, default to signed int */
> -   return get_nir_src(src, nir_type_int, num_components);
> +   return get_nir_src(src, nir_type_int32, num_components);
>  }
>
>  src_reg
> @@ -747,7 +747,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
>        const nir_intrinsic_info *info = &nir_intrinsic_infos[instr->intrinsic];
>
>        /* Get the arguments of the atomic intrinsic. */
> -      src_reg offset = get_nir_src(instr->src[0], nir_type_int,
> +      src_reg offset = get_nir_src(instr->src[0], nir_type_int32,
>                                     instr->num_components);
>        const src_reg surface = brw_imm_ud(surf_index);
>        const src_reg src0 = (info->num_srcs >= 2
> @@ -793,7 +793,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
>            * from any live channel.
>            */
>           surf_index = src_reg(this, glsl_type::uint_type);
> -         emit(ADD(dst_reg(surf_index), get_nir_src(instr->src[0], nir_type_int,
> +         emit(ADD(dst_reg(surf_index), get_nir_src(instr->src[0], nir_type_int32,
>                                                     instr->num_components),
>                    brw_imm_ud(prog_data->base.binding_table.ubo_start)));
>           surf_index = emit_uniformize(surf_index);
> @@ -811,7 +811,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
>        if (const_offset) {
>           offset = brw_imm_ud(const_offset->u32[0] & ~15);
>        } else {
> -         offset = get_nir_src(instr->src[1], nir_type_int, 1);
> +         offset = get_nir_src(instr->src[1], nir_type_uint32, 1);

Does it matter that this changed from int to uint32?

>       }
>
>       src_reg packed_consts = src_reg(this, glsl_type::vec4_type);
>
Re: [Mesa-dev] [PATCH v2 003/103] i965/vec4/nir: allocate two registers for dvec3/dvec4
On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> From: Connor Abbott
>
> v2 (Curro):
>   - Do not special-case for a bit-size of 64, divide the bit_size by 32
>     instead.
>   - Use DIV_ROUND_UP so we can handle sub-32-bit types.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index ddeff2d..af76730 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -140,8 +140,8 @@ vec4_visitor::nir_emit_impl(nir_function_impl *impl)
>     foreach_list_typed(nir_register, reg, node, &impl->registers) {
>        unsigned array_elems =
>           reg->num_array_elems == 0 ? 1 : reg->num_array_elems;
> -
> -      nir_locals[reg->index] = dst_reg(VGRF, alloc.allocate(array_elems));
> +      unsigned num_regs = array_elems * DIV_ROUND_UP(reg->bit_size, 32);

constify, and this patch is

Reviewed-by: Ian Romanick

> +      nir_locals[reg->index] = dst_reg(VGRF, alloc.allocate(num_regs));
>     }
>
>     nir_ssa_values = ralloc_array(mem_ctx, dst_reg, impl->ssa_alloc);
> @@ -270,7 +270,8 @@ dst_reg
>  vec4_visitor::get_nir_dest(const nir_dest &dest)
>  {
>     if (dest.is_ssa) {
> -      dst_reg dst = dst_reg(VGRF, alloc.allocate(1));
> +      dst_reg dst =
> +         dst_reg(VGRF, alloc.allocate(DIV_ROUND_UP(dest.ssa.bit_size, 32)));
>        nir_ssa_values[dest.ssa.index] = dst;
>        return dst;
>     } else {
>
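The register count in this patch can be checked with a few lines of standalone C. This is a minimal sketch and not Mesa code: the macro mirrors the usual round-up division idiom, and regs_for is a hypothetical helper that follows the arithmetic in the hunk above (with the requested const applied).

```c
#include <assert.h>

/* Round-up integer division, the usual idiom behind a DIV_ROUND_UP macro. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical helper: how many 32-bit-per-channel registers a NIR
 * register needs, given its array length and per-channel bit size. */
static unsigned
regs_for(unsigned num_array_elems, unsigned bit_size)
{
   assert(bit_size > 0);

   /* A non-array register counts as a single element. */
   const unsigned array_elems = num_array_elems == 0 ? 1 : num_array_elems;

   /* 64-bit types take two registers per element; sub-32-bit types
    * still round up to one. */
   const unsigned num_regs = array_elems * DIV_ROUND_UP(bit_size, 32);
   return num_regs;
}
```

Dividing by 32 and rounding up covers both cases from the v2 notes: a bit size of 64 doubles the allocation without a special case, and a 16-bit type still gets one full register.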
Re: [Mesa-dev] [PATCH v2 103/103] i965/gen7: expose OpenGL 4.0 on Haswell
On 10/11/2016 02:02 AM, Iago Toral Quiroga wrote:
> ARB_gpu_shader_fp64 was the last piece missing. Notice that some
> hardware and kernel combinations do not support pipelined register
> writes, which are required for some OpenGL 4.0 features, in which
> case the driver won't expose 4.0.
> ---
>  src/mesa/drivers/dri/i965/intel_extensions.c | 2 ++
>  src/mesa/drivers/dri/i965/intel_screen.c     | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
> index 0491145..a291cd5 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -272,6 +272,8 @@ intelInitExtensions(struct gl_context *ctx)
>
>     if (brw->gen >= 8)
>        ctx->Const.GLSLVersion = 440;
> +   else if (brw->is_haswell)
> +      ctx->Const.GLSLVersion = 400;
>     else if (brw->gen >= 6)
>        ctx->Const.GLSLVersion = 330;
>     else
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
> index 9b23bac..1af7fe6 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -1445,7 +1445,7 @@ set_max_gl_versions(struct intel_screen *screen)
>        dri_screen->max_gl_es2_version = has_astc ? 32 : 31;
>        break;
>     case 7:
> -      dri_screen->max_gl_core_version = 33;
> +      dri_screen->max_gl_core_version = screen->devinfo.is_haswell ? 40 : 33;

I *think* this needs to take the pipelined register writes into
consideration. My understanding is if you say 40 here, then
glXCreateContextAttribs will allow creation of an OpenGL 4.0 context...
but the context may only be 3.3.

>        dri_screen->max_gl_compat_version = 30;
>        dri_screen->max_gl_es1_version = 11;
>        dri_screen->max_gl_es2_version = screen->devinfo.is_haswell ? 31 : 30;
>
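One way to read the review comment is that the advertised maximum core version should depend on both conditions, not on is_haswell alone. The sketch below is only an illustration of that idea under stated assumptions: struct gen7_caps and its has_pipelined_register_writes field are hypothetical stand-ins, not fields of the real intel_screen or devinfo structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified capability record for a gen7 device. */
struct gen7_caps {
   bool is_haswell;
   bool has_pipelined_register_writes; /* assumption: probed elsewhere */
};

/* Advertise 4.0 only when the context can actually be 4.0; otherwise
 * stay at 3.3 so that context creation cannot promise more than the
 * driver will deliver. */
static int
gen7_max_gl_core_version(const struct gen7_caps *caps)
{
   assert(caps != NULL);

   return (caps->is_haswell && caps->has_pipelined_register_writes) ? 40 : 33;
}
```

Keeping the check in one function would make it harder for the screen-level maximum and the per-context version to disagree.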
Re: [Mesa-dev] [PATCH 20/22] anv: move to using shared wsi code
On 19 October 2016 at 04:26, Emil Velikov wrote:
> Hi Dave,
>
> Thanks for doing this. It'll be great to get an Ack from the Intel
> devs, on the idea.
>
> Afaics with 22/22 in place you can drop the vk_alloc2/vk_free2
> functions since they are no longer used.

No, they are still used in the anv/radv code, just not in the wsi code.

>> src/mesa/main/tests/Makefile
>> src/util/Makefile
>> src/util/tests/hash_table/Makefile
>> - src/vulkan/Makefile])
>> + src/vulkan/Makefile
>> + src/vulkan/wsi/Makefile])
>>
> Just fold the new Makefile into the existing one ? It should be as
> simple as adding wsi/ prefix to files.
> Alternatively we can do that as a follow-up.

Actually we ended up not needing src/vulkan, so this ends up at one
line now.

Dave.
[Mesa-dev] [PATCH 2/2] vulkan/wsi: move some things into wsi_device.
From: Dave AirlieThis copies the allocator callbacks, along with normal callbacks and physical device into the wsi device. I'm a bit 50/50 on whether this makes things cleaner so far --- src/amd/vulkan/radv_wsi.c | 17 + src/amd/vulkan/radv_wsi_x11.c | 2 -- src/intel/vulkan/anv_wsi.c | 17 + src/intel/vulkan/anv_wsi_x11.c | 2 -- src/vulkan/wsi/wsi_common.h | 28 +--- src/vulkan/wsi/wsi_common_wayland.c | 32 +--- src/vulkan/wsi/wsi_common_x11.c | 23 +-- src/vulkan/wsi/wsi_common_x11.h | 1 - 8 files changed, 53 insertions(+), 69 deletions(-) diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c index 3c3abe9..56eacc5 100644 --- a/src/amd/vulkan/radv_wsi.c +++ b/src/amd/vulkan/radv_wsi.c @@ -37,19 +37,21 @@ radv_init_wsi(struct radv_physical_device *physical_device) memset(physical_device->wsi_device.wsi, 0, sizeof(physical_device->wsi_device.wsi)); +physical_device->wsi_device.alloc = physical_device->instance->alloc; +physical_device->wsi_device.physical_device = anv_physical_device_to_handle(physical_device); +physical_device->wsi_device.cbs = _cbs; + #ifdef VK_USE_PLATFORM_XCB_KHR - result = wsi_x11_init_wsi(_device->wsi_device, _device->instance->alloc); + result = wsi_x11_init_wsi(_device->wsi_device); if (result != VK_SUCCESS) return result; #endif #ifdef VK_USE_PLATFORM_WAYLAND_KHR - result = wsi_wl_init_wsi(_device->wsi_device, _device->instance->alloc, - radv_physical_device_to_handle(physical_device), -_cbs); + result = wsi_wl_init_wsi(_device->wsi_device); if (result != VK_SUCCESS) { #ifdef VK_USE_PLATFORM_XCB_KHR - wsi_x11_finish_wsi(_device->wsi_device, _device->instance->alloc); +wsi_x11_finish_wsi(_device->wsi_device); #endif return result; } @@ -62,10 +64,10 @@ void radv_finish_wsi(struct radv_physical_device *physical_device) { #ifdef VK_USE_PLATFORM_WAYLAND_KHR - wsi_wl_finish_wsi(_device->wsi_device, _device->instance->alloc); + wsi_wl_finish_wsi(_device->wsi_device); #endif #ifdef VK_USE_PLATFORM_XCB_KHR - 
wsi_x11_finish_wsi(_device->wsi_device, _device->instance->alloc); + wsi_x11_finish_wsi(_device->wsi_device); #endif } @@ -91,7 +93,6 @@ VkResult radv_GetPhysicalDeviceSurfaceSupportKHR( struct wsi_interface *iface = device->wsi_device.wsi[surface->platform]; return iface->get_support(surface, >wsi_device, - >instance->alloc, queueFamilyIndex, pSupported); } diff --git a/src/amd/vulkan/radv_wsi_x11.c b/src/amd/vulkan/radv_wsi_x11.c index 946b990..66c9bbb 100644 --- a/src/amd/vulkan/radv_wsi_x11.c +++ b/src/amd/vulkan/radv_wsi_x11.c @@ -44,7 +44,6 @@ VkBool32 radv_GetPhysicalDeviceXcbPresentationSupportKHR( return wsi_get_physical_device_xcb_presentation_support( >wsi_device, - >instance->alloc, queueFamilyIndex, connection, visual_id); } @@ -58,7 +57,6 @@ VkBool32 radv_GetPhysicalDeviceXlibPresentationSupportKHR( return wsi_get_physical_device_xcb_presentation_support( >wsi_device, - >instance->alloc, queueFamilyIndex, XGetXCBConnection(dpy), visualID); } diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c index f816735..3520300 100644 --- a/src/intel/vulkan/anv_wsi.c +++ b/src/intel/vulkan/anv_wsi.c @@ -36,19 +36,21 @@ anv_init_wsi(struct anv_physical_device *physical_device) memset(physical_device->wsi_device.wsi, 0, sizeof(physical_device->wsi_device.wsi)); + physical_device->wsi_device.alloc = physical_device->instance->alloc; + physical_device->wsi_device.physical_device = anv_physical_device_to_handle(physical_device); + physical_device->wsi_device.cbs = _cbs; + #ifdef VK_USE_PLATFORM_XCB_KHR - result = wsi_x11_init_wsi(_device->wsi_device, _device->instance->alloc); + result = wsi_x11_init_wsi(_device->wsi_device); if (result != VK_SUCCESS) return result; #endif #ifdef VK_USE_PLATFORM_WAYLAND_KHR - result = wsi_wl_init_wsi(_device->wsi_device, _device->instance->alloc, -anv_physical_device_to_handle(physical_device), -_cbs); + result = wsi_wl_init_wsi(_device->wsi_device); if (result != VK_SUCCESS) { #ifdef VK_USE_PLATFORM_XCB_KHR - 
wsi_x11_finish_wsi(_device->wsi_device, _device->instance->alloc); + wsi_x11_finish_wsi(_device->wsi_device); #endif return result; } @@ -61,10 +63,10 @@ void anv_finish_wsi(struct anv_physical_device *physical_device) { #ifdef VK_USE_PLATFORM_WAYLAND_KHR - wsi_wl_finish_wsi(_device->wsi_device,
[Mesa-dev] [PATCH 1/2] vulkan/wsi: use swapchain->alloc for destructors.
From: Dave AirlieAs Jason pointed out the app has to pass in the same thing, so just destroy using the one we copied earlier. Signed-off-by: Dave Airlie --- src/amd/vulkan/radv_wsi.c | 2 +- src/intel/vulkan/anv_wsi.c | 8 +--- src/vulkan/wsi/wsi_common.h | 4 ++-- src/vulkan/wsi/wsi_common_wayland.c | 10 +- src/vulkan/wsi/wsi_common_x11.c | 7 +++ 5 files changed, 12 insertions(+), 19 deletions(-) diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c index ba5c37b..3c3abe9 100644 --- a/src/amd/vulkan/radv_wsi.c +++ b/src/amd/vulkan/radv_wsi.c @@ -291,7 +291,7 @@ void radv_DestroySwapchainKHR( radv_DestroyFence(device, swapchain->fences[i], pAllocator); } - swapchain->destroy(swapchain, pAllocator); + swapchain->destroy(swapchain); } VkResult radv_GetSwapchainImagesKHR( diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c index 064581d..f816735 100644 --- a/src/intel/vulkan/anv_wsi.c +++ b/src/intel/vulkan/anv_wsi.c @@ -290,20 +290,14 @@ void anv_DestroySwapchainKHR( VkSwapchainKHR _swapchain, const VkAllocationCallbacks* pAllocator) { - ANV_FROM_HANDLE(anv_device, device, _device); ANV_FROM_HANDLE(wsi_swapchain, swapchain, _swapchain); - const VkAllocationCallbacks *alloc; - if (pAllocator) - alloc = pAllocator; - else - alloc = >alloc; for (unsigned i = 0; i < ARRAY_SIZE(swapchain->fences); i++) { if (swapchain->fences[i] != VK_NULL_HANDLE) anv_DestroyFence(_device, swapchain->fences[i], pAllocator); } - swapchain->destroy(swapchain, alloc); + swapchain->destroy(swapchain); } VkResult anv_GetSwapchainImagesKHR( diff --git a/src/vulkan/wsi/wsi_common.h b/src/vulkan/wsi/wsi_common.h index ee67511..1f4e0ae 100644 --- a/src/vulkan/wsi/wsi_common.h +++ b/src/vulkan/wsi/wsi_common.h @@ -54,8 +54,8 @@ struct wsi_swapchain { const struct wsi_image_fns *image_fns; VkFence fences[3]; - VkResult (*destroy)(struct wsi_swapchain *swapchain, - const VkAllocationCallbacks *pAllocator); + VkResult (*destroy)(struct wsi_swapchain *swapchain); + VkResult 
(*get_images)(struct wsi_swapchain *swapchain, uint32_t *pCount, VkImage *pSwapchainImages); VkResult (*acquire_next_image)(struct wsi_swapchain *swap_chain, diff --git a/src/vulkan/wsi/wsi_common_wayland.c b/src/vulkan/wsi/wsi_common_wayland.c index 32a0a51..ecb1ab5 100644 --- a/src/vulkan/wsi/wsi_common_wayland.c +++ b/src/vulkan/wsi/wsi_common_wayland.c @@ -647,19 +647,19 @@ wsi_wl_image_init(struct wsi_wl_swapchain *chain, } static VkResult -wsi_wl_swapchain_destroy(struct wsi_swapchain *wsi_chain, - const VkAllocationCallbacks *pAllocator) +wsi_wl_swapchain_destroy(struct wsi_swapchain *wsi_chain) { struct wsi_wl_swapchain *chain = (struct wsi_wl_swapchain *)wsi_chain; for (uint32_t i = 0; i < chain->image_count; i++) { if (chain->images[i].buffer) - chain->base.image_fns->free_wsi_image(chain->base.device, pAllocator, + chain->base.image_fns->free_wsi_image(chain->base.device, + >base.alloc, chain->images[i].image, chain->images[i].memory); } - vk_free(pAllocator, chain); + vk_free(>base.alloc, chain); return VK_SUCCESS; } @@ -747,7 +747,7 @@ wsi_wl_surface_create_swapchain(VkIcdSurfaceBase *icd_surface, return VK_SUCCESS; fail: - wsi_wl_swapchain_destroy(>base, pAllocator); + wsi_wl_swapchain_destroy(>base); return result; } diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c index 241ef42..3bb8f35 100644 --- a/src/vulkan/wsi/wsi_common_x11.c +++ b/src/vulkan/wsi/wsi_common_x11.c @@ -706,16 +706,15 @@ x11_image_finish(struct x11_swapchain *chain, } static VkResult -x11_swapchain_destroy(struct wsi_swapchain *anv_chain, - const VkAllocationCallbacks *pAllocator) +x11_swapchain_destroy(struct wsi_swapchain *anv_chain) { struct x11_swapchain *chain = (struct x11_swapchain *)anv_chain; for (uint32_t i = 0; i < chain->image_count; i++) - x11_image_finish(chain, pAllocator, >images[i]); + x11_image_finish(chain, >base.alloc, >images[i]); xcb_unregister_for_special_event(chain->conn, chain->special_event); - vk_free(pAllocator, chain); + 
vk_free(>base.alloc, chain); return VK_SUCCESS; } -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
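The pattern in this patch is that the object keeps a copy of the allocator it was created with, so the destructor no longer needs an allocator argument. Below is a minimal standalone sketch of that pattern. It does not use the Vulkan headers: struct alloc_cbs is a simplified stand-in for VkAllocationCallbacks, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for an allocation-callbacks structure. */
struct alloc_cbs {
   void *user_data;
   void *(*alloc_fn)(void *user_data, size_t size);
   void (*free_fn)(void *user_data, void *ptr);
};

struct swapchain_sketch {
   struct alloc_cbs alloc; /* copy taken at creation time */
   int image_count;
};

static struct swapchain_sketch *
swapchain_create(const struct alloc_cbs *cbs, int image_count)
{
   assert(cbs != NULL);

   struct swapchain_sketch *chain =
      cbs->alloc_fn(cbs->user_data, sizeof(*chain));
   if (chain == NULL)
      return NULL;

   chain->alloc = *cbs;
   chain->image_count = image_count;
   return chain;
}

/* No allocator parameter: the stored copy is used instead. */
static void
swapchain_destroy(struct swapchain_sketch *chain)
{
   /* Copy the callbacks out first, since they live in the memory
    * that is about to be released. */
   const struct alloc_cbs cbs = chain->alloc;
   cbs.free_fn(cbs.user_data, chain);
}

/* Counting allocator, used only to show that create and destroy pair up. */
static void *
counting_alloc(void *user_data, size_t size)
{
   ++*(int *)user_data;
   return malloc(size);
}

static void
counting_free(void *user_data, void *ptr)
{
   --*(int *)user_data;
   free(ptr);
}
```

This relies on the point made in the commit message: the application has to pass the same allocator at destroy time anyway, so using the stored copy does not change behaviour for a conforming caller.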
[Mesa-dev] [rfc] wsi device cleanups.
Jason, these should address the comments you made. I'm not sure these are a win over what was there, but I gave it a go. If you like them I've no objections.

Dave.
Re: [Mesa-dev] [PATCH] egl/dri2: add a libname to dlopen for OpenBSD
On Tue, Oct 18, 2016 at 04:24:20PM +0100, Emil Velikov wrote: > On 18 October 2016 at 00:58, Jonathan Graywrote: > > On Mon, Oct 17, 2016 at 05:34:02PM +0100, Emil Velikov wrote: > >> On 17 October 2016 at 16:39, Eric Engestrom > >> wrote: > >> > On Monday, 2016-10-17 22:53:20 +1100, Jonathan Gray wrote: > >> >> On Mon, Oct 17, 2016 at 12:39:11PM +0100, Emil Velikov wrote: > >> >> > On 17 October 2016 at 10:53, Eric Engestrom > >> >> > wrote: > >> >> > > On Sunday, 2016-10-16 16:38:35 +1100, Jonathan Gray wrote: > >> >> > >> On OpenBSD try to dlopen 'libglapi.so', ld.so will find > >> >> > >> the highest major/minor version and open it in this case. > >> >> > >> > >> >> > >> Avoids '#error Unknown glapi provider for this platform' at build > >> >> > >> time. > >> >> > >> > >> >> > >> Signed-off-by: Jonathan Gray > >> >> > > > >> >> > > LGTM, and I guess the other *BSD will want the same since 7a9c92d0 > >> >> > > broke > >> >> > > them too. > >> >> > > > >> >> > I'm not 100% sure about that. OpenBSD (unlike other BSD) did bump the > >> >> > major when the ABI breaks due to 'internal' changes - think of > >> >> > off_t/time_t on 32 vs 64bit systems and alike. > >> >> > > >> >> > Unlike Linux kernel/distros, BSDs tend to be more relaxed when in > >> >> > comes to ABI, I believe. Don't quote me on that one ;-) > >> >> > >> >> OpenBSD tends to favour simplified interfaces over backwards > >> >> compatiblity > >> >> and is more like a research system in that respect. As the kernel > >> >> and userland are one source tree ioctl compat largely doesn't exist. > >> >> System calls get deprecated and removed over the course of a few > >> >> releases. > >> >> So we didn't go through the pain of duplicated systems calls for off_t > >> >> as mentioned, and don't go in for symbol versioning. Just major.minor > >> >> library versioning, which is roughly symbol removals, major crank, > >> >> symbol additions minor crank. 
> >> >> > >> >> I believe FreeBSD tends to go in for backwards compatibility more > >> >> but am not familiar with the details. They also have a different ld.so. > >> >> > >> >> Perhaps an else case for 'libglapi.so.0' would be appropriate for all > >> >> the other various unices instead of the #error ? > >> > > >> > Yeah actually, I'm thinking reverting this hunk of 7a9c92d0 might be a > >> > better, > >> > to avoid the potentially huge list of every *BSD and other Unix: > >> > > >> Fwiw I've intentionally added the hunk since I was a bit lazy to check > >> if the BSD(s?)/Solaris/others have bumped the major locally. Having a > >> closer look that's not the case, so indeed we can add revert to > >> libglapi.so.0 in the else statement. > >> > >> Jonathan, how about we with the above instead ? > > > > At the moment OpenBSD has libglapi.so.0.2 for Mesa 11.2.2. > > New versions of Mesa add new shared_dispatch_stub_* symbols, > > which the minor would crank for. > > > Don't think we [intentionally] added any symbols for a long while. Comparing 11.2.2 libglapi and the latest Mesa I see: Dynamic export changes: added: shared_dispatch_stub_1323 shared_dispatch_stub_1324 shared_dispatch_stub_1325 shared_dispatch_stub_1326 shared_dispatch_stub_1327 shared_dispatch_stub_1328 shared_dispatch_stub_1329 Perhaps this is unique to the non-tls dispatch case though. > > > I'd prefer the diff I mailed for OpenBSD for if the major version > > should crank for some reason. > Let's worry about that if/when it happens ? sure > > Emil > /me lands the rest of the patches thankS ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
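The outcome of the thread can be summarised as a per-platform choice of the name handed to dlopen(). The sketch below shows only the two cases discussed here, an unversioned name on OpenBSD and the libglapi.so.0 fallback for other Unix-likes. It is not the actual egl/dri2 code, other platforms are left out, and both helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: library name to pass to dlopen(). */
static const char *
glapi_libname(void)
{
#if defined(__OpenBSD__)
   /* Unversioned: per the thread, ld.so picks the highest major.minor. */
   return "libglapi.so";
#else
   /* Fallback suggested in the thread instead of a build-time #error. */
   return "libglapi.so.0";
#endif
}

/* Hypothetical helper: true if the name pins a major version, which is
 * the case that would need updating if the major were ever cranked. */
static int
has_soname_version(const char *name)
{
   assert(name != NULL);
   return strstr(name, ".so.") != NULL;
}
```

The unversioned name avoids touching the source when the library version changes on OpenBSD, at the cost of relying on that platform's loader behaviour.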
Re: [Mesa-dev] [PATCH] st/va: disable cabac for h264 baseline profile
Christian König wrote: Am 18.10.2016 um 15:42 schrieb Andy Furniss: Andy Furniss wrote: Christian König wrote: Am 18.10.2016 um 11:19 schrieb Andy Furniss: boyuan.zh...@amd.com wrote: From: Boyuan Zhangcabac is only supported in the h264 main and higher profiles So shouldn't there be code allows it if the user space doesn't set baseline? I don't know how in gstreamer as it seems to try to use b-frames if you use other than baseline which doesn't work. With avconv it is possible to call main/high and set b-frames to 0. I know it's technically correct spec wise, but seems a shame as it costs a fair bit in "free" efficiency. On Windows the raptor game recording app produces files flagged as high with cabac - but without b-frames. The problem is that it can easily break decoders. CABAC is simply not allowed in a stream flagged as baseline compliant. But with ffmpeg/avconv I can make a stream flagged as main/high even if it's really baseline + CABAC. I guess Windows may vary but the test I did seems to take this pragmatic approach, as it seems do other h/w encoders eg. smartphone output. It's a pity that we don't support B-frames any more. Anymore? Now I am curious, seems to work with omx (cqp single instance) With that in place we could easily advertise support for mainline profile. MBAFF/PAFF? Sorry if that came over as being pedantic, silly as I think pragmatism is the way to go and I know intel advertise main/high, but doubt they do interlaced. Exactly, I mean we are talking about features to support encoding into interlaced format. Is anybody still actively doing that? Well, broadcasters, but I guess "users" never did anyway. But even then, it's not so much of a problem advertising mainline profile and then not using MBAFF/PAFF. But when you advertise B-frames and then can't encode it you got a serious problem because your frames are not in the right order any more :) Yea, it's a shame - was there a reason that b-frame support was abandoned? 
In fact vce vaapi is currently advertising them as well (I did mention it in some thread). Good for letting ffmpeg flag as such while not using b-frames, not so good for gstreamer as they have changed the default to high so old command lines will not explicitly fail, but will produce junk. I see va.h has a cabac switch and gstreamer exposes it - though it's not read by the driver. Maybe if that were hooked up then users could turn it on and profit :-). Yeah, but again turning it on while the SPS/PPS only advertise the stream to be baseline compliant is a clear violation of the codec standard. (Is that actually encodable in the stream? or does the encoder switch to some higher level automatically if you use it?). I was wrong about va.h having a switch, it was something different. gst-inspect-1.0 vaapih264enc shows it has one - I don't know what, if anything, it actually does. I am not trying to say here that flagging as constrained baseline and using CABAC is in any way correct/legal. It just seems a shame to lose CABAC = 10-20% less bitrate "for ever". Looking at phone vids and vce from windows they get to use CABAC while not using b-frames by flagging as main/high. It would be good if there were a way to allow Linux vce to be the same. I don't know how - if you can only advertise baseline, other than user apps quirking their behavior depending on driver name, which I guess some do anyway. Regards, Christian. Christian. 
Signed-off-by: Boyuan Zhang --- src/gallium/state_trackers/va/picture.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/gallium/state_trackers/va/picture.c b/src/gallium/state_trackers/va/picture.c index eae5dc4..db08a3c 100644 --- a/src/gallium/state_trackers/va/picture.c +++ b/src/gallium/state_trackers/va/picture.c @@ -110,7 +110,6 @@ getEncParamPreset(vlVaContext *context) context->desc.h264enc.motion_est.enc_ime2_search_range_y = 0x0004; //pic control preset - context->desc.h264enc.pic_ctrl.enc_cabac_enable = 0x0001; context->desc.h264enc.pic_ctrl.enc_constraint_set_flags = 0x0040; //rate control ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
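The question raised in the thread is whether CABAC could be enabled conditionally instead of being removed outright. A minimal sketch of such a check follows, assuming the driver knew which profile the application requested. The enum and function are hypothetical and are not part of the st/va state tracker; the sketch only encodes the rule stated in the commit message, that CABAC belongs to main and higher profiles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical profile enumeration; not the VA-API or pipe enums. */
enum h264_profile_sketch {
   H264_SKETCH_BASELINE,
   H264_SKETCH_MAIN,
   H264_SKETCH_HIGH,
};

/* CABAC may only be signalled when the stream is flagged as main or
 * higher; a baseline-flagged stream must stay on CAVLC. */
static bool
cabac_allowed(enum h264_profile_sketch profile)
{
   assert(profile >= H264_SKETCH_BASELINE && profile <= H264_SKETCH_HIGH);
   return profile != H264_SKETCH_BASELINE;
}
```

Whether the encoder can usefully advertise main or high without B-frame support is the open point of the discussion, so this check alone would not settle it.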
[Mesa-dev] [PATCH] mesa: remove unused LocalSizeVariable
Cc: Samuel Pitoiset
Cc: Kenneth Graunke
---
 src/mesa/main/mtypes.h    | 5 -
 src/mesa/main/shaderapi.c | 1 -
 2 files changed, 6 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index ff20226..f4a9edd 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2078,11 +2078,6 @@ struct gl_compute_program
     * Size of shared variables accessed by the compute shader.
     */
    unsigned SharedSize;
-
-   /**
-    * Whether a variable work group size has been specified.
-    */
-   bool LocalSizeVariable;
 };


diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index c40bb2d..1af1c3f 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -2212,7 +2212,6 @@ _mesa_copy_linked_program_data(gl_shader_stage type,
       for (i = 0; i < 3; i++)
          dst_cp->LocalSize[i] = src->Comp.LocalSize[i];
       dst_cp->SharedSize = src->Comp.SharedSize;
-      dst_cp->LocalSizeVariable = src->Comp.LocalSizeVariable;
       break;
    }
    default:
--
2.7.4
Re: [Mesa-dev] Mesa (master): glsl: Immediately inline built-ins rather than generating calls.
On Tuesday, October 18, 2016 4:38:17 PM PDT Brian Paul wrote:
> Hi Ken,
>
> I found that this patch causes a regression. There's a Windows medical
> app which fails to link some shaders since this change.
>
> Basically, when the gl_Position VS input is declared as invariant the
> linker fails with:
>
> error: declarations for uniform `gl_ModelViewProjectionMatrix' have
> mismatching invariant qualifiers
>
> I haven't investigated how to fix this. I'm hoping you can see a simple
> fix.
>
> The attached piglit shader_runner script demonstrates the issue. Passes
> w/ NVIDIA.
>
> Thanks!
>
> -Brian

Oh, sorry about that! Here are two possible fixes:

https://cgit.freedesktop.org/~kwg/mesa/commit/?h=invariant-fix
https://cgit.freedesktop.org/~kwg/mesa/commit/?h=invariant-fix-2

They're both kind of hacks...but the whole invariant propagation pass is
kind of a hack, and we've got some other hacks in place already.
So...maybe best to pile another one on. Not sure which though. Maybe
Jason or Curro will have an opinion...

--Ken
Re: [Mesa-dev] [PATCH 08/11] anv: move to using vk_alloc helpers.
On 19 October 2016 at 03:18, Emil Velikov wrote:
> Hi Dave,
>
> On 17 October 2016 at 03:07, Dave Airlie wrote:
>> From: Dave Airlie
>>
>> This moves all the alloc/free in anv to the generic helpers.
>>
>> Signed-off-by: Dave Airlie
>> ---
>>  src/intel/vulkan/anv_batch_chain.c    | 40 +++---
>>  src/intel/vulkan/anv_cmd_buffer.c     | 22 -
>>  src/intel/vulkan/anv_descriptor_set.c | 12 -
>>  src/intel/vulkan/anv_device.c         | 26 ++--
>>  src/intel/vulkan/anv_image.c          | 14 +--
>>  src/intel/vulkan/anv_intel.c          | 4 +--
>>  src/intel/vulkan/anv_pass.c           | 10
>>  src/intel/vulkan/anv_pipeline.c       | 6 ++---
>>  src/intel/vulkan/anv_pipeline_cache.c | 8 +++---
>>  src/intel/vulkan/anv_private.h        | 46 +--
>>  src/intel/vulkan/anv_query.c          | 6 ++---
>>  src/intel/vulkan/anv_wsi.c            | 2 +-
>>  src/intel/vulkan/anv_wsi_wayland.c    | 16 ++--
>>  src/intel/vulkan/anv_wsi_x11.c        | 22 -
>>  src/intel/vulkan/gen7_pipeline.c      | 4 +--
>>  src/intel/vulkan/gen8_pipeline.c      | 4 +--
>>  src/intel/vulkan/genX_pipeline.c      | 6 ++---
>>  src/intel/vulkan/genX_state.c         | 2 +-
>>  18 files changed, 103 insertions(+), 147 deletions(-)
>>
> Wondering if one shouldn't include the new header only where needed ?
> Quick grep shows 33 files which include anv_private.h of which (as per
> above) ~half only need vk_alloc.h.

Don't really see the benefit, splitting anv_private.h would be a bigger
job I would think.

Dave.
Re: [Mesa-dev] Mesa (master): glsl: Immediately inline built-ins rather than generating calls.
Hi Ken, I found that this patch causes a regression. There's a Windows medical app which fails to link some shaders since this change. Basically, when the gl_Position VS input is declared as invariant the linker fails with: error: declarations for uniform `gl_ModelViewProjectionMatrix' have mismatching invariant qualifiers I haven't investigated how to fix this. I'm hoping you can see a simple fix. The attached piglit shader_runner script demonstrates the issue. Passes w/ NVIDIA. Thanks! -Brian On 09/23/2016 05:45 PM, Kenneth Graunke wrote: Module: Mesa Branch: master Commit: b04ef3c08a288a5857349c9e582ee2718fa562f7 URL: https://urldefense.proofpoint.com/v2/url?u=http-3A__cgit.freedesktop.org_mesa_mesa_commit_-3Fid-3Db04ef3c08a288a5857349c9e582ee2718fa562f7=CwIGaQ=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8=m7lwXZjH2_UAMD5u1FWrl6EmaAyly794Od4UBt09XC4=k1P13rDoBzgIU78tLDWd_Qo9GTSr_IX2GSRtdfMrDeI= Author: Kenneth GraunkeDate: Fri May 30 23:52:22 2014 -0700 glsl: Immediately inline built-ins rather than generating calls. In the past, we imported the prototypes of built-in functions, generated calls to those, and waited until link time to resolve the calls and import the actual code for the built-in functions. This severely limited our compile-time optimization opportunities: even trivial functions like dot() were represented as function calls. We also had no way of reasoning about those calls; they could have been 1,000 line functions with side-effects for all we knew. Practically all built-in functions are trivial translations to ir_expression opcodes, so it makes sense to just generate those inline. Since we eventually inline all functions anyway, we may as well just do it for all built-in functions. There's only one snag: built-in functions that refer to built-in global variables need those remapped to the variables in the shader being compiled, rather than the ones in the built-in shader. 
Currently, ftransform() is the only function matching those criteria, so it seemed easier to just make it a special case. On Skylake: total instructions in shared programs: 12023491 -> 12024010 (0.00%) instructions in affected programs: 77595 -> 78114 (0.67%) helped: 97 HURT: 309 total cycles in shared programs: 137239044 -> 137295498 (0.04%) cycles in affected programs: 16714026 -> 16770480 (0.34%) helped: 4663 HURT: 4923 while these statistics are in the wrong direction, the number of hurt programs is small (309 / 41282 = 0.75%), and I don't think anything can be done about it. A change like this significantly alters the order in which optimizations are performed. Signed-off-by: Kenneth Graunke Reviewed-by; Ian Romanick --- src/compiler/glsl/ast_function.cpp | 46 ++ 1 file changed, 22 insertions(+), 24 deletions(-) diff --git a/src/compiler/glsl/ast_function.cpp b/src/compiler/glsl/ast_function.cpp index 7e62ab7..ac3b52d 100644 --- a/src/compiler/glsl/ast_function.cpp +++ b/src/compiler/glsl/ast_function.cpp @@ -430,7 +430,8 @@ generate_call(exec_list *instructions, ir_function_signature *sig, exec_list *actual_parameters, ir_variable *sub_var, ir_rvalue *array_idx, - struct _mesa_glsl_parse_state *state) + struct _mesa_glsl_parse_state *state, + bool inline_immediately) { void *ctx = state; exec_list post_call_conversions; @@ -542,6 +543,10 @@ generate_call(exec_list *instructions, ir_function_signature *sig, ir_call *call = new(ctx) ir_call(sig, deref, actual_parameters, sub_var, array_idx); instructions->push_tail(call); + if (inline_immediately) { + call->generate_inline(call); + call->remove(); + } /* Also emit any necessary out-parameter conversions. 
*/ instructions->append_list(_call_conversions); @@ -557,19 +562,18 @@ match_function_by_name(const char *name, exec_list *actual_parameters, struct _mesa_glsl_parse_state *state) { - void *ctx = state; ir_function *f = state->symbols->get_function(name); ir_function_signature *local_sig = NULL; ir_function_signature *sig = NULL; /* Is the function hidden by a record type constructor? */ if (state->symbols->get_type(name)) - goto done; /* no match */ + return sig; /* no match */ /* Is the function hidden by a variable (impossible in 1.10)? */ if (!state->symbols->separate_function_namespace && state->symbols->get_variable(name)) - goto done; /* no match */ + return sig; /* no match */ if (f != NULL) { /* In desktop GL, the presence of a user-defined signature hides any @@ -583,31 +587,15 @@ match_function_by_name(const char *name, sig = local_sig =
Re: [Mesa-dev] [PATCH 1/4] configure.ac: print whether GBM is enabled
On Wednesday, 2016-10-19 00:00:02 +0200, Marek Olšák wrote:
> From: Marek Olšák

Series is:
Reviewed-by: Eric Engestrom

>
> ---
> configure.ac | 5 +
> 1 file changed, 5 insertions(+)
>
> diff --git a/configure.ac b/configure.ac
> index 8e779d4..bc9b732 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2860,20 +2860,25 @@ if test "$enable_egl" = yes; then
> egl_drivers=""
> if test "x$HAVE_EGL_DRIVER_DRI2" != "x"; then
> egl_drivers="$egl_drivers builtin:egl_dri2"
> fi
> if test "x$HAVE_EGL_DRIVER_DRI3" != "x"; then
> egl_drivers="$egl_drivers builtin:egl_dri3"
> fi
>
> echo "EGL drivers:$egl_drivers"
> fi
> +if test "x$enable_gbm" = xyes; then
> +echo "GBM: yes"
> +else
> +echo "GBM: no"
> +fi
>
> # Vulkan
> echo ""
> if test "x$VULKAN_DRIVERS" != x; then
> echo "Vulkan drivers: $VULKAN_DRIVERS"
> echo "Vulkan ICD dir: $VULKAN_ICD_INSTALL_DIR"
> else
> echo "Vulkan drivers: no"
> fi
>
> --
> 2.7.4
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code
On 10/18/2016 10:12 AM, Jan Ziak wrote:
>> Regarding C++ templates, the compiler doesn't use them. If u_vector
>> (Dave Airlie?) provides the same functionality as your array, I
>> suggest we use u_vector instead.
>
> Let me repeat what you just wrote, because it is unbelievable: You are
> advising the use of non-templated collection types in C++ code.

Are you able to find any templates anywhere in the GLSL compiler? I don't think his statement was ambiguous.

>> If you can't use u_vector, you should
>> ask for approval from GLSL compiler leads (e.g. Ian Romanick or
>> Kenneth Graunke) to use C++ templates.
>
> - You are talking about coding rules some Mesa developers agreed upon
> and didn't bother writing down for other developers to read

It was mostly written down, but it's not documented in the code base. It seems impossible to even get current, de facto practices documented. It's one of the few things in Mesa that really does get bike shedded.

Before the current GLSL compiler, there was no C++ in Mesa at all. While developing the compiler, I found that I was re-implementing numerous C++ features by hand in C. It felt pretty insane. Why am I filling out all of these virtual function tables by hand?

At the same time, I also observed that almost 100% of shipping, production-quality compilers were implemented using C++. The single exception was GCC. The need for GCC to bootstrap on minimal, sometimes dire, C compilers was the one thing keeping C++ out of the GCC code base. It wasn't even that long ago that core parts of GCC had to support pre-C89 compilers. As far as I am aware, they have since started using C++ too. Who am I to be so bold as to declare that everyone shipping a C compiler is wrong?

In light of that, I opened a discussion about using C++ in the compiler. Especially at that time (2008-ish), nobody working on Mesa was particularly skilled at C++.
I had used it some, and, in the mid-90's, had some really, really bad experiences with the implementations and side-effects of various language features. I still have nightmares about trying to use templates in GCC 2.4.2.

There are quite a few C++ features that are really easy to misuse. There are also a lot of subtleties in the language that very few people really understand. I don't mean this in a pejorative way, but there was and continues to be a lot of FUD around C++. I think a lot of this comes from the "Old Woman Who Swallowed a Fly" nature of solving C++ development problems. You have a problem. The only way to solve that problem is to use another language feature that you may or may not understand how to use safely. You use that feature to solve your problem. Use of that feature presents a new problem. The only way to solve the new problem is to use yet another language feature that you may or may not understand how to use safely. Pretty soon nobody knows how anything in the code works.

After quite a bit of discussion on the mesa-dev list, on #dri-devel, and face-to-face at XDC, we decided to use C++ with some restrictions. The main restriction was that C++ would be limited to the GLSL compiler stack. The other restrictions were roughly similar to the embedded C++ subset.

- No exceptions.
- No RTTI.
- No multiple inheritance.
- No operator overloading. It could be argued that our use of placement new deviates from this. In the previous metaphor, I think this was either the spider or the bird.
- No templates.

There are other restrictions (e.g., no STL) that come as natural consequences of these. Our goal was that any existing Mesa developer should be able to read any piece of new C++ code and know what it was doing.

I feel like, due to our collective ignorance about the language, we may have been slightly too restrictive.
It seems like we could have used templates in some very, very restricted ways to enable things like iterators that would have saved typing, encouraged refactoring, and made the code more understandable. Instead we have a proliferation of foreach macros (or callbacks), and every data structure is a linked list. It's difficult to say whether it would have made things strictly better or led us to swallow a bird, a cat, a dog...

I also feel like that ship has sailed. When NIR was implemented using pure C, going so far as to re-invent constructors using macros, the chances of using more C++ faded substantially.

If, and that's a really, really big if, additional C++ were to be used, it would have to be preceded by patches to docs/devinfo.html that documented:

- What features were to be used.
- Why use of those features benefits the code base. Specifically, why use of the new feature is substantially better than a different implementation that does not use the feature.
- Any restrictions on the use of those features.

Such a discussion may produce additional alternatives.

> - I am not willing to use u_vector in C++ code
Re: [Mesa-dev] [PATCH] draw: improve vertex fetch (v2)
On 15/10/16 02:54, srol...@vmware.com wrote: From: Roland ScheideggerThe per-element fetch has quite some calculations which are constant, these can be moved outside both the per-element as well as the main shader loop (llvm can figure out it's constant mostly on its own, however this can have a significant compile time cost). Similarly, it looks easier swapping the fetch loops (outer loop per attrib, inner loop filling up the per vertex elements - this way the aos->soa conversion also can be done per attrib and not just at the end though again this doesn't really make much of a difference in the generated code). (This would also make it possible to vectorize the calculations leading to the fetches.) There's also some minimal change simplifying the overflow math slightly. All in all, the generated code seems to look slightly simpler (depending on the actual vs), but more importantly I've seen a significant reduction in compile times for some vs (albeit with old (3.3) llvm version, and the time reduction is only really for the optimizations run on the IR). v2: adapt to other draw change. No changes with piglit. 
--- src/gallium/auxiliary/draw/draw_llvm.c | 190 +++-- .../auxiliary/gallivm/lp_bld_arit_overflow.c | 24 +++ .../auxiliary/gallivm/lp_bld_arit_overflow.h | 6 + 3 files changed, 134 insertions(+), 86 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c index 3b56856..2f82d9d 100644 --- a/src/gallium/auxiliary/draw/draw_llvm.c +++ b/src/gallium/auxiliary/draw/draw_llvm.c @@ -659,85 +659,42 @@ generate_vs(struct draw_llvm_variant *variant, static void generate_fetch(struct gallivm_state *gallivm, struct draw_context *draw, - LLVMValueRef vbuffers_ptr, + const struct util_format_description *format_desc, + LLVMValueRef vb_stride, + LLVMValueRef stride_fixed, + LLVMValueRef map_ptr, + LLVMValueRef buffer_size_adj, + LLVMValueRef ofbit, LLVMValueRef *res, - struct pipe_vertex_element *velem, - LLVMValueRef vbuf, - LLVMValueRef index, - LLVMValueRef instance_id, - LLVMValueRef start_instance) + LLVMValueRef index) { - const struct util_format_description *format_desc = - util_format_description(velem->src_format); LLVMValueRef zero = LLVMConstNull(LLVMInt32TypeInContext(gallivm->context)); LLVMBuilderRef builder = gallivm->builder; - LLVMValueRef indices = - LLVMConstInt(LLVMInt64TypeInContext(gallivm->context), - velem->vertex_buffer_index, 0); - LLVMValueRef vbuffer_ptr = LLVMBuildGEP(builder, vbuffers_ptr, - , 1, ""); - LLVMValueRef vb_stride = draw_jit_vbuffer_stride(gallivm, vbuf); - LLVMValueRef vb_buffer_offset = draw_jit_vbuffer_offset(gallivm, vbuf); - LLVMValueRef map_ptr = draw_jit_dvbuffer_map(gallivm, vbuffer_ptr); - LLVMValueRef buffer_size = draw_jit_dvbuffer_size(gallivm, vbuffer_ptr); LLVMValueRef stride; LLVMValueRef buffer_overflowed; - LLVMValueRef needed_buffer_size; LLVMValueRef temp_ptr = lp_build_alloca(gallivm, lp_build_vec_type(gallivm, lp_float32_vec4_type()), ""); - LLVMValueRef ofbit = NULL; struct lp_build_if_state if_ctx; - if (velem->src_format == PIPE_FORMAT_NONE) { + if 
(format_desc->format == PIPE_FORMAT_NONE) { *res = lp_build_const_vec(gallivm, lp_float32_vec4_type(), 0); return; } - if (velem->instance_divisor) { - /* Index is equal to the start instance plus the number of current - * instance divided by the divisor. In this case we compute it as: - * index = start_instance + (instance_id / divisor) - */ - LLVMValueRef current_instance; - current_instance = LLVMBuildUDiv(builder, instance_id, - lp_build_const_int32(gallivm, velem->instance_divisor), - "instance_divisor"); - index = lp_build_uadd_overflow(gallivm, start_instance, - current_instance, ); - } - stride = lp_build_umul_overflow(gallivm, vb_stride, index, ); - stride = lp_build_uadd_overflow(gallivm, stride, vb_buffer_offset, ); - stride = lp_build_uadd_overflow( - gallivm, stride, - lp_build_const_int32(gallivm, velem->src_offset), ); - needed_buffer_size = lp_build_uadd_overflow( - gallivm, stride, - lp_build_const_int32(gallivm, - util_format_get_blocksize(velem->src_format)), - ); + stride = lp_build_uadd_overflow(gallivm, stride, stride_fixed, ); buffer_overflowed = LLVMBuildICmp(builder, LLVMIntUGT, - needed_buffer_size, buffer_size, +
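For readers following the overflow math in this patch, here is a minimal scalar sketch in plain C of the bounds check that the generated IR performs. It is not the LLVM builder code: `umul_overflow`, `uadd_overflow` and `fetch_offset` are hypothetical stand-ins for `lp_build_umul_overflow` / `lp_build_uadd_overflow`, and `stride_fixed` is assumed to be the per-attribute constant (buffer offset plus element offset) that the patch hoists out of the per-element code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scalar analogue of lp_build_umul_overflow(): returns the
 * wrapped product and sets *ofbit if the true product needs more than
 * 32 bits. The flag is only ever set, never cleared, so it accumulates
 * across a chain of operations. */
static uint32_t
umul_overflow(uint32_t a, uint32_t b, bool *ofbit)
{
   uint64_t wide = (uint64_t)a * b;
   if (wide > UINT32_MAX)
      *ofbit = true;
   return (uint32_t)wide;
}

/* Hypothetical scalar analogue of lp_build_uadd_overflow(). */
static uint32_t
uadd_overflow(uint32_t a, uint32_t b, bool *ofbit)
{
   uint32_t sum = a + b;
   if (sum < a)            /* unsigned wrap-around */
      *ofbit = true;
   return sum;
}

/* Returns true and stores the byte offset if an element of elem_size
 * bytes for vertex `index` lies entirely inside the buffer. */
static bool
fetch_offset(uint32_t vb_stride, uint32_t index, uint32_t stride_fixed,
             uint32_t elem_size, uint32_t buffer_size, uint32_t *offset)
{
   bool ofbit = false;
   uint32_t off = umul_overflow(vb_stride, index, &ofbit);
   off = uadd_overflow(off, stride_fixed, &ofbit);
   uint32_t needed = uadd_overflow(off, elem_size, &ofbit);

   if (ofbit || needed > buffer_size)
      return false;

   *offset = off;
   return true;
}
```

With a 64-byte buffer, a 16-byte stride, a fixed offset of 4 and 12-byte elements, vertices 0 through 3 pass the check and vertex 4 does not. Only the multiply and the first add depend on the vertex index, which is why the remaining terms can be computed once per attribute.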
Re: [Mesa-dev] [PATCH 1/6] util: add vector util code.
On 17 October 2016 at 18:09, Nicolai Hähnlewrote: > On 14.10.2016 05:16, Dave Airlie wrote: >> >> From: Dave Airlie >> >> This is ported from anv, both anv and radv can share this. >> >> Signed-off-by: Dave Airlie >> --- >> src/util/Makefile.sources | 4 +- >> src/util/u_vector.c | 98 >> +++ >> src/util/u_vector.h | 85 >> 3 files changed, 186 insertions(+), 1 deletion(-) >> create mode 100644 src/util/u_vector.c >> create mode 100644 src/util/u_vector.h >> >> diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources >> index 8b17bcf..b7b1e91 100644 >> --- a/src/util/Makefile.sources >> +++ b/src/util/Makefile.sources >> @@ -35,7 +35,9 @@ MESA_UTIL_FILES :=\ >> strtod.h \ >> texcompress_rgtc_tmp.h \ >> u_atomic.h \ >> - u_endian.h >> + u_endian.h \ >> + u_vector.c \ >> + u_vector.h >> >> MESA_UTIL_GENERATED_FILES = \ >> format_srgb.c > > [snip] > >> diff --git a/src/util/u_vector.h b/src/util/u_vector.h >> new file mode 100644 >> index 000..ea52837 >> --- /dev/null >> +++ b/src/util/u_vector.h >> @@ -0,0 +1,85 @@ >> +/* >> + * Copyright © 2015 Intel Corporation >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining >> a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without >> limitation >> + * the rights to use, copy, modify, merge, publish, distribute, >> sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice (including the >> next >> + * paragraph) shall be included in all copies or substantial portions of >> the >> + * Software. 
>> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT >> SHALL >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR >> OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >> ARISING >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER >> DEALINGS >> + * IN THE SOFTWARE. >> + */ >> +#ifndef U_VECTOR_H >> +#define U_VECTOR_H >> + >> +#include >> +#include >> +#include "util/u_math.h" >> +#include "util/macros.h" >> + >> +static inline uint32_t >> +u_align_u32(uint32_t v, uint32_t a) >> +{ >> + assert(a != 0 && a == (a & -a)); >> + return (v + a - 1) & ~(a - 1); >> +} > > > This fits better in u_math.h > Yes I realise this, and I'll probably move it there separately, but I'd like to start bringing u_math.h into src/util instead of pulling it from gallium in the future. I'll add a todo beside this function for now. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
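As a small standalone illustration of the helper under discussion, the sketch below repeats the body of `u_align_u32()` from the quoted patch with only standard headers; the comments are added here and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a, where a must be a power of two. */
static inline uint32_t
u_align_u32(uint32_t v, uint32_t a)
{
   /* (a & -a) isolates the lowest set bit, so the comparison holds
    * only when a has exactly one bit set. */
   assert(a != 0 && a == (a & -a));
   /* Adding a - 1 carries into the next multiple unless v is already
    * aligned; the mask then clears the low bits. */
   return (v + a - 1) & ~(a - 1);
}
```

For example, aligning 13 to 8 gives 16, while 16 stays 16. Note that the addition wraps for values within a - 1 of UINT32_MAX, so callers that may see such sizes would need their own check.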
[Mesa-dev] [PATCH 2/4] configure.ac: enable GBM by default
From: Marek Olšák--- configure.ac | 19 +-- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/configure.ac b/configure.ac index bc9b732..3431a5d 100644 --- a/configure.ac +++ b/configure.ac @@ -948,23 +948,30 @@ AC_ARG_ENABLE([egl], [enable_egl="$enableval"], [enable_egl=yes]) AC_ARG_ENABLE([xa], [AS_HELP_STRING([--enable-xa], [enable build of the XA X Acceleration API @<:@default=disabled@:>@])], [enable_xa="$enableval"], [enable_xa=no]) AC_ARG_ENABLE([gbm], [AS_HELP_STRING([--enable-gbm], - [enable gbm library @<:@default=auto@:>@])], + [enable gbm library @<:@default=yes except cygwin@:>@])], [enable_gbm="$enableval"], - [enable_gbm=auto]) + [case "$host_os" in + cygwin*) + enable_gbm=no + ;; + *) + enable_gbm=yes + ;; +esac]) AC_ARG_ENABLE([nine], [AS_HELP_STRING([--enable-nine], [enable build of the nine Direct3D9 API @<:@default=no@:>@])], [enable_nine="$enableval"], [enable_nine=no]) AC_ARG_ENABLE([xvmc], [AS_HELP_STRING([--enable-xvmc], [enable xvmc library @<:@default=auto@:>@])], [enable_xvmc="$enableval"], @@ -1748,28 +1755,20 @@ if test "x$enable_osmesa" = xyes -o "x$enable_gallium_osmesa" = xyes; then OSMESA_PC_LIB_PRIV="-lm $PTHREAD_LIBS $SELINUX_LIBS $DLOPEN_LIBS" fi AC_SUBST([OSMESA_LIB_DEPS]) AC_SUBST([OSMESA_PC_REQ]) AC_SUBST([OSMESA_PC_LIB_PRIV]) dnl dnl gbm configuration dnl -if test "x$enable_gbm" = xauto; then -case "$with_egl_platforms" in -*drm*) -enable_gbm=yes ;; - *) -enable_gbm=no ;; -esac -fi if test "x$enable_gbm" = xyes; then if test "x$enable_dri" = xyes; then if test "x$enable_shared_glapi" = xno; then AC_MSG_ERROR([gbm_dri requires --enable-shared-glapi]) fi else # Strictly speaking libgbm does not require --enable-dri, although # both of its backends do. Thus one can build libgbm without any # backends if --disable-dri is set. # To avoid unnecessary complexity of checking if at least one backend -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/4] configure.ac: check for Glamor requirements only when needed
From: Marek Olšák--- configure.ac | 37 +++-- 1 file changed, 27 insertions(+), 10 deletions(-) diff --git a/configure.ac b/configure.ac index 12c8165..17dfafd 100644 --- a/configure.ac +++ b/configure.ac @@ -2296,35 +2296,52 @@ dnl Gallium helper functions dnl gallium_require_llvm() { if test "x$MESA_LLVM" = x0; then case "$host" in *gnux32) return;; esac case "$host_cpu" in i*86|x86_64|amd64) AC_MSG_ERROR([LLVM is required to build $1 on x86 and x86_64]);; esac fi } -dnl This is for Glamor. Skip this if OpenGL is disabled. -require_egl_drm() { +dnl If EGL/X11 or GLX is enabled, make sure they are usable. +check_glamor_requirements() { if test "x$enable_opengl" = xno; then return 0 fi +need_glamor=no + +if test "x$enable_glx" = xdri; then +need_glamor=yes +fi + case "$with_egl_platforms" in -*drm*) -;; - *) -AC_MSG_ERROR([--with-egl-platforms=drm is required to build the $1 driver.]) +*x11*) +need_glamor=yes ;; esac -if test "x$enable_gbm" != xyes; then -AC_MSG_ERROR([--enable-gbm is required to build the $1 driver.]) + +if test "x$need_glamor" = xyes; then +suffix="is required for X acceleration with the $1 driver." 
+ +if test "x$enable_gbm" != xyes; then +AC_MSG_ERROR([--enable-gbm $suffix]) +fi + +case "$with_egl_platforms" in +*drm*) +;; +*) +AC_MSG_ERROR([--with-egl-platforms=x11,drm $suffix]) +;; +esac fi } radeon_llvm_check() { if test ${LLVM_VERSION_INT} -lt 307; then amdgpu_llvm_target_name='r600' else amdgpu_llvm_target_name='amdgpu' fi llvm_check_version_for $2 $3 $4 $1 @@ -2427,21 +2444,21 @@ if test -n "$with_gallium_drivers"; then radeon_gallium_llvm_check "r600g" "3" "6" "0" LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser" fi ;; xradeonsi) HAVE_GALLIUM_RADEONSI=yes PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED]) PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED]) require_libdrm "radeonsi" radeon_gallium_llvm_check "radeonsi" "3" "6" "0" -require_egl_drm "radeonsi" +check_glamor_requirements "radeonsi" ;; xnouveau) HAVE_GALLIUM_NOUVEAU=yes PKG_CHECK_MODULES([NOUVEAU], [libdrm_nouveau >= $LIBDRM_NOUVEAU_REQUIRED]) require_libdrm "nouveau" ;; xfreedreno) HAVE_GALLIUM_FREEDRENO=yes PKG_CHECK_MODULES([FREEDRENO], [libdrm_freedreno >= $LIBDRM_FREEDRENO_REQUIRED]) require_libdrm "freedreno" @@ -2478,21 +2495,21 @@ if test -n "$with_gallium_drivers"; then require_libdrm "vc4" PKG_CHECK_MODULES([SIMPENROSE], [simpenrose], [USE_VC4_SIMULATOR=yes; DEFINES="$DEFINES -DUSE_VC4_SIMULATOR"], [USE_VC4_SIMULATOR=no]) ;; xvirgl) HAVE_GALLIUM_VIRGL=yes require_libdrm "virgl" -require_egl_drm "virgl" +check_glamor_requirements "virgl" ;; *) AC_MSG_ERROR([Unknown Gallium driver: $driver]) ;; esac done fi if test "x$HAVE_RADEON_VULKAN" = "xyes"; then radeon_llvm_check "radv" "3" "9" "0" -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/4] configure.ac: print whether GBM is enabled
From: Marek Olšák
---
 configure.ac | 5 +
 1 file changed, 5 insertions(+)

diff --git a/configure.ac b/configure.ac
index 8e779d4..bc9b732 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2860,20 +2860,25 @@ if test "$enable_egl" = yes; then
 egl_drivers=""
 if test "x$HAVE_EGL_DRIVER_DRI2" != "x"; then
 egl_drivers="$egl_drivers builtin:egl_dri2"
 fi
 if test "x$HAVE_EGL_DRIVER_DRI3" != "x"; then
 egl_drivers="$egl_drivers builtin:egl_dri3"
 fi

 echo "EGL drivers:$egl_drivers"
 fi
+if test "x$enable_gbm" = xyes; then
+echo "GBM: yes"
+else
+echo "GBM: no"
+fi

 # Vulkan
 echo ""
 if test "x$VULKAN_DRIVERS" != x; then
 echo "Vulkan drivers: $VULKAN_DRIVERS"
 echo "Vulkan ICD dir: $VULKAN_ICD_INSTALL_DIR"
 else
 echo "Vulkan drivers: no"
 fi

--
2.7.4
[Mesa-dev] [PATCH 3/4] configure.ac: enable EGL platform DRM if GBM is enabled
From: Marek Olšáksince GBM is enabled by default, this is also enabled by default the whitespace changes remove tabs --- configure.ac | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/configure.ac b/configure.ac index 3431a5d..12c8165 100644 --- a/configure.ac +++ b/configure.ac @@ -2010,23 +2010,27 @@ AC_SUBST([EGL_CLIENT_APIS]) dnl dnl EGL Platforms configuration dnl AC_ARG_WITH([egl-platforms], [AS_HELP_STRING([--with-egl-platforms@<:@=DIRS...@:>@], [comma delimited native platforms libEGL supports, e.g. "x11,drm" @<:@default=auto@:>@])], [with_egl_platforms="$withval"], [if test "x$enable_egl" = xyes; then - with_egl_platforms="x11" +if test "x$enable_gbm" = xyes; then + with_egl_platforms="x11,drm" +else + with_egl_platforms="x11" +fi else - with_egl_platforms="" +with_egl_platforms="" fi]) if test "x$with_egl_platforms" != "x" -a "x$enable_egl" != xyes; then AC_MSG_ERROR([cannot build egl state tracker without EGL library]) fi PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner], WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`, WAYLAND_SCANNER='') if test "x$WAYLAND_SCANNER" = x; then -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] intel: genxml: add SAMPLER_BORDER_COLOR_STATE structures
Thanks for the sandy bridge doc link. With all of the extra MBZ removed, this patch is Reviewed-by: Jason EkstrandOn Mon, Oct 17, 2016 at 11:39 AM, Lionel Landwerlin wrote: > On Mon, 2016-10-17 at 10:56 -0700, Jason Ekstrand wrote: > > > > > > On Mon, Oct 17, 2016 at 8:46 AM, Lionel Landwerlin > .com> wrote: > > > Signed-off-by: Lionel Landwerlin > > > --- > > > src/intel/genxml/gen6.xml | 32 > > > src/intel/genxml/gen7.xml | 12 > > > src/intel/genxml/gen75.xml | 40 > > > > > > src/intel/genxml/gen8.xml | 12 > > > src/intel/genxml/gen9.xml | 12 > > > 5 files changed, 108 insertions(+) > > > > > > diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml > > > index 211716b..7ba8954 100644 > > > --- a/src/intel/genxml/gen6.xml > > > +++ b/src/intel/genxml/gen6.xml > > > @@ -372,6 +372,38 @@ > > > > > > > > > > > > + > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > > + > > type="int"/> > > > + > > type="int"/> > > > + > > type="int"/> > > > + > > type="int"/> > > > + > > > + > > type="int"/> > > > + > > type="int"/> > > > + > > type="int"/> > > > + > > type="int"/> > > > + > > > > Are there docs for this anywhere or did you just pull it out of the > > gen6 GL code? > > > > Yes, there are but indeed not in the PRMs. 
> > > > > + > > > > > > > > type="bool"/> > > > > > type="uint"> > > > diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml > > > index eabb244..a950603 100644 > > > --- a/src/intel/genxml/gen7.xml > > > +++ b/src/intel/genxml/gen7.xml > > > @@ -428,6 +428,18 @@ > > > > > type="u4.8"/> > > > > > > > > > + > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > > + > > > > > > > > type="bool"/> > > > > > type="uint"> > > > diff --git a/src/intel/genxml/gen75.xml > > > b/src/intel/genxml/gen75.xml > > > index 27a12cb..42f66cb 100644 > > > --- a/src/intel/genxml/gen75.xml > > > +++ b/src/intel/genxml/gen75.xml > > > @@ -438,6 +438,46 @@ > > > > > type="u4.8"/> > > > > > > > > > + > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > > In the rest of the XML, MBZ fields simply don't exist. The packing > > functions will automatically zero anything that doesn't have data in > > it. I'm not sure if that's true for whole dwords but if it's not, we > > should fix that. In other words, I believe the correct solution is > > to just delete these and let "Border Color 8bit Red" start super-late > > in the packet. 
> > > > > + > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > end="575" type="uint"/> > > > + > > end="607" type="uint"/> > > > + > > end="639" type="uint"/> > > > > These can go as well > > > > > + > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > end="575" type="uint"/> > > > > and this > > > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > end="639" type="uint"/> > > > > and this > > > > > + > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > type="uint"/> > > > + > > > + > > > > > > > > type="bool"/> > > > > > type="uint"> > > > diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml > > > index ee62614..a281f01 100644 > > > --- a/src/intel/genxml/gen8.xml > > > +++ b/src/intel/genxml/gen8.xml > > > @@ -358,6 +358,18 @@ > > > > > type="s1.6"/> > > > > > > > > > + > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > type="float"/> > > > + > > > + > >
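The point about MBZ fields can be illustrated with a rough sketch in C. This is not the code that the genxml generator emits: `pack_uint`, `struct example_state` and `example_state_pack` are hypothetical, and the layout is invented for illustration. The only idea it shows is that if the packet is zeroed first and only named fields are written, then must-be-zero ranges need no entry at all, even when they cover whole dwords.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: OR `value` into bits [start, end] of a dword
 * array, with bits numbered across the whole packet as in the start/end
 * attributes of the XML. Assumes a field never straddles a dword. */
static void
pack_uint(uint32_t *dw, unsigned start, unsigned end, uint32_t value)
{
   unsigned width = end - start + 1;
   assert(start / 32 == end / 32);
   assert(width == 32 || value < (1u << width));
   dw[start / 32] |= value << (start % 32);
}

/* Hypothetical 4-dword state: two 8-bit fields in dword 0, dwords 1-2
 * must be zero, one 32-bit field in dword 3. */
struct example_state {
   uint32_t red8;
   uint32_t green8;
   uint32_t late_field;
};

static void
example_state_pack(uint32_t dw[4], const struct example_state *s)
{
   /* Zero everything first, so MBZ bits need no explicit field. */
   memset(dw, 0, 4 * sizeof(dw[0]));
   pack_uint(dw, 0, 7, s->red8);
   pack_uint(dw, 8, 15, s->green8);
   /* Dwords 1 and 2 are MBZ: nothing is written, they stay zero. */
   pack_uint(dw, 96, 127, s->late_field);
}
```

Whether the real generated packing functions behave this way for entirely empty dwords is exactly the open question raised in the review, so treat the memset here as an assumption rather than a description of the generator.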
Re: [Mesa-dev] [PATCH 10/25] mesa/nir/radv/anv: add shader_info param to nir_shader builder
On Tue, Oct 18, 2016 at 2:18 PM, Jason Ekstrandwrote: > > > On Tue, Oct 18, 2016 at 2:06 PM, Timothy Arceri < > timothy.arc...@collabora.com> wrote: > >> On Tue, 2016-10-18 at 08:47 -0700, Jason Ekstrand wrote: >> > On Mon, Oct 17, 2016 at 11:12 PM, Timothy Arceri > > abora.com> wrote: >> > > And pass in a pointer to the shader info in gl_program for ARB >> > > programs. >> > > --- >> > > src/amd/vulkan/radv_meta_blit.c | 12 >> > > src/amd/vulkan/radv_meta_blit2d.c | 12 >> > > src/amd/vulkan/radv_meta_buffer.c | 6 -- >> > > src/amd/vulkan/radv_meta_bufimage.c | 3 ++- >> > > src/amd/vulkan/radv_meta_clear.c | 12 >> > > src/amd/vulkan/radv_meta_decompress.c | 6 -- >> > > src/amd/vulkan/radv_meta_fast_clear.c | 6 -- >> > > src/amd/vulkan/radv_meta_resolve.c| 6 -- >> > > src/amd/vulkan/radv_meta_resolve_cs.c | 2 +- >> > > src/amd/vulkan/radv_pipeline.c| 2 +- >> > > src/compiler/nir/nir_builder.h| 5 +++-- >> > > src/compiler/nir/tests/control_flow_tests.cpp | 3 ++- >> > > src/gallium/auxiliary/nir/tgsi_to_nir.c | 2 +- >> > > src/intel/blorp/blorp_blit.c | 2 +- >> > > src/intel/blorp/blorp_clear.c | 2 +- >> > > src/mesa/drivers/dri/i965/brw_program.c | 2 +- >> > > src/mesa/drivers/dri/i965/brw_program.h | 2 +- >> > > src/mesa/drivers/dri/i965/brw_tcs.c | 3 ++- >> > > src/mesa/program/prog_to_nir.c| 5 +++-- >> > > src/mesa/program/prog_to_nir.h| 2 +- >> > > 20 files changed, 60 insertions(+), 35 deletions(-) >> > > >> > > diff --git a/src/amd/vulkan/radv_meta_blit.c >> > > b/src/amd/vulkan/radv_meta_blit.c >> > > index bfbf880..3eda43e 100644 >> > > --- a/src/amd/vulkan/radv_meta_blit.c >> > > +++ b/src/amd/vulkan/radv_meta_blit.c >> > > @@ -37,7 +37,8 @@ build_nir_vertex_shader(void) >> > > const struct glsl_type *vec4 = glsl_vec4_type(); >> > > nir_builder b; >> > > >> > > - nir_builder_init_simple_shader(, NULL, >> > > MESA_SHADER_VERTEX, NULL); >> > > + nir_builder_init_simple_shader(, NULL, >> > > MESA_SHADER_VERTEX, NULL, >> > > + NULL); >> > > b.shader->info->name 
= ralloc_strdup(b.shader, >> > > "meta_blit_vs"); >> > > >> > > nir_variable *pos_in = nir_variable_create(b.shader, >> > > nir_var_shader_in, >> > > @@ -67,7 +68,8 @@ build_nir_copy_fragment_shader(enum >> > > glsl_sampler_dim tex_dim) >> > > const struct glsl_type *vec4 = glsl_vec4_type(); >> > > nir_builder b; >> > > >> > > - nir_builder_init_simple_shader(, NULL, >> > > MESA_SHADER_FRAGMENT, NULL); >> > > + nir_builder_init_simple_shader(, NULL, >> > > MESA_SHADER_FRAGMENT, NULL, >> > > + NULL); >> > > >> > > sprintf(shader_name, "meta_blit_fs.%d", tex_dim); >> > > b.shader->info->name = ralloc_strdup(b.shader, >> > > shader_name); >> > > @@ -121,7 +123,8 @@ build_nir_copy_fragment_shader_depth(enum >> > > glsl_sampler_dim tex_dim) >> > > const struct glsl_type *vec4 = glsl_vec4_type(); >> > > nir_builder b; >> > > >> > > - nir_builder_init_simple_shader(, NULL, >> > > MESA_SHADER_FRAGMENT, NULL); >> > > + nir_builder_init_simple_shader(, NULL, >> > > MESA_SHADER_FRAGMENT, NULL, >> > > + NULL); >> > > >> > > sprintf(shader_name, "meta_blit_depth_fs.%d", tex_dim); >> > > b.shader->info->name = ralloc_strdup(b.shader, >> > > shader_name); >> > > @@ -175,7 +178,8 @@ build_nir_copy_fragment_shader_stencil(enum >> > > glsl_sampler_dim tex_dim) >> > > const struct glsl_type *vec4 = glsl_vec4_type(); >> > > nir_builder b; >> > > >> > > - nir_builder_init_simple_shader(, NULL, >> > > MESA_SHADER_FRAGMENT, NULL); >> > > + nir_builder_init_simple_shader(, NULL, >> > > MESA_SHADER_FRAGMENT, NULL, >> > > + NULL); >> > > >> > > sprintf(shader_name, "meta_blit_stencil_fs.%d", tex_dim); >> > > b.shader->info->name = ralloc_strdup(b.shader, >> > > shader_name); >> > > diff --git a/src/amd/vulkan/radv_meta_blit2d.c >> > > b/src/amd/vulkan/radv_meta_blit2d.c >> > > index 6e92f80..bed03a3 100644 >> > > --- a/src/amd/vulkan/radv_meta_blit2d.c >> > > +++ b/src/amd/vulkan/radv_meta_blit2d.c >> > > @@ -438,7 +438,8 @@ build_nir_vertex_shader(void) >> > > const struct glsl_type *vec2 
= >> > > glsl_vector_type(GLSL_TYPE_FLOAT, 2); >> > > nir_builder b; >> > > >> > > - nir_builder_init_simple_shader(, NULL, >> > > MESA_SHADER_VERTEX, NULL); >> > > + nir_builder_init_simple_shader(, NULL, >> > > MESA_SHADER_VERTEX, NULL, >> > > +
Re: [Mesa-dev] [PATCH 10/25] mesa/nir/radv/anv: add shader_info param to nir_shader builder
On Tue, Oct 18, 2016 at 2:06 PM, Timothy Arceri < timothy.arc...@collabora.com> wrote: > On Tue, 2016-10-18 at 08:47 -0700, Jason Ekstrand wrote: > > On Mon, Oct 17, 2016 at 11:12 PM, Timothy Arceri> abora.com> wrote: > > > And pass in a pointer to the shader info in gl_program for ARB > > > programs. > > > --- > > > src/amd/vulkan/radv_meta_blit.c | 12 > > > src/amd/vulkan/radv_meta_blit2d.c | 12 > > > src/amd/vulkan/radv_meta_buffer.c | 6 -- > > > src/amd/vulkan/radv_meta_bufimage.c | 3 ++- > > > src/amd/vulkan/radv_meta_clear.c | 12 > > > src/amd/vulkan/radv_meta_decompress.c | 6 -- > > > src/amd/vulkan/radv_meta_fast_clear.c | 6 -- > > > src/amd/vulkan/radv_meta_resolve.c| 6 -- > > > src/amd/vulkan/radv_meta_resolve_cs.c | 2 +- > > > src/amd/vulkan/radv_pipeline.c| 2 +- > > > src/compiler/nir/nir_builder.h| 5 +++-- > > > src/compiler/nir/tests/control_flow_tests.cpp | 3 ++- > > > src/gallium/auxiliary/nir/tgsi_to_nir.c | 2 +- > > > src/intel/blorp/blorp_blit.c | 2 +- > > > src/intel/blorp/blorp_clear.c | 2 +- > > > src/mesa/drivers/dri/i965/brw_program.c | 2 +- > > > src/mesa/drivers/dri/i965/brw_program.h | 2 +- > > > src/mesa/drivers/dri/i965/brw_tcs.c | 3 ++- > > > src/mesa/program/prog_to_nir.c| 5 +++-- > > > src/mesa/program/prog_to_nir.h| 2 +- > > > 20 files changed, 60 insertions(+), 35 deletions(-) > > > > > > diff --git a/src/amd/vulkan/radv_meta_blit.c > > > b/src/amd/vulkan/radv_meta_blit.c > > > index bfbf880..3eda43e 100644 > > > --- a/src/amd/vulkan/radv_meta_blit.c > > > +++ b/src/amd/vulkan/radv_meta_blit.c > > > @@ -37,7 +37,8 @@ build_nir_vertex_shader(void) > > > const struct glsl_type *vec4 = glsl_vec4_type(); > > > nir_builder b; > > > > > > - nir_builder_init_simple_shader(, NULL, > > > MESA_SHADER_VERTEX, NULL); > > > + nir_builder_init_simple_shader(, NULL, > > > MESA_SHADER_VERTEX, NULL, > > > + NULL); > > > b.shader->info->name = ralloc_strdup(b.shader, > > > "meta_blit_vs"); > > > > > > nir_variable *pos_in = 
nir_variable_create(b.shader, > > > nir_var_shader_in, > > > @@ -67,7 +68,8 @@ build_nir_copy_fragment_shader(enum > > > glsl_sampler_dim tex_dim) > > > const struct glsl_type *vec4 = glsl_vec4_type(); > > > nir_builder b; > > > > > > - nir_builder_init_simple_shader(, NULL, > > > MESA_SHADER_FRAGMENT, NULL); > > > + nir_builder_init_simple_shader(, NULL, > > > MESA_SHADER_FRAGMENT, NULL, > > > + NULL); > > > > > > sprintf(shader_name, "meta_blit_fs.%d", tex_dim); > > > b.shader->info->name = ralloc_strdup(b.shader, > > > shader_name); > > > @@ -121,7 +123,8 @@ build_nir_copy_fragment_shader_depth(enum > > > glsl_sampler_dim tex_dim) > > > const struct glsl_type *vec4 = glsl_vec4_type(); > > > nir_builder b; > > > > > > - nir_builder_init_simple_shader(, NULL, > > > MESA_SHADER_FRAGMENT, NULL); > > > + nir_builder_init_simple_shader(, NULL, > > > MESA_SHADER_FRAGMENT, NULL, > > > + NULL); > > > > > > sprintf(shader_name, "meta_blit_depth_fs.%d", tex_dim); > > > b.shader->info->name = ralloc_strdup(b.shader, > > > shader_name); > > > @@ -175,7 +178,8 @@ build_nir_copy_fragment_shader_stencil(enum > > > glsl_sampler_dim tex_dim) > > > const struct glsl_type *vec4 = glsl_vec4_type(); > > > nir_builder b; > > > > > > - nir_builder_init_simple_shader(, NULL, > > > MESA_SHADER_FRAGMENT, NULL); > > > + nir_builder_init_simple_shader(, NULL, > > > MESA_SHADER_FRAGMENT, NULL, > > > + NULL); > > > > > > sprintf(shader_name, "meta_blit_stencil_fs.%d", tex_dim); > > > b.shader->info->name = ralloc_strdup(b.shader, > > > shader_name); > > > diff --git a/src/amd/vulkan/radv_meta_blit2d.c > > > b/src/amd/vulkan/radv_meta_blit2d.c > > > index 6e92f80..bed03a3 100644 > > > --- a/src/amd/vulkan/radv_meta_blit2d.c > > > +++ b/src/amd/vulkan/radv_meta_blit2d.c > > > @@ -438,7 +438,8 @@ build_nir_vertex_shader(void) > > > const struct glsl_type *vec2 = > > > glsl_vector_type(GLSL_TYPE_FLOAT, 2); > > > nir_builder b; > > > > > > - nir_builder_init_simple_shader(, NULL, > > > 
MESA_SHADER_VERTEX, NULL); > > > + nir_builder_init_simple_shader(, NULL, > > > MESA_SHADER_VERTEX, NULL, > > > + NULL); > > > b.shader->info->name = ralloc_strdup(b.shader, > > > "meta_blit_vs"); > > > > > > nir_variable *pos_in = nir_variable_create(b.shader, > > > nir_var_shader_in, > >
[Mesa-dev] [PATCH 1/2] st/nine: Fix leak with integer and boolean constants
Leak introduced by: a83dce01284f220b1bf932774730e13fca6cdd20 The patch also moves the part to release changed.vs_const_i and changed.vs_const_b before the if (!cb.buffer_size) check, to avoid reuploading every draw call if integer or boolean constants are dirty, but the shaders use no constants. Signed-off-by: Axel Davy--- src/gallium/state_trackers/nine/nine_state.c | 39 +--- 1 file changed, 18 insertions(+), 21 deletions(-) diff --git a/src/gallium/state_trackers/nine/nine_state.c b/src/gallium/state_trackers/nine/nine_state.c index f6bf51e..ea72c77 100644 --- a/src/gallium/state_trackers/nine/nine_state.c +++ b/src/gallium/state_trackers/nine/nine_state.c @@ -126,7 +126,6 @@ prepare_vs_constants_userbuf_swvp(struct NineDevice9 *device) cb.user_buffer = state->vs_const_i; state->pipe.cb2_swvp = cb; -state->changed.vs_const_i = 0; } if (state->changed.vs_const_b || state->changed.group & NINE_STATE_SWVP) { @@ -138,7 +137,6 @@ prepare_vs_constants_userbuf_swvp(struct NineDevice9 *device) cb.user_buffer = state->vs_const_b; state->pipe.cb3_swvp = cb; -state->changed.vs_const_b = 0; } if (!device->driver_caps.user_cbufs) { @@ -236,14 +234,30 @@ prepare_vs_constants_userbuf(struct NineDevice9 *device) if (state->changed.vs_const_i || state->changed.group & NINE_STATE_SWVP) { int *idst = (int *)>vs_const_f[4 * device->max_vs_const_f]; memcpy(idst, state->vs_const_i, NINE_MAX_CONST_I * sizeof(int[4])); -state->changed.vs_const_i = 0; } if (state->changed.vs_const_b || state->changed.group & NINE_STATE_SWVP) { int *idst = (int *)>vs_const_f[4 * device->max_vs_const_f]; uint32_t *bdst = (uint32_t *)[4 * NINE_MAX_CONST_I]; memcpy(bdst, state->vs_const_b, NINE_MAX_CONST_B * sizeof(BOOL)); -state->changed.vs_const_b = 0; +} + +if (device->state.changed.vs_const_i) { +struct nine_range *r = device->state.changed.vs_const_i; +struct nine_range *p = r; +while (p->next) +p = p->next; +nine_range_pool_put_chain(>range_pool, r, p); +device->state.changed.vs_const_i = NULL; +} + 
+if (device->state.changed.vs_const_b) { +struct nine_range *r = device->state.changed.vs_const_b; +struct nine_range *p = r; +while (p->next) +p = p->next; +nine_range_pool_put_chain(>range_pool, r, p); +device->state.changed.vs_const_b = NULL; } if (!cb.buffer_size) @@ -290,23 +304,6 @@ prepare_vs_constants_userbuf(struct NineDevice9 *device) device->state.changed.vs_const_f = NULL; } -if (device->state.changed.vs_const_i) { -struct nine_range *r = device->state.changed.vs_const_i; -struct nine_range *p = r; -while (p->next) -p = p->next; -nine_range_pool_put_chain(>range_pool, r, p); -device->state.changed.vs_const_i = NULL; -} - -if (device->state.changed.vs_const_b) { -struct nine_range *r = device->state.changed.vs_const_b; -struct nine_range *p = r; -while (p->next) -p = p->next; -nine_range_pool_put_chain(>range_pool, r, p); -device->state.changed.vs_const_b = NULL; -} state->changed.group &= ~NINE_STATE_VS_CONST; state->commit |= NINE_STATE_COMMIT_CONST_VS; } -- 2.10.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] st/nine: Fix leak at device dtor
The datastructures to track dirty constants weren't freed. Signed-off-by: Axel Davy--- src/gallium/state_trackers/nine/device9.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c index c0a3c39..d7f3a40 100644 --- a/src/gallium/state_trackers/nine/device9.c +++ b/src/gallium/state_trackers/nine/device9.c @@ -481,6 +481,8 @@ void NineDevice9_dtor( struct NineDevice9 *This ) { unsigned i; +struct nine_range *r; +struct nine_range_pool *pool = >base.device->range_pool; DBG("This=%p\n", This); @@ -514,6 +516,23 @@ NineDevice9_dtor( struct NineDevice9 *This ) FREE(This->state.vs_const_b); FREE(This->state.vs_const_f_swvp); +if (This->state.changed.ps_const_f) { +for (r = This->state.changed.ps_const_f; r->next; r = r->next); +nine_range_pool_put_chain(pool, This->state.changed.ps_const_f, r); +} +if (This->state.changed.vs_const_f) { +for (r = This->state.changed.vs_const_f; r->next; r = r->next); +nine_range_pool_put_chain(pool, This->state.changed.vs_const_f, r); +} +if (This->state.changed.vs_const_i) { +for (r = This->state.changed.vs_const_i; r->next; r = r->next); +nine_range_pool_put_chain(pool, This->state.changed.vs_const_i, r); +} +if (This->state.changed.vs_const_b) { +for (r = This->state.changed.vs_const_b; r->next; r = r->next); +nine_range_pool_put_chain(pool, This->state.changed.vs_const_b, r); +} + if (This->swapchains) { for (i = 0; i < This->nswapchains; ++i) if (This->swapchains[i]) -- 2.10.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
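For readers following the two st/nine leak fixes: both patches use the same idiom of walking a dirty-range list to its tail and handing the whole chain back to the pool in one call, then clearing the owner's pointer. Below is a minimal sketch of that idiom. The struct range, struct range_pool, pool_put_chain and release_dirty_ranges names are hypothetical stand-ins, not the real nine_range API, and the free-list splice is only an assumption about how a pool of this kind might work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for nine_range / nine_range_pool; the real
 * structures in st/nine carry more state than this. */
struct range {
    struct range *next;
    int bgn, end;
};

struct range_pool {
    struct range *free_list;
};

/* Return the chain [head .. tail] to the pool's free list in one splice. */
static void pool_put_chain(struct range_pool *pool,
                           struct range *head, struct range *tail)
{
    tail->next = pool->free_list;
    pool->free_list = head;
}

/* Release a whole dirty-range list and clear the owner's pointer.
 * Does nothing when the list is already empty. */
static void release_dirty_ranges(struct range_pool *pool, struct range **list)
{
    struct range *head = *list;
    struct range *tail = head;

    if (!head)
        return;
    while (tail->next)
        tail = tail->next;
    pool_put_chain(pool, head, tail);
    *list = NULL;
}

/* Count the nodes in a list; only used to check the example. */
static size_t chain_length(const struct range *r)
{
    size_t n = 0;
    for (; r; r = r->next)
        n++;
    return n;
}

/* Build a three-node dirty list, release it, and report how many nodes
 * ended up back in the pool. */
static size_t demo_release(void)
{
    static struct range nodes[3];
    struct range_pool pool = { NULL };
    struct range *dirty = &nodes[0];

    nodes[0].next = &nodes[1];
    nodes[1].next = &nodes[2];
    nodes[2].next = NULL;

    release_dirty_ranges(&pool, &dirty);
    assert(dirty == NULL);
    return chain_length(pool.free_list);
}
```

Factoring the walk into one helper keeps the draw-time path and the destructor path from drifting apart, which appears to be how the leak crept in.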
Re: [Mesa-dev] [PATCH 10/25] mesa/nir/radv/anv: add shader_info param to nir_shader builder
On Tue, 2016-10-18 at 08:47 -0700, Jason Ekstrand wrote: > On Mon, Oct 17, 2016 at 11:12 PM, Timothy Arceriabora.com> wrote: > > And pass in a pointer to the shader info in gl_program for ARB > > programs. > > --- > > src/amd/vulkan/radv_meta_blit.c | 12 > > src/amd/vulkan/radv_meta_blit2d.c | 12 > > src/amd/vulkan/radv_meta_buffer.c | 6 -- > > src/amd/vulkan/radv_meta_bufimage.c | 3 ++- > > src/amd/vulkan/radv_meta_clear.c | 12 > > src/amd/vulkan/radv_meta_decompress.c | 6 -- > > src/amd/vulkan/radv_meta_fast_clear.c | 6 -- > > src/amd/vulkan/radv_meta_resolve.c | 6 -- > > src/amd/vulkan/radv_meta_resolve_cs.c | 2 +- > > src/amd/vulkan/radv_pipeline.c | 2 +- > > src/compiler/nir/nir_builder.h | 5 +++-- > > src/compiler/nir/tests/control_flow_tests.cpp | 3 ++- > > src/gallium/auxiliary/nir/tgsi_to_nir.c | 2 +- > > src/intel/blorp/blorp_blit.c | 2 +- > > src/intel/blorp/blorp_clear.c | 2 +- > > src/mesa/drivers/dri/i965/brw_program.c | 2 +- > > src/mesa/drivers/dri/i965/brw_program.h | 2 +- > > src/mesa/drivers/dri/i965/brw_tcs.c | 3 ++- > > src/mesa/program/prog_to_nir.c | 5 +++-- > > src/mesa/program/prog_to_nir.h | 2 +- > > 20 files changed, 60 insertions(+), 35 deletions(-) > > > > diff --git a/src/amd/vulkan/radv_meta_blit.c > > b/src/amd/vulkan/radv_meta_blit.c > > index bfbf880..3eda43e 100644 > > --- a/src/amd/vulkan/radv_meta_blit.c > > +++ b/src/amd/vulkan/radv_meta_blit.c > > @@ -37,7 +37,8 @@ build_nir_vertex_shader(void) > > const struct glsl_type *vec4 = glsl_vec4_type(); > > nir_builder b; > > > > - nir_builder_init_simple_shader(, NULL, > > MESA_SHADER_VERTEX, NULL); > > + nir_builder_init_simple_shader(, NULL, > > MESA_SHADER_VERTEX, NULL, > > + NULL); > > b.shader->info->name = ralloc_strdup(b.shader, > > "meta_blit_vs"); > > > > nir_variable *pos_in = nir_variable_create(b.shader, > > nir_var_shader_in, > > @@ -67,7 +68,8 @@ build_nir_copy_fragment_shader(enum > > glsl_sampler_dim tex_dim) > > const struct glsl_type *vec4 = glsl_vec4_type(); > > 
nir_builder b; > > > > - nir_builder_init_simple_shader(, NULL, > > MESA_SHADER_FRAGMENT, NULL); > > + nir_builder_init_simple_shader(, NULL, > > MESA_SHADER_FRAGMENT, NULL, > > + NULL); > > > > sprintf(shader_name, "meta_blit_fs.%d", tex_dim); > > b.shader->info->name = ralloc_strdup(b.shader, > > shader_name); > > @@ -121,7 +123,8 @@ build_nir_copy_fragment_shader_depth(enum > > glsl_sampler_dim tex_dim) > > const struct glsl_type *vec4 = glsl_vec4_type(); > > nir_builder b; > > > > - nir_builder_init_simple_shader(, NULL, > > MESA_SHADER_FRAGMENT, NULL); > > + nir_builder_init_simple_shader(, NULL, > > MESA_SHADER_FRAGMENT, NULL, > > + NULL); > > > > sprintf(shader_name, "meta_blit_depth_fs.%d", tex_dim); > > b.shader->info->name = ralloc_strdup(b.shader, > > shader_name); > > @@ -175,7 +178,8 @@ build_nir_copy_fragment_shader_stencil(enum > > glsl_sampler_dim tex_dim) > > const struct glsl_type *vec4 = glsl_vec4_type(); > > nir_builder b; > > > > - nir_builder_init_simple_shader(, NULL, > > MESA_SHADER_FRAGMENT, NULL); > > + nir_builder_init_simple_shader(, NULL, > > MESA_SHADER_FRAGMENT, NULL, > > + NULL); > > > > sprintf(shader_name, "meta_blit_stencil_fs.%d", tex_dim); > > b.shader->info->name = ralloc_strdup(b.shader, > > shader_name); > > diff --git a/src/amd/vulkan/radv_meta_blit2d.c > > b/src/amd/vulkan/radv_meta_blit2d.c > > index 6e92f80..bed03a3 100644 > > --- a/src/amd/vulkan/radv_meta_blit2d.c > > +++ b/src/amd/vulkan/radv_meta_blit2d.c > > @@ -438,7 +438,8 @@ build_nir_vertex_shader(void) > > const struct glsl_type *vec2 = > > glsl_vector_type(GLSL_TYPE_FLOAT, 2); > > nir_builder b; > > > > - nir_builder_init_simple_shader(, NULL, > > MESA_SHADER_VERTEX, NULL); > > + nir_builder_init_simple_shader(, NULL, > > MESA_SHADER_VERTEX, NULL, > > + NULL); > > b.shader->info->name = ralloc_strdup(b.shader, > > "meta_blit_vs"); > > > > nir_variable *pos_in = nir_variable_create(b.shader, > > nir_var_shader_in, > > @@ -573,7 +574,8 @@ 
build_nir_copy_fragment_shader(struct > > radv_device *device, > > const struct glsl_type *vec2 = > > glsl_vector_type(GLSL_TYPE_FLOAT, 2); > > nir_builder b; > > > > - nir_builder_init_simple_shader(, NULL, > > MESA_SHADER_FRAGMENT, NULL); > > +
Re: [Mesa-dev] [PATCH 2/2] i965: Reorder PCI ID list to match release order
Quoting Ben Widawsky (2016-10-18 13:50:08) > I have some OCD... > > Signed-off-by: Ben Widawsky> --- > include/pci_ids/i965_pci_ids.h | 18 +- > 1 file changed, 9 insertions(+), 9 deletions(-) > > diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h > index a93228d..e482007 100644 > --- a/include/pci_ids/i965_pci_ids.h > +++ b/include/pci_ids/i965_pci_ids.h > @@ -109,6 +109,10 @@ CHIPSET(0x162A, bdw_gt3, "Intel(R) Iris Pro P6300 > (Broadwell GT3e)") > CHIPSET(0x162B, bdw_gt3, "Intel(R) Iris 6100 (Broadwell GT3)") > CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell GT3") > CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3") > +CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)") > +CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* > Overridden in brw_get_renderer_string */ > +CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)") > +CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)") > CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)") > CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)") > CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1") > @@ -134,6 +138,11 @@ CHIPSET(0x1932, skl_gt4, "Intel(R) Iris Pro Graphics 580 > (Skylake GT4e)") > CHIPSET(0x193A, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)") > CHIPSET(0x193B, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)") > CHIPSET(0x193D, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)") > +CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)") > +CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)") > +CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)") > +CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)") > +CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)") > CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1") > CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1") > CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1") > @@ -154,12 +163,3 @@ CHIPSET(0x5923, kbl_gt3, "Intel(R) Kabylake 
GT3") > CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3") > CHIPSET(0x5927, kbl_gt3, "Intel(R) Kabylake GT3") > CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4") > -CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)") > -CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* > Overridden in brw_get_renderer_string */ > -CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)") > -CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)") > -CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)") > -CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)") > -CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)") > -CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)") > -CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)") > -- > 2.10.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev I approve of your OCD, Reviewed-by: Dylan Baker signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/6] radeonsi: rename prefixes from radeon to si
On Tue, Oct 18, 2016 at 11:54 AM, Nicolai Hähnle wrote:
> Makes sense as a cleanup. At some point it would make sense to look into
> sharing some stuff with radv instead. There's probably not a huge amount
> because of the NIR/TGSI split, but still.

I'm always for code sharing, but I don't know whether radv will ever be considered important outside of the radv camp. If somebody else wants to share code with radv, I'll gladly accept patches.

Marek
Re: [Mesa-dev] [PATCH 21/22] wsi: swap srgb/unorm around.
On 19 October 2016 at 06:33, Jason Ekstrand wrote:
> NAKish... I specifically put them in that order to *cause* Talos to break.
> If we're going to support both UNORM and sRGB, then applications need to
> look at the formats they're getting and pick one intelligently rather than
> just using the first thing they find (which Talos does), especially if that
> app does its own gamma curves. Apps that just grab the first thing they
> find probably "want" sRGB encoding done for them. I've talked to the guys
> at Croteam and there is a Talos update in the pipe that fixes this.

You might also need to talk to Sascha Willems.

Dave.

>
> On Sun, Oct 16, 2016 at 9:24 PM, Dave Airlie wrote:
>>
>> From: Dave Airlie
>>
>> This prevents a Talos regression before radv
>> starts using shared WSI.
>> ---
>>  src/vulkan/wsi/wsi_common_x11.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/vulkan/wsi/wsi_common_x11.c
>> b/src/vulkan/wsi/wsi_common_x11.c
>> index b5832c6..241ef42 100644
>> --- a/src/vulkan/wsi/wsi_common_x11.c
>> +++ b/src/vulkan/wsi/wsi_common_x11.c
>> @@ -135,8 +135,8 @@ wsi_x11_get_connection(struct wsi_device *wsi_dev,
>>  }
>>
>>  static const VkSurfaceFormatKHR formats[] = {
>> -   { .format = VK_FORMAT_B8G8R8A8_SRGB, },
>>     { .format = VK_FORMAT_B8G8R8A8_UNORM, },
>> +   { .format = VK_FORMAT_B8G8R8A8_SRGB, },
>>  };
>>
>>  static const VkPresentModeKHR present_modes[] = {
>> --
>> 2.5.5
[Mesa-dev] [PATCH 1/2] i965: Add some APL and KBL SKU strings
We got a couple for products that exist on ark.intel.com, so let's just put them in now. Signed-off-by: Ben Widawsky--- include/pci_ids/i965_pci_ids.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h index 1566afd..a93228d 100644 --- a/include/pci_ids/i965_pci_ids.h +++ b/include/pci_ids/i965_pci_ids.h @@ -144,11 +144,11 @@ CHIPSET(0x5913, kbl_gt1_5, "Intel(R) Kabylake GT1.5") CHIPSET(0x5915, kbl_gt1_5, "Intel(R) Kabylake GT1.5") CHIPSET(0x5917, kbl_gt1_5, "Intel(R) Kabylake GT1.5") CHIPSET(0x5912, kbl_gt2, "Intel(R) Kabylake GT2") -CHIPSET(0x5916, kbl_gt2, "Intel(R) Kabylake GT2") +CHIPSET(0x5916, kbl_gt2, "Intel(R) HD Graphics 620 (Intel(R) Kabylake GT2)") CHIPSET(0x591A, kbl_gt2, "Intel(R) Kabylake GT2") CHIPSET(0x591B, kbl_gt2, "Intel(R) Kabylake GT2") CHIPSET(0x591D, kbl_gt2, "Intel(R) Kabylake GT2") -CHIPSET(0x591E, kbl_gt2, "Intel(R) Kabylake GT2") +CHIPSET(0x591E, kbl_gt2, "Intel(R) HD Graphics 615 (Kabylake GT2)") CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F") CHIPSET(0x5923, kbl_gt3, "Intel(R) Kabylake GT3") CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3") @@ -161,5 +161,5 @@ CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)") CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)") CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)") CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)") -CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics (Broxton)") -CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)") +CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)") +CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)") -- 2.10.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] i965: Reorder PCI ID list to match release order
I have some OCD... Signed-off-by: Ben Widawsky--- include/pci_ids/i965_pci_ids.h | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h index a93228d..e482007 100644 --- a/include/pci_ids/i965_pci_ids.h +++ b/include/pci_ids/i965_pci_ids.h @@ -109,6 +109,10 @@ CHIPSET(0x162A, bdw_gt3, "Intel(R) Iris Pro P6300 (Broadwell GT3e)") CHIPSET(0x162B, bdw_gt3, "Intel(R) Iris 6100 (Broadwell GT3)") CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell GT3") CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3") +CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)") +CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */ +CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)") +CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)") CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)") CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)") CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1") @@ -134,6 +138,11 @@ CHIPSET(0x1932, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)") CHIPSET(0x193A, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)") CHIPSET(0x193B, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)") CHIPSET(0x193D, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)") +CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)") +CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)") +CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)") +CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)") +CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)") CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1") CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1") CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1") @@ -154,12 +163,3 @@ CHIPSET(0x5923, kbl_gt3, "Intel(R) Kabylake GT3") CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3") CHIPSET(0x5927, kbl_gt3, "Intel(R) Kabylake GT3") CHIPSET(0x593B, 
kbl_gt4, "Intel(R) Kabylake GT4") -CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)") -CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */ -CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)") -CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)") -CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)") -CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)") -CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)") -CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)") -CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)") -- 2.10.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] anv/radv: WSI sharing code
I've dug through the whole thing now. I'm not a fan of patch 21 (re-order UNORM and sRGB) and gave detailed comments on it. The rest are

Reviewed-by: Jason Ekstrand

My only other real comment is that I think I'd rather we put a bit more stuff in wsi_device so we're not passing so much around. In particular, the allocation functions and format functions could go there and maybe an alloc. If you wanted to clean that up as a follow-on patch, that's fine, but I would like it cleaned up if you don't mind.

--Jason

On Sun, Oct 16, 2016 at 9:24 PM, Dave Airlie wrote:
> This series builds on top of the previous sharing patches I sent.
>
> The aim here is to share the X11 and wayland WSI code between
> the two vulkan drivers so we have a consistent implementation and
> one place to fix bugs.
>
> The series modifies the anv code in place until it's suitable
> for sharing, then it moves it to shared directory, and ports
> radv to use it.
>
> The final code leaves the WSI APIs in the drivers, but they
> call directly into the shared code once they shed their driver
> specific structs, and pick a pAllocator.
>
> Dave.
Re: [Mesa-dev] [PATCH 21/22] wsi: swap srgb/unorm around.
NAKish... I specifically put them in that order to *cause* Talos to break. If we're going to support both UNORM and sRGB, then applications need to look at the formats they're getting and pick one intelligently rather than just using the first thing they find (which Talos does), especially if that app does its own gamma curves. Apps that just grab the first thing they find probably "want" sRGB encoding done for them. I've talked to the guys at Croteam and there is a Talos update in the pipe that fixes this.

On Sun, Oct 16, 2016 at 9:24 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> This prevents a Talos regression before radv
> starts using shared WSI.
> ---
>  src/vulkan/wsi/wsi_common_x11.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
> index b5832c6..241ef42 100644
> --- a/src/vulkan/wsi/wsi_common_x11.c
> +++ b/src/vulkan/wsi/wsi_common_x11.c
> @@ -135,8 +135,8 @@ wsi_x11_get_connection(struct wsi_device *wsi_dev,
>  }
>
>  static const VkSurfaceFormatKHR formats[] = {
> -   { .format = VK_FORMAT_B8G8R8A8_SRGB, },
>     { .format = VK_FORMAT_B8G8R8A8_UNORM, },
> +   { .format = VK_FORMAT_B8G8R8A8_SRGB, },
>  };
>
>  static const VkPresentModeKHR present_modes[] = {
> --
> 2.5.5
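To make the "pick one intelligently" point concrete, here is a rough sketch of what an application-side chooser might look like. It is an illustration only: enum surface_format is a stand-in for VkFormat, choose_surface_format is a hypothetical helper, and a real application would fill the list from vkGetPhysicalDeviceSurfaceFormatsKHR and would likely consider the color space as well.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for VkFormat, limited to the two formats discussed here. */
enum surface_format {
    FORMAT_B8G8R8A8_UNORM,
    FORMAT_B8G8R8A8_SRGB,
};

/* The two orderings argued over in this thread. */
static const enum surface_format srgb_first[] = {
    FORMAT_B8G8R8A8_SRGB, FORMAT_B8G8R8A8_UNORM,
};
static const enum surface_format unorm_first[] = {
    FORMAT_B8G8R8A8_UNORM, FORMAT_B8G8R8A8_SRGB,
};

/* Pick a format by what the application needs, not by list position.
 * An app that applies its own gamma curve wants UNORM; otherwise it
 * wants the sRGB encoding done for it. Returns the index of the chosen
 * entry, or -1 if the wanted format was not advertised. */
static int choose_surface_format(const enum surface_format *formats,
                                 size_t count, int app_does_own_gamma)
{
    enum surface_format wanted = app_does_own_gamma ? FORMAT_B8G8R8A8_UNORM
                                                    : FORMAT_B8G8R8A8_SRGB;

    for (size_t i = 0; i < count; i++) {
        if (formats[i] == wanted)
            return (int)i;
    }
    return -1;
}
```

With a chooser along these lines, the order of the driver's formats[] array stops mattering to the application, which is the behavior being asked of Talos.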
Re: [Mesa-dev] [PATCH 3/3] [Bug 38970] [bisected]piglit glx/glx-pixmap-multi failed
On 18.10.2016 19:23, Ian Romanick wrote: On 09/29/2016 01:55 PM, Anutex wrote: I tried to debug this issue with changing the condition to check only bad magic and Error. And the test passed. Though i am not sure what is the correct behaviour if we are in this condition. May be we should make some other condition if the Hash Table have the bucket data. --- src/glx/dri2_glx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c index af388d9..a1fd9ff 100644 --- a/src/glx/dri2_glx.c +++ b/src/glx/dri2_glx.c @@ -411,12 +411,13 @@ dri2CreateDrawable(struct glx_screen *base, XID xDrawable, return NULL; } - if (__glxHashInsert(pdp->dri2Hash, xDrawable, pdraw)) { + if (__glxHashInsert(pdp->dri2Hash, xDrawable, pdraw) == -1) { I'm not 100% sure the existing code is wrong. __glxHashInsert returns -1 for an error, and it returns 1 if the key is already in the hash table. In that case we'll leak the memory for the new pdraw, right? That also seems bad. It seems like instead the code should look up xDrawable in the hash table and return the value that's already there. Maybe. I haven't looked at this code in years, so I may be forgetting some subtlety. dri2DestroyDrawable destroys the pdraw though. It also removes the xDrawable entry in the hash table without checking whether it points at pdraw or not, so on the surface that looks pretty bogus if we create a GLXDrawable twice. _However_, the real question is what the hash is used for in the first place. It looks to me like the hash is actually pretty pointless in the pixmap case. And it just so happens that the GLX spec forbids creating a GLXDrawable from a Window twice, but it doesn't forbid creating a GLXDrawable from a Pixmap twice. Then again, my GLX knowledge is basically zero, so what do I know :) Nicolai (*psc->core->destroyDrawable) (pdraw->driDrawable); DRI2DestroyDrawable(psc->base.dpy, xDrawable); free(pdraw); return None; } + Spurious whitespace change. 
/* * Make sure server has the same swap interval we do for the new ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
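As an illustration of the three-way return convention Ian describes (-1 for an error, 1 for a key that is already present, 0 otherwise), here is a small self-contained sketch of the "look up and return what is already there" idea. The toy_hash table and get_or_insert helper are hypothetical and much simpler than the real GLX hash; whether returning the existing drawable is the right behavior for dri2CreateDrawable is exactly what this thread leaves open.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 8

/* Toy fixed-size table standing in for the GLX hash; keys must be nonzero. */
struct toy_hash {
    unsigned long keys[TABLE_SIZE];
    void *values[TABLE_SIZE];
};

/* Mirrors the convention described in the thread:
 * 0 on success, 1 if the key is already present, -1 on error (table full). */
static int toy_hash_insert(struct toy_hash *t, unsigned long key, void *value)
{
    int free_slot = -1;

    for (int i = 0; i < TABLE_SIZE; i++) {
        if (t->keys[i] == key)
            return 1;
        if (t->keys[i] == 0 && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;
    t->keys[free_slot] = key;
    t->values[free_slot] = value;
    return 0;
}

static void *toy_hash_lookup(const struct toy_hash *t, unsigned long key)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (t->keys[i] == key)
            return t->values[i];
    }
    return NULL;
}

/* On a duplicate key, hand back the existing entry and tell the caller
 * that its candidate was not stored, so the caller can free it. */
static void *get_or_insert(struct toy_hash *t, unsigned long key,
                           void *candidate, int *candidate_used)
{
    switch (toy_hash_insert(t, key, candidate)) {
    case 0:
        *candidate_used = 1;
        return candidate;
    case 1:
        *candidate_used = 0;
        return toy_hash_lookup(t, key);
    default:
        *candidate_used = 0;
        return NULL;
    }
}

/* Create a drawable twice for the same XID: both calls resolve to the
 * first object and the second candidate is reported as unused. */
static int demo_duplicate_create(void)
{
    struct toy_hash t = { {0}, {0} };
    int first, second, used;
    void *a = get_or_insert(&t, 42, &first, &used);
    void *b = get_or_insert(&t, 42, &second, &used);

    return a == &first && b == &first && used == 0;
}
```

Note that this only addresses the insert side. As Nicolai points out, the destroy path removes the hash entry without checking which drawable it points at, so a shared entry would need reference counting or a similar scheme.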
Re: [Mesa-dev] [PATCH 14/22] anv/wsi: move further away from passing anv displays around
On Sun, Oct 16, 2016 at 9:24 PM, Dave Airliewrote: > From: Dave Airlie > > --- > src/intel/vulkan/anv_wsi.c | 28 +++- > src/intel/vulkan/anv_wsi.h | 3 ++- > src/intel/vulkan/anv_wsi_wayland.c | 21 +++-- > src/intel/vulkan/anv_wsi_x11.c | 22 +++--- > 4 files changed, 35 insertions(+), 39 deletions(-) > > diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c > index 514a29f..89bf780 100644 > --- a/src/intel/vulkan/anv_wsi.c > +++ b/src/intel/vulkan/anv_wsi.c > @@ -253,17 +253,21 @@ VkResult anv_CreateSwapchainKHR( > struct anv_wsi_interface *iface = >device->instance->physicalDevice.wsi_device.wsi[surface->platform]; > struct anv_swapchain *swapchain; > + const VkAllocationCallbacks *alloc; > > - VkResult result = iface->create_swapchain(surface, device, > pCreateInfo, > - pAllocator, > _wsi_image_fns, > + if (pAllocator) > + alloc = pAllocator; > + else > + alloc = >alloc; > + VkResult result = iface->create_swapchain(surface, _device, > + >instance-> > physicalDevice.wsi_device, > + pCreateInfo, > + alloc, _wsi_image_fns, > ); > if (result != VK_SUCCESS) >return result; > > - if (pAllocator) > - swapchain->alloc = *pAllocator; > - else > - swapchain->alloc = device->alloc; > + swapchain->alloc = *alloc; > > for (unsigned i = 0; i < ARRAY_SIZE(swapchain->fences); i++) >swapchain->fences[i] = VK_NULL_HANDLE; > @@ -274,18 +278,24 @@ VkResult anv_CreateSwapchainKHR( > } > > void anv_DestroySwapchainKHR( > -VkDevice device, > +VkDevice _device, > VkSwapchainKHR _swapchain, > const VkAllocationCallbacks* pAllocator) > { > + ANV_FROM_HANDLE(anv_device, device, _device); > ANV_FROM_HANDLE(anv_swapchain, swapchain, _swapchain); > + const VkAllocationCallbacks *alloc; > > + if (pAllocator) > + alloc = pAllocator; > + else > + alloc = >alloc; > This isn't needed. The client is required to pass the same allocator in (if any) to this function as it does to Create. 
We can just use swapchain->alloc > for (unsigned i = 0; i < ARRAY_SIZE(swapchain->fences); i++) { >if (swapchain->fences[i] != VK_NULL_HANDLE) > - anv_DestroyFence(device, swapchain->fences[i], pAllocator); > + anv_DestroyFence(_device, swapchain->fences[i], pAllocator); > } > > - swapchain->destroy(swapchain, pAllocator); > + swapchain->destroy(swapchain, alloc); > } > > VkResult anv_GetSwapchainImagesKHR( > diff --git a/src/intel/vulkan/anv_wsi.h b/src/intel/vulkan/anv_wsi.h > index 2548e41..236133c 100644 > --- a/src/intel/vulkan/anv_wsi.h > +++ b/src/intel/vulkan/anv_wsi.h > @@ -60,7 +60,8 @@ struct anv_wsi_interface { > uint32_t* pPresentModeCount, > VkPresentModeKHR* pPresentModes); > VkResult (*create_swapchain)(VkIcdSurfaceBase *surface, > -struct anv_device *device, > +VkDevice device, > +struct anv_wsi_device *wsi_device, > const VkSwapchainCreateInfoKHR* > pCreateInfo, > const VkAllocationCallbacks* pAllocator, > const struct anv_wsi_image_fns *image_fns, > diff --git a/src/intel/vulkan/anv_wsi_wayland.c > b/src/intel/vulkan/anv_wsi_wayland.c > index e56b3be..16a9647 100644 > --- a/src/intel/vulkan/anv_wsi_wayland.c > +++ b/src/intel/vulkan/anv_wsi_wayland.c > @@ -422,14 +422,6 @@ wsi_wl_surface_get_present_modes(VkIcdSurfaceBase > *surface, > return VK_SUCCESS; > } > > -static VkResult > -wsi_wl_surface_create_swapchain(VkIcdSurfaceBase *surface, > -struct anv_device *device, > -const VkSwapchainCreateInfoKHR* > pCreateInfo, > -const VkAllocationCallbacks* pAllocator, > -const struct anv_wsi_image_fns *image_fns, > -struct anv_swapchain **swapchain); > - > VkResult anv_CreateWaylandSurfaceKHR( > VkInstance _instance, > const VkWaylandSurfaceCreateInfoKHR*pCreateInfo, > @@ -650,7 +642,7 @@ wsi_wl_swapchain_destroy(struct anv_swapchain > *anv_chain, > const VkAllocationCallbacks *pAllocator) > { > struct wsi_wl_swapchain *chain = (struct wsi_wl_swapchain *)anv_chain; > - struct anv_device *device =
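A small sketch of the allocator point made in the review comment above: resolve the allocator once at creation, store a copy in the object, and have the destroy path use the stored copy rather than re-deriving it from pAllocator. Everything here (struct alloc_callbacks, struct swapchain, the id field) is a hypothetical stand-in for illustration and not the anv code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for VkAllocationCallbacks; id exists only so the example can
 * tell which allocator was used. */
struct alloc_callbacks {
    void *(*pfn_alloc)(size_t size);
    void (*pfn_free)(void *ptr);
    int id;
};

struct swapchain {
    struct alloc_callbacks alloc; /* copy of the allocator chosen at creation */
};

static void *default_alloc(size_t size) { return malloc(size); }
static void default_free(void *ptr) { free(ptr); }

/* Device-level fallback and an example client-supplied allocator. */
static const struct alloc_callbacks device_alloc = { default_alloc, default_free, 0 };
static const struct alloc_callbacks user_alloc = { default_alloc, default_free, 7 };

/* Resolve the allocator once and remember it in the object. */
static struct swapchain *swapchain_create(const struct alloc_callbacks *client)
{
    const struct alloc_callbacks *alloc = client ? client : &device_alloc;
    struct swapchain *chain = alloc->pfn_alloc(sizeof(*chain));

    if (!chain)
        return NULL;
    chain->alloc = *alloc;
    return chain;
}

/* Destroy with the stored allocator; no pAllocator parameter is needed.
 * Returns the id of the allocator that freed the object. */
static int swapchain_destroy(struct swapchain *chain)
{
    struct alloc_callbacks alloc = chain->alloc; /* copy before freeing */

    alloc.pfn_free(chain);
    return alloc.id;
}

/* Create and destroy a swapchain, reporting which allocator freed it. */
static int demo_roundtrip(const struct alloc_callbacks *client)
{
    struct swapchain *chain = swapchain_create(client);

    assert(chain != NULL);
    return swapchain_destroy(chain);
}
```

Copying the callbacks out of the object before freeing it avoids reading from memory that has just been released.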
Re: [Mesa-dev] [PATCH 06/22] anv/wsi/x11: push anv_device out of the init/finish routines
Feel free to shove an alloc in wsi_device. Might make some of this a bit simpler. I guess we usually shove one in wsi_implementation so it's not a big deal. On Sun, Oct 16, 2016 at 9:24 PM, Dave Airliewrote: > From: Dave Airlie > > --- > src/intel/vulkan/anv_wsi.c | 6 +++--- > src/intel/vulkan/anv_wsi.h | 6 -- > src/intel/vulkan/anv_wsi_x11.c | 22 -- > 3 files changed, 19 insertions(+), 15 deletions(-) > > diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c > index 56ed3ec..767fa79 100644 > --- a/src/intel/vulkan/anv_wsi.c > +++ b/src/intel/vulkan/anv_wsi.c > @@ -31,7 +31,7 @@ anv_init_wsi(struct anv_physical_device *physical_device) > memset(physical_device->wsi_device.wsi, 0, > sizeof(physical_device->wsi_device.wsi)); > > #ifdef VK_USE_PLATFORM_XCB_KHR > - result = anv_x11_init_wsi(physical_device); > + result = anv_x11_init_wsi(_device->wsi_device, > _device->instance->alloc); > if (result != VK_SUCCESS) >return result; > #endif > @@ -40,7 +40,7 @@ anv_init_wsi(struct anv_physical_device *physical_device) > result = anv_wl_init_wsi(physical_device); > if (result != VK_SUCCESS) { > #ifdef VK_USE_PLATFORM_XCB_KHR > - anv_x11_finish_wsi(physical_device); > + anv_x11_finish_wsi(_device->wsi_device, > _device->instance->alloc); > #endif >return result; > } > @@ -56,7 +56,7 @@ anv_finish_wsi(struct anv_physical_device > *physical_device) > anv_wl_finish_wsi(physical_device); > #endif > #ifdef VK_USE_PLATFORM_XCB_KHR > - anv_x11_finish_wsi(physical_device); > + anv_x11_finish_wsi(_device->wsi_device, > _device->instance->alloc); > #endif > } > > diff --git a/src/intel/vulkan/anv_wsi.h b/src/intel/vulkan/anv_wsi.h > index 2bb8ee3..e1c8d02 100644 > --- a/src/intel/vulkan/anv_wsi.h > +++ b/src/intel/vulkan/anv_wsi.h > @@ -70,8 +70,10 @@ struct anv_swapchain { > ANV_DEFINE_NONDISP_HANDLE_CASTS(_VkIcdSurfaceBase, VkSurfaceKHR) > ANV_DEFINE_NONDISP_HANDLE_CASTS(anv_swapchain, VkSwapchainKHR) > > -VkResult anv_x11_init_wsi(struct anv_physical_device 
*physical_device); > -void anv_x11_finish_wsi(struct anv_physical_device *physical_device); > +VkResult anv_x11_init_wsi(struct anv_wsi_device *wsi_device, > + const VkAllocationCallbacks *alloc); > +void anv_x11_finish_wsi(struct anv_wsi_device *wsi_device, > + const VkAllocationCallbacks *alloc); > VkResult anv_wl_init_wsi(struct anv_physical_device *physical_device); > void anv_wl_finish_wsi(struct anv_physical_device *physical_device); > > diff --git a/src/intel/vulkan/anv_wsi_x11.c b/src/intel/vulkan/anv_wsi_x11.c > index 595c922..ccaabea 100644 > --- a/src/intel/vulkan/anv_wsi_x11.c > +++ b/src/intel/vulkan/anv_wsi_x11.c > @@ -897,12 +897,13 @@ fail_register: > } > > VkResult > -anv_x11_init_wsi(struct anv_physical_device *device) > +anv_x11_init_wsi(struct anv_wsi_device *wsi_device, > + const VkAllocationCallbacks *alloc) > { > struct wsi_x11 *wsi; > VkResult result; > > - wsi = vk_alloc(&device->instance->alloc, sizeof(*wsi), 8, > + wsi = vk_alloc(alloc, sizeof(*wsi), 8, > VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE); > if (!wsi) { > result = vk_error(VK_ERROR_OUT_OF_HOST_MEMORY); > @@ -934,33 +935,34 @@ anv_x11_init_wsi(struct anv_physical_device *device) > wsi->base.get_present_modes = x11_surface_get_present_modes; > wsi->base.create_swapchain = x11_surface_create_swapchain; > > - device->wsi_device.wsi[VK_ICD_WSI_PLATFORM_XCB] = &wsi->base; > - device->wsi_device.wsi[VK_ICD_WSI_PLATFORM_XLIB] = &wsi->base; > + wsi_device->wsi[VK_ICD_WSI_PLATFORM_XCB] = &wsi->base; > + wsi_device->wsi[VK_ICD_WSI_PLATFORM_XLIB] = &wsi->base; > > return VK_SUCCESS; > > fail_mutex: > pthread_mutex_destroy(&wsi->mutex); > fail_alloc: > - vk_free(&device->instance->alloc, wsi); > + vk_free(alloc, wsi); > fail: > - device->wsi_device.wsi[VK_ICD_WSI_PLATFORM_XCB] = NULL; > - device->wsi_device.wsi[VK_ICD_WSI_PLATFORM_XLIB] = NULL; > + wsi_device->wsi[VK_ICD_WSI_PLATFORM_XCB] = NULL; > + wsi_device->wsi[VK_ICD_WSI_PLATFORM_XLIB] = NULL; > > return result; > } > > void > -anv_x11_finish_wsi(struct anv_physical_device *device)
> +anv_x11_finish_wsi(struct anv_wsi_device *wsi_device, > + const VkAllocationCallbacks *alloc) > { > struct wsi_x11 *wsi = > - (struct wsi_x11 *)device->wsi_device.wsi[VK_ICD_WSI_PLATFORM_XCB]; > + (struct wsi_x11 *)wsi_device->wsi[VK_ICD_WSI_PLATFORM_XCB]; > > if (wsi) { > _mesa_hash_table_destroy(wsi->connections, NULL); > > pthread_mutex_destroy(&wsi->mutex); > > - vk_free(&device->instance->alloc, wsi); > + vk_free(alloc, wsi); > } > } > -- > 2.5.5 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev >
[Mesa-dev] [Bug 98308] llvmpipe crashes with glxgears
https://bugs.freedesktop.org/show_bug.cgi?id=98308

--- Comment #2 from Roland Scheidegger ---
I'd be interested to know why it fails, though. I don't think LTO should cause such failures. It seems like it might be related to the threads created by llvmpipe, but I don't really see how.
[Mesa-dev] [Bug 98308] llvmpipe crashes with glxgears
https://bugs.freedesktop.org/show_bug.cgi?id=98308

Marc Dietrich changed:

What       |Removed |Added
Status     |NEW     |RESOLVED
Resolution |---     |NOTABUG

--- Comment #1 from Marc Dietrich ---
This crash was caused by using (unsupported) compilation with LTO. Sorry for the noise.
Re: [Mesa-dev] [PATCH 11/25] mesa/i965/i915/r200: eliminate gl_vertex_program
I'd like to see two tiny changes: 1. A comment for the IsPositionInvariant field that it can only be true for vertex programs. 2. An assertion or two like assert(p->Target == GL_VERTEX_PROGRAM_ARB || !p->IsPositionInvariant); in reasonable places. I'm thinking: - Where it's assigned in src/mesa/program/arbprogparse.c - Where it's used in src/mesa/state_tracker/st_program.c, src/mesa/drivers/dri/i965/brw_program.c, and src/mesa/tnl/t_vb_program.c (both places). I'd also support a follow-up patch that converts IsPositionInvariant from GLboolean to bool. :) On 10/17/2016 11:12 PM, Timothy Arceri wrote: > Here we move the only field in gl_vertex_program to the > ARB program fields in gl_program. > --- > src/mesa/drivers/common/meta.c | 10 +-- > src/mesa/drivers/common/meta.h | 2 +- > src/mesa/drivers/dri/i915/i915_fragprog.c| 4 +- > src/mesa/drivers/dri/i965/brw_context.h | 8 +-- > src/mesa/drivers/dri/i965/brw_curbe.c| 2 +- > src/mesa/drivers/dri/i965/brw_draw.c | 4 +- > src/mesa/drivers/dri/i965/brw_program.c | 5 +- > src/mesa/drivers/dri/i965/brw_vs.c | 41 ++-- > src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 2 +- > src/mesa/drivers/dri/i965/gen6_vs_state.c| 4 +- > src/mesa/drivers/dri/r200/r200_context.h | 2 +- > src/mesa/drivers/dri/r200/r200_state_init.c | 4 +- > src/mesa/drivers/dri/r200/r200_tcl.c | 2 +- > src/mesa/drivers/dri/r200/r200_vertprog.c| 82 > > src/mesa/main/arbprogram.c | 19 +++--- > src/mesa/main/context.c | 8 +-- > src/mesa/main/ff_fragment_shader.cpp | 2 +- > src/mesa/main/ffvertex_prog.c| 72 ++--- > src/mesa/main/ffvertex_prog.h| 2 +- > src/mesa/main/mtypes.h | 17 ++--- > src/mesa/main/shared.c | 5 +- > src/mesa/main/state.c| 26 > src/mesa/main/state.h| 2 +- > src/mesa/program/arbprogparse.c | 46 ++--- > src/mesa/program/arbprogparse.h | 2 +- > src/mesa/program/prog_statevars.c| 8 +-- > src/mesa/program/program.c | 15 ++--- > src/mesa/program/program.h | 26 > src/mesa/program/programopt.c| 42 ++-- > src/mesa/program/programopt.h| 2 +- > 
src/mesa/state_tracker/st_atom.c | 4 +- > src/mesa/state_tracker/st_atom_constbuf.c | 2 +- > src/mesa/state_tracker/st_atom_rasterizer.c | 8 +-- > src/mesa/state_tracker/st_atom_sampler.c | 2 +- > src/mesa/state_tracker/st_atom_shader.c | 4 +- > src/mesa/state_tracker/st_atom_texture.c | 2 +- > src/mesa/state_tracker/st_cb_feedback.c | 2 +- > src/mesa/state_tracker/st_cb_program.c | 2 +- > src/mesa/state_tracker/st_debug.c | 4 +- > src/mesa/state_tracker/st_program.c | 35 +- > src/mesa/state_tracker/st_program.h | 4 +- > src/mesa/tnl/t_context.c | 4 +- > src/mesa/tnl/t_vb_program.c | 24 +++ > src/mesa/tnl/t_vp_build.c | 4 +- > src/mesa/vbo/vbo_exec_draw.c | 4 +- > src/mesa/vbo/vbo_save_draw.c | 4 +- > 46 files changed, 264 insertions(+), 311 deletions(-) > > diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c > index 890e98a..ab81eed 100644 > --- a/src/mesa/drivers/common/meta.c > +++ b/src/mesa/drivers/common/meta.c > @@ -566,8 +566,8 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state) > > if (ctx->Extensions.ARB_vertex_program) { > save->VertexProgramEnabled = ctx->VertexProgram.Enabled; > - _mesa_reference_vertprog(ctx, &save->VertexProgram, > - ctx->VertexProgram.Current); > + _mesa_reference_program(ctx, &save->VertexProgram, > + ctx->VertexProgram.Current); > _mesa_set_enable(ctx, GL_VERTEX_PROGRAM_ARB, GL_FALSE); > } > > @@ -945,9 +945,9 @@ _mesa_meta_end(struct gl_context *ctx) > if (ctx->Extensions.ARB_vertex_program) { > _mesa_set_enable(ctx, GL_VERTEX_PROGRAM_ARB, > save->VertexProgramEnabled); > - _mesa_reference_vertprog(ctx, &ctx->VertexProgram.Current, > - save->VertexProgram); > - _mesa_reference_vertprog(ctx, &save->VertexProgram, NULL); > + _mesa_reference_program(ctx, &ctx->VertexProgram.Current, > + save->VertexProgram); > + _mesa_reference_program(ctx, &save->VertexProgram,
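For readers following the review comments in this message, here is a minimal standalone sketch of the requested invariant check. It is not Mesa code: `prog`, `prog_target`, `position_invariant_ok` and `uses_position_invariant` are hypothetical stand-ins for `gl_program`, the GL target enums and the call sites named in the review, and only the shape of the assertion is taken from the mail.

```cpp
#include <cassert>

// Hypothetical stand-ins for GL_VERTEX_PROGRAM_ARB / GL_FRAGMENT_PROGRAM_ARB;
// the real values come from the GL headers.
enum prog_target { TARGET_VERTEX, TARGET_FRAGMENT };

// Hypothetical, reduced version of gl_program: only the two fields the
// suggested assertion looks at.
struct prog {
   prog_target Target;
   bool IsPositionInvariant; // may only be true for vertex programs
};

// True when the invariant from the review holds: the flag is either unset,
// or it is set on a vertex program.
bool position_invariant_ok(const prog &p)
{
   return p.Target == TARGET_VERTEX || !p.IsPositionInvariant;
}

// Example use site: assert the invariant before reading the flag.
bool uses_position_invariant(const prog &p)
{
   assert(position_invariant_ok(p));
   return p.IsPositionInvariant;
}
```

Writing the condition as `target is vertex OR flag is unset` keeps the assertion valid for every program type, so it can sit in code paths shared by vertex and non-vertex programs.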
Re: [Mesa-dev] [PATCH] st/glsl_to_tgsi: sort input and output decls by TGSI index
Reviewed-by: Marek Olšák

Marek

On Tue, Oct 18, 2016 at 6:06 PM, Nicolai Hähnle wrote: > From: Nicolai Hähnle > > Fixes a regression introduced by commit 777dcf81b. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98307 > -- > Using std::sort here is quite a bit C++-ier than most parts of Mesa. > I used it because the standard C library is being its usual lame self. > If people think using qsort_r is fine from a portability point of view > (it's a glibc-ism), I'd be happy to use that instead. > --- > src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 28 > 1 file changed, 28 insertions(+) > > diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > index f49a873..406f4d5 100644 > --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > @@ -48,20 +48,21 @@ > #include "tgsi/tgsi_ureg.h" > #include "tgsi/tgsi_info.h" > #include "util/u_math.h" > #include "util/u_memory.h" > #include "st_program.h" > #include "st_mesa_to_tgsi.h" > #include "st_format.h" > #include "st_glsl_types.h" > #include "st_nir.h" > > +#include <algorithm> > > #define PROGRAM_ANY_CONST ((1 << PROGRAM_STATE_VAR) |\ > (1 << PROGRAM_CONSTANT) | \ > (1 << PROGRAM_UNIFORM)) > > #define MAX_GLSL_TEXTURE_OFFSET 4 > > class st_src_reg; > class st_dst_reg; > > @@ -6092,20 +6093,43 @@ emit_compute_block_size(const struct gl_program > *program, > (const struct gl_compute_program *)program; > > ureg_property(ureg, TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH, > cp->LocalSize[0]); > ureg_property(ureg, TGSI_PROPERTY_CS_FIXED_BLOCK_HEIGHT, > cp->LocalSize[1]); > ureg_property(ureg, TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH, > cp->LocalSize[2]); > } > > +struct sort_inout_decls { > + bool operator()(const struct inout_decl &a, const struct inout_decl &b) > const { > + return mapping[a.mesa_index] < mapping[b.mesa_index]; > + } > + > + const GLuint *mapping; > +}; > + > +/* Sort the given array of decls by the corresponding slot (TGSI file index).
> + * > + * This is for the benefit of older drivers which are broken when the > + * declarations aren't sorted in this way. > + */ > +static void > +sort_inout_decls_by_slot(struct inout_decl *decls, > + unsigned count, > + const GLuint mapping[]) > +{ > + sort_inout_decls sorter; > + sorter.mapping = mapping; > + std::sort(decls, decls + count, sorter); > +} > + > /** > * Translate intermediate IR (glsl_to_tgsi_instruction) to TGSI format. > * \param program the program to translate > * \param numInputs number of input registers used > * \param inputMapping maps Mesa fragment program inputs to TGSI generic > * input indexes > * \param inputSemanticName the TGSI_SEMANTIC flag for each input > * \param inputSemanticIndex the semantic index (ex: which texcoord) for > * each input > * \param interpMode the TGSI_INTERPOLATE_LINEAR/PERSP mode for each input > @@ -6164,20 +6188,22 @@ st_translate_program( > calloc(t->num_temp_arrays, sizeof(t->arrays[0])); > > /* > * Declare input attributes. > */ > switch (procType) { > case PIPE_SHADER_FRAGMENT: > case PIPE_SHADER_GEOMETRY: > case PIPE_SHADER_TESS_EVAL: > case PIPE_SHADER_TESS_CTRL: > + sort_inout_decls_by_slot(program->inputs, program->num_inputs, > inputMapping); > + > for (i = 0; i < program->num_inputs; ++i) { > struct inout_decl *decl = &program->inputs[i]; > unsigned slot = inputMapping[decl->mesa_index]; > struct ureg_src src; > ubyte tgsi_usage_mask = decl->usage_mask; > > if (glsl_base_type_is_64bit(decl->base_type)) { > if (tgsi_usage_mask == 1) > tgsi_usage_mask = TGSI_WRITEMASK_XY; > else if (tgsi_usage_mask == 2) > @@ -6216,20 +6242,22 @@ st_translate_program( > * Declare output attributes.
> */ > switch (procType) { > case PIPE_SHADER_FRAGMENT: > case PIPE_SHADER_COMPUTE: > break; > case PIPE_SHADER_GEOMETRY: > case PIPE_SHADER_TESS_EVAL: > case PIPE_SHADER_TESS_CTRL: > case PIPE_SHADER_VERTEX: > + sort_inout_decls_by_slot(program->outputs, program->num_outputs, > outputMapping); > + > for (i = 0; i < program->num_outputs; ++i) { > struct inout_decl *decl = &program->outputs[i]; > unsigned slot = outputMapping[decl->mesa_index]; > struct ureg_dst dst; > ubyte tgsi_usage_mask = decl->usage_mask; > > if (glsl_base_type_is_64bit(decl->base_type)) { > if (tgsi_usage_mask == 1) > tgsi_usage_mask = TGSI_WRITEMASK_XY;
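The sorting approach in the quoted patch can be tried outside Mesa with a small standalone sketch. The comparator functor below follows the same pattern (compare two declarations by the slot their index maps to), but `decl`, `sort_by_slot` and `sorted_slots` are hypothetical names made up for this illustration, and plain `unsigned` stands in for `GLuint`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for Mesa's inout_decl; only the field used for sorting.
struct decl {
   unsigned mesa_index;
};

// Comparator functor: orders declarations by the slot their mesa_index maps to.
struct sort_by_slot {
   const unsigned *mapping;

   bool operator()(const decl &a, const decl &b) const
   {
      return mapping[a.mesa_index] < mapping[b.mesa_index];
   }
};

// Sorts a copy of the declarations and returns the resulting slot order.
std::vector<unsigned> sorted_slots(std::vector<decl> decls,
                                   const unsigned *mapping)
{
   sort_by_slot sorter = { mapping };
   std::sort(decls.begin(), decls.end(), sorter);

   std::vector<unsigned> slots;
   for (const decl &d : decls)
      slots.push_back(mapping[d.mesa_index]);
   return slots;
}
```

A functor carries the mapping table as state, which is what plain `qsort` cannot do without a global variable and what `qsort_r` only offers non-portably, as the commit message notes.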
Re: [Mesa-dev] [PATCH 20/22] anv: move to using shared wsi code
Hi Dave,

Thanks for doing this. It'll be great to get an Ack from the Intel devs on the idea.

Afaics, with 22/22 in place you can drop the vk_alloc2/vk_free2 functions, since they are no longer used. Just an extra (small) suggestion below:

On 17 October 2016 at 05:24, Dave Airlie wrote:
> delete mode 100644 src/intel/vulkan/wsi_common.h
> delete mode 100644 src/intel/vulkan/wsi_common_wayland.c
> delete mode 100644 src/intel/vulkan/wsi_common_wayland.h
> delete mode 100644 src/intel/vulkan/wsi_common_x11.c
> delete mode 100644 src/intel/vulkan/wsi_common_x11.h
> create mode 100644 src/vulkan/wsi/Makefile.am
> create mode 100644 src/vulkan/wsi/Makefile.sources
> create mode 100644 src/vulkan/wsi/wsi_common.h
> create mode 100644 src/vulkan/wsi/wsi_common_wayland.c
> create mode 100644 src/vulkan/wsi/wsi_common_wayland.h
> create mode 100644 src/vulkan/wsi/wsi_common_x11.c
> create mode 100644 src/vulkan/wsi/wsi_common_x11.h
>
Can you use git format-patch -M (or $ git config --global diff.renames true) so that the diff is friendlier?

> diff --git a/configure.ac b/configure.ac
> index 37cc306..688459b 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2854,7 +2854,8 @@ AC_CONFIG_FILES([Makefile
>  src/mesa/main/tests/Makefile
>  src/util/Makefile
>  src/util/tests/hash_table/Makefile
> - src/vulkan/Makefile])
> + src/vulkan/Makefile
> + src/vulkan/wsi/Makefile])
>
Just fold the new Makefile into the existing one? It should be as simple as adding the wsi/ prefix to the files. Alternatively, we can do that as a follow-up.

Emil
Re: [Mesa-dev] [PATCH 07/11] vulkan: add vk_alloc.h shared allocation inlines.
We already talked on IRC about putting vk_alloc.h in src/util. Assuming that's done, the series is

Acked-by: Jason Ekstrand

Please make sure you do a fairly complete (fedora config?) build test. I don't want those MIN/MAX macros to cause problems.

--Jason

On Sun, Oct 16, 2016 at 7:07 PM, Dave Airlie wrote: > From: Dave Airlie > > vulkan allocation allows for overriding the allocator used, > add some macros for anv/radv to share for this. > > Signed-off-by: Dave Airlie > --- > configure.ac | 5 ++- > src/Makefile.am | 4 +++ > src/vulkan/Makefile.am | 26 +++ > src/vulkan/Makefile.sources | 2 ++ > src/vulkan/common/vk_alloc.h | 75 ++ > ++ > 5 files changed, 111 insertions(+), 1 deletion(-) > create mode 100644 src/vulkan/Makefile.am > create mode 100644 src/vulkan/Makefile.sources > create mode 100644 src/vulkan/common/vk_alloc.h > > diff --git a/configure.ac b/configure.ac > index b414edd..37cc306 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -2693,6 +2693,8 @@ VA_MINOR=`$PKG_CONFIG --modversion libva | $SED -n > 's/.*\.\(.*\)\..*$/\1/p'` > AC_SUBST([VA_MAJOR], $VA_MAJOR) > AC_SUBST([VA_MINOR], $VA_MINOR) > > +AM_CONDITIONAL(HAVE_VULKAN_COMMON, test "x$VULKAN_DRIVERS" != "x") > + > AC_SUBST([XVMC_MAJOR], 1) > AC_SUBST([XVMC_MINOR], 0) > > @@ -2851,7 +2853,8 @@ AC_CONFIG_FILES([Makefile > src/mesa/drivers/x11/Makefile > src/mesa/main/tests/Makefile > src/util/Makefile > - src/util/tests/hash_table/Makefile]) > + src/util/tests/hash_table/Makefile > + src/vulkan/Makefile]) > > AC_OUTPUT > > diff --git a/src/Makefile.am b/src/Makefile.am > index 17c8798..10e0826 100644 > --- a/src/Makefile.am > +++ b/src/Makefile.am > @@ -74,6 +74,10 @@ endif > # include only conditionally ?
> SUBDIRS += compiler > > +if HAVE_VULKAN_COMMON > +SUBDIRS += vulkan > +endif > + > if HAVE_AMD_DRIVERS > SUBDIRS += amd > endif > diff --git a/src/vulkan/Makefile.am b/src/vulkan/Makefile.am > new file mode 100644 > index 000..abe8404 > --- /dev/null > +++ b/src/vulkan/Makefile.am > @@ -0,0 +1,26 @@ > +# Copyright © 2016 Red Hat. > +# > +# Permission is hereby granted, free of charge, to any person obtaining a > +# copy of this software and associated documentation files (the > "Software"), > +# to deal in the Software without restriction, including without > limitation > +# the rights to use, copy, modify, merge, publish, distribute, sublicense, > +# and/or sell copies of the Software, and to permit persons to whom the > +# Software is furnished to do so, subject to the following conditions: > +# > +# The above copyright notice and this permission notice (including the > next > +# paragraph) shall be included in all copies or substantial portions of > the > +# Software. > +# > +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS > OR > +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR > OTHER > +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > +# IN THE SOFTWARE. 
> + > +include Makefile.sources > + > +noinst_LTLIBRARIES = > + > +EXTRA_DIST = $(COMMON_HEADER_FILES) > diff --git a/src/vulkan/Makefile.sources b/src/vulkan/Makefile.sources > new file mode 100644 > index 000..a73bf99 > --- /dev/null > +++ b/src/vulkan/Makefile.sources > @@ -0,0 +1,2 @@ > +COMMON_HEADER_FILES = \ > + common/vk_alloc.h > diff --git a/src/vulkan/common/vk_alloc.h b/src/vulkan/common/vk_alloc.h > new file mode 100644 > index 000..a8e21ca > --- /dev/null > +++ b/src/vulkan/common/vk_alloc.h > @@ -0,0 +1,75 @@ > +/* > + * Copyright © 2015 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the > "Software"), > + * to deal in the Software without restriction, including without > limitation > + * the rights to use, copy, modify, merge, publish, distribute, > sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the > next > + * paragraph) shall be included in all copies or substantial portions of > the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code
On Tue, Oct 18, 2016 at 8:04 PM, Marek Olšák wrote:
> On Tue, Oct 18, 2016 at 7:12 PM, Jan Ziak <0xe2.0x9a.0...@gmail.com> wrote:
> >> Regarding C++ templates, the compiler doesn't use them. If u_vector
> >> (Dave Airlie?) provides the same functionality as your array, I
> >> suggest we use u_vector instead.
> >
> > Let me repeat what you just wrote, because it is unbelievable: You are
> > advising the use of non-templated collection types in C++ code.
>
> Absolutely.
>
I don't believe what my own eyes are seeing.

> > If it isn't merged by Thursday (2016-oct-20) I will mark it as
> > rejected (rejected based on personal rather than scientific grounds).
>
> Relax. Things tend to move slowly when people are on conferences,
> vacations, or just busy with corporate stuff they have to deal with
> every day etc. and you can't predict those.
>
> Marek
>
Ok. Let's relax.

Jan
Re: [Mesa-dev] [PATCH 04/11] util: move min/max/clamp macros to util macros.h
THANK YOU! I've been wanting to see this happen for a long time.

On Sun, Oct 16, 2016 at 7:07 PM, Dave Airlie wrote: > From: Dave Airlie > > Although the vulkan drivers include mesa macros.h, for > radv I'd like to move away from that. > > Signed-off-by: Dave Airlie > --- > src/mesa/main/macros.h | 13 - > src/util/macros.h | 13 + > 2 files changed, 13 insertions(+), 13 deletions(-) > > diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h > index ed207d4..03a228b 100644 > --- a/src/mesa/main/macros.h > +++ b/src/mesa/main/macros.h > @@ -660,19 +660,6 @@ INTERP_4F(GLfloat t, GLfloat dst[4], const GLfloat > out[4], const GLfloat in[4]) > > > > -/** Clamp X to [MIN,MAX] */ > -#define CLAMP( X, MIN, MAX ) ( (X)<(MIN) ? (MIN) : ((X)>(MAX) ? (MAX) : > (X)) ) > - > -/** Minimum of two values: */ > -#define MIN2( A, B ) ( (A)<(B) ? (A) : (B) ) > - > -/** Maximum of two values: */ > -#define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) > - > -/** Minimum and maximum of three values: */ > -#define MIN3( A, B, C ) ((A) < (B) ? MIN2(A, C) : MIN2(B, C)) > -#define MAX3( A, B, C ) ((A) > (B) ? MAX2(A, C) : MAX2(B, C)) > - > static inline unsigned > minify(unsigned value, unsigned levels) > { > diff --git a/src/util/macros.h b/src/util/macros.h > index 9dea2a0..27d1b62 100644 > --- a/src/util/macros.h > +++ b/src/util/macros.h > @@ -229,4 +229,17 @@ do { \ > /** Compute ceiling of integer quotient of A divided by B. */ > #define DIV_ROUND_UP( A, B ) ( (A) % (B) == 0 ? (A)/(B) : (A)/(B)+1 ) > > +/** Clamp X to [MIN,MAX] */ > +#define CLAMP( X, MIN, MAX ) ( (X)<(MIN) ? (MIN) : ((X)>(MAX) ? (MAX) : > (X)) ) > + > +/** Minimum of two values: */ > +#define MIN2( A, B ) ( (A)<(B) ? (A) : (B) ) > + > +/** Maximum of two values: */ > +#define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) > + > +/** Minimum and maximum of three values: */ > +#define MIN3( A, B, C ) ((A) < (B) ? MIN2(A, C) : MIN2(B, C)) > +#define MAX3( A, B, C ) ((A) > (B) ? MAX2(A, C) : MAX2(B, C)) > + > #endif /* UTIL_MACROS_H */ > -- > 2.5.5
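As a quick illustration of what the moved macros compute, the sketch below copies the five definitions from the quoted diff into a standalone file. The helper functions `bump` and `min2_evaluations` are hypothetical additions, included only to show a general property of function-like macros of this form: an argument expression may be evaluated more than once.

```cpp
#include <cassert>

/* Definitions as they appear in the quoted patch. */
#define CLAMP( X, MIN, MAX )  ( (X)<(MIN) ? (MIN) : ((X)>(MAX) ? (MAX) : (X)) )
#define MIN2( A, B )   ( (A)<(B) ? (A) : (B) )
#define MAX2( A, B )   ( (A)>(B) ? (A) : (B) )
#define MIN3( A, B, C ) ((A) < (B) ? MIN2(A, C) : MIN2(B, C))
#define MAX3( A, B, C ) ((A) > (B) ? MAX2(A, C) : MAX2(B, C))

// Hypothetical helper with a side effect: increments and returns the counter.
int bump(int *counter)
{
   return ++*counter;
}

// Counts how many times MIN2 evaluates its first argument when that
// argument is the smaller one (once in the comparison, once as the result).
int min2_evaluations()
{
   int calls = 0;
   int result = MIN2(bump(&calls), 10);
   (void)result;
   return calls;
}
```

Because of the repeated evaluation, arguments with side effects (such as `i++` or a function call that mutates state) are best computed into a local variable before being passed to these macros.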
Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code
Perf stat results for shader-db:

This is measured on an AMD Kaveri CPU.

gcc-6.2.0 -fno-omit-frame-pointer -g -O2

Unpatched:

$ cd shader-db
$ ../run-upstream perfstat-u --repeat=5 -- ./run -1 shaders >/dev/null

 Performance counter stats for './run -1 shaders' (5 runs):

     13689.962374      task-clock (msec)      #    1.000 CPUs utilized            ( +-  0.29% )
              138      context-switches       #    0.010 K/sec                    ( +- 17.82% )
                6      cpu-migrations         #    0.000 K/sec                    ( +- 13.36% )
           78,559      page-faults            #    0.006 M/sec                    ( +-  0.24% )
   53,578,642,861      cycles:u               #    3.914 GHz                      ( +-  0.29% )
   44,813,859,985      instructions:u         #    0.84  insn per cycle           ( +-  0.01% )
    1,069,586,875      cache-references:u     #   78.129 M/sec                    ( +-  0.65% )
       51,295,256      cache-misses:u         #    4.796 % of all cache refs      ( +-  0.56% )
    9,508,996,305      branches:u             #  694.596 M/sec                    ( +-  0.01% )
      453,237,236      branch-misses:u        #    4.77% of all branches          ( +-  0.84% )

     13.692494394 seconds time elapsed                                            ( +-  0.29% )

Patched:

$ cd shader-db
$ ../run-upstream-patched perfstat-u --repeat=5 -- ./run -1 shaders >/dev/null

 Performance counter stats for './run -1 shaders' (5 runs):

     13602.106171      task-clock (msec)      #    1.000 CPUs utilized            ( +-  0.14% )
               86      context-switches       #    0.006 K/sec                    ( +- 13.95% )
                6      cpu-migrations         #    0.000 K/sec                    ( +- 26.35% )
           78,271      page-faults            #    0.006 M/sec                    ( +-  0.82% )
   53,299,046,681      cycles:u               #    3.918 GHz                      ( +-  0.13% )
   44,577,707,063      instructions:u         #    0.84  insn per cycle           ( +-  0.01% )
    1,078,158,307      cache-references:u     #   79.264 M/sec                    ( +-  0.70% )
       51,521,287      cache-misses:u         #    4.779 % of all cache refs      ( +-  1.03% )
    9,459,962,609      branches:u             #  695.478 M/sec                    ( +-  0.01% )
      456,593,871      branch-misses:u        #    4.83% of all branches          ( +-  0.27% )

     13.603795247 seconds time elapsed                                            ( +-  0.14% )
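To put the two runs side by side, the relative change can be worked out from the counters reported above. The helper below is only a sketch of that arithmetic; the function name `reduction_percent` is made up for this note, and the inputs in the usage remark are the task-clock and instructions:u values from the two tables.

```cpp
#include <cassert>

// Relative reduction from `before` to `after`, in percent.
// Positive means the "after" value is smaller.
double reduction_percent(double before, double after)
{
   return (before - after) / before * 100.0;
}
```

Called as `reduction_percent(13689.962374, 13602.106171)` this gives roughly 0.6% for task-clock, and `reduction_percent(44813859985.0, 44577707063.0)` gives roughly 0.5% for user-space instructions. The task-clock difference is about twice the ±0.29% run-to-run variation perf reports for the unpatched run, so it should be read with that noise level in mind.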
Re: [Mesa-dev] [PATCH 18/22] anv: move common wsi code to x11/wayland common files.
On 17 October 2016 at 05:24, Dave Airlie wrote:
> diff --git a/src/intel/vulkan/Makefile.sources b/src/intel/vulkan/Makefile.sources
> index 85df8a5..bd3afc0 100644
> --- a/src/intel/vulkan/Makefile.sources
> +++ b/src/intel/vulkan/Makefile.sources
> @@ -43,14 +43,17 @@ VULKAN_FILES := \
>  anv_util.c \
>  anv_wsi.c \
>  anv_wsi.h \
> + wsi_common.h \
>  genX_pipeline_util.h \
>  vk_format_info.h
>
>  VULKAN_WSI_WAYLAND_FILES := \
> - anv_wsi_wayland.c
> + anv_wsi_wayland.c \
> + wsi_common_wayland.c
>
>  VULKAN_WSI_X11_FILES := \
> - anv_wsi_x11.c
> + anv_wsi_x11.c \
> + wsi_common_x11.c

Please include the relevant headers in the lists above. Also, do copy the license from the current source. Obviously you can add yourself/Red Hat if interested.

-Emil
Re: [Mesa-dev] [PATCH] svga: minor code improvements in svga_validate_pipe_sampler_view()
Reviewed-by: Charmaine Lee

From: Brian Paul
Sent: Tuesday, October 18, 2016 9:36 AM
To: mesa-dev@lists.freedesktop.org
Cc: Charmaine Lee
Subject: [PATCH] svga: minor code improvements in svga_validate_pipe_sampler_view()

Use the 'texture' local var in more places.
Rename 'pFormat' to 'viewFormat'.
---
 src/gallium/drivers/svga/svga_state_sampler.c | 16
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_state_sampler.c b/src/gallium/drivers/svga/svga_state_sampler.c
index 53bb80f..445afcc 100644
--- a/src/gallium/drivers/svga/svga_state_sampler.c
+++ b/src/gallium/drivers/svga/svga_state_sampler.c
@@ -135,21 +135,21 @@ svga_validate_pipe_sampler_view(struct svga_context *svga,
       SVGA3dSurfaceFormat format;
       SVGA3dResourceType resourceDim;
       SVGA3dShaderResourceViewDesc viewDesc;
-      enum pipe_format pformat = sv->base.format;
+      enum pipe_format viewFormat = sv->base.format;
 
       /* vgpu10 cannot create a BGRX view for a BGRA resource, so force it to
        * create a BGRA view (and vice versa).
        */
-      if (pformat == PIPE_FORMAT_B8G8R8X8_UNORM &&
-          sv->base.texture->format == PIPE_FORMAT_B8G8R8A8_UNORM) {
-         pformat = PIPE_FORMAT_B8G8R8A8_UNORM;
+      if (viewFormat == PIPE_FORMAT_B8G8R8X8_UNORM &&
+          texture->format == PIPE_FORMAT_B8G8R8A8_UNORM) {
+         viewFormat = PIPE_FORMAT_B8G8R8A8_UNORM;
       }
-      else if (pformat == PIPE_FORMAT_B8G8R8A8_UNORM &&
-               sv->base.texture->format == PIPE_FORMAT_B8G8R8X8_UNORM) {
-         pformat = PIPE_FORMAT_B8G8R8X8_UNORM;
+      else if (viewFormat == PIPE_FORMAT_B8G8R8A8_UNORM &&
+               texture->format == PIPE_FORMAT_B8G8R8X8_UNORM) {
+         viewFormat = PIPE_FORMAT_B8G8R8X8_UNORM;
       }
 
-      format = svga_translate_format(ss, pformat,
+      format = svga_translate_format(ss, viewFormat,
                                      PIPE_BIND_SAMPLER_VIEW);
       assert(format != SVGA3D_FORMAT_INVALID);
 
--
1.9.1
Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code
On Tue, Oct 18, 2016 at 7:12 PM, Jan Ziak <0xe2.0x9a.0...@gmail.com> wrote:
>> Regarding C++ templates, the compiler doesn't use them. If u_vector
>> (Dave Airlie?) provides the same functionality as your array, I
>> suggest we use u_vector instead.
>
> Let me repeat what you just wrote, because it is unbelievable: You are
> advising the use of non-templated collection types in C++ code.

Absolutely.

>
>> If you can't use u_vector, you should
>> ask for approval from GLSL compiler leads (e.g. Ian Romanick or
>> Kenneth Graunke) to use C++ templates.
>
> - You are talking about coding rules some Mesa developers agreed upon
> and didn't bother writing down for other developers to read
>
> - I am not willing to use u_vector in C++ code
>
>> I'll repeat some stuff about profiling here but also explain my perspective.
>
> So far (which may be a year or so), there is no indication that you
> are better at optimizing code than me.

Good one.

>
>> Never profile with -O0 or disabled function inlining.
>
> Seriously?

Absolutely.

>
>> Mesa uses -g -O2
>> with --enable-debug, so that's what you should use too. Don't use any
>> other -O* variants.
>
> What if I find a case where -O2 prevents me from easily seeing
> information necessary to optimize the source code?

There are several ways to get useful data from optimized code (using the frame pointer, using dwarf, etc.). -O0 is too distorted.

>
>> The only profiling tools reporting correct results are perf and
>> sysprof.
>
> I used perf on Metro 2033 Redux and saw do_dead_code() there. Then I
> used callgrind to see some more code.

I recommend building Mesa with the frame pointer enabled, or enabling dwarf in perf. Otherwise you won't see call trees.

>
>> (both use the same mechanism) If you don't enable dwarf in
>> perf (also sysprof can't use dwarf), you have to build Mesa with
>> -fno-omit-frame-pointer to see call trees. The only reason you would
>> want to enable dwarf-based call trees is when you want to see libc
>> calls. Otherwise, they won't be displayed or counted as part of call
>> trees. For Mesa developers who do profiling often,
>> -fno-omit-frame-pointer should be your default.
>
>> Callgrind counts calls (that one you can trust), but the reported time
>> is incorrect,
>
> Are you nuts? You cannot seriously be assuming that I didn't know about
> that.
>
>> because it uses its own virtual model of a CPU. Avoid it
>> if you want to measure time spent in functions.
>
> I will *NOT* avoid callgrind because I know how to use it to optimize code.

I didn't suggest avoiding callgrind in all cases.

>
>> Marek
>
> As usual, I would like to notify reviewers of this patch that I
> am not willing to wait months to learn whether the code will be merged
> or rejected.
>
> If it isn't merged by Thursday (2016-oct-20) I will mark it as
> rejected (rejected based on personal rather than scientific grounds).

Relax. Things tend to move slowly when people are on conferences, vacations, or just busy with corporate stuff they have to deal with every day etc. and you can't predict those.

Marek
Re: [Mesa-dev] [PATCH] nv50/ir: silent TGSI_PROPERTY_FS_DEPTH_LAYOUT
Reviewed-by: Ilia Mirkin

This comes into play with Zcull, I think. But since we don't do Zcull yet, whatever. I had a patch to convert it into a layout(early_fragment_tests) effectively if the various settings matched, but ultimately it didn't seem worthwhile.

-ilia

On Tue, Oct 18, 2016 at 1:59 PM, Samuel Pitoiset wrote:
> Found that information message while replaying a trace from
> Metro 2033 Redux. Mark that property as useless for now.
>
> Signed-off-by: Samuel Pitoiset
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index db03281..0c98744 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -1093,6 +1093,7 @@ void Source::scanProperty(const struct
> tgsi_full_property *prop)
>        break;
>     case TGSI_PROPERTY_FS_COORD_ORIGIN:
>     case TGSI_PROPERTY_FS_COORD_PIXEL_CENTER:
> +   case TGSI_PROPERTY_FS_DEPTH_LAYOUT:
>        // we don't care
>        break;
>     case TGSI_PROPERTY_VS_PROHIBIT_UCPS:
> --
> 2.10.0
Re: [Mesa-dev] [PATCH] nv50/ir: Split 64-bit integer MAD/MUL operations
Hello Ian,

Since I am working on a direct SPIR-V to NV50 IR translator, ultimately to be used for OpenCL kernels, I will still need the patch for that work. (I even wrote that patch because I needed it when handling 64-bit addresses. :-) ) But thanks for the heads-up!

Pierre

On 02:07 pm - Oct 17 2016, Ian Romanick wrote: > I don't know if it will make this patch unnecessary, but I have a GLSL > IR-level lowering pass for 64-bit multiplication. I'm going to send > that out with the rest of the GL_ARB_gpu_shader_int64 series within the > next day or so. > > On 10/15/2016 03:24 PM, Pierre Moreau wrote: > > Hardware does not support 64-bit integer MAD and MUL operations, so we need > > to transform them into 32-bit operations. > > > > Signed-off-by: Pierre Moreau > > --- > > .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 121 > > + > > 1 file changed, 121 insertions(+) > > > > Tested with (the GPU result was compared to the CPU result): > > * 0xfff3lu * 0xfff2lu + 0x80070002lu > > * 0xfff3lu * 0x80070002lu + 0x80070002lu > > * 0x80010003lu * 0xfff2lu + 0x80070002lu > > * 0x80010003lu * 0x80070002lu + 0x80070002lu > > > > * -523456791234l * 929835793793l + -15793793l > > * 523456791234l * 929835793793l + -15793793l > > * -523456791234l * -929835793793l + -15793793l > > * 523456791234l * -929835793793l + -15793793l > > > > v2: > > * Completely re-write the patch, as it was completely flawed (Ilia Mirkin) > > * Move pass prior to Register Allocation, as some temporaries need to > > be created.
> > > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > > b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > > index d88bb34..a610eb5 100644 > > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > > @@ -2218,6 +2218,126 @@ LateAlgebraicOpt::visit(Instruction *i) > > > > // > > = > > > > +// Split 64-bit MUL and MAD > > +class Split64BitOpPreRA : public Pass > > +{ > > +private: > > + virtual bool visit(BasicBlock *); > > + void split64BitReg(Function *, Instruction *, Instruction *, > > + Instruction *, Value *, int); > > + void split64MulMad(Function *, Instruction *, DataType); > > + > > + BuildUtil bld; > > +}; > > + > > +bool > > +Split64BitOpPreRA::visit(BasicBlock *bb) > > +{ > > + Instruction *i, *next; > > + Modifier mod; > > + > > + for (i = bb->getEntry(); i; i = next) { > > + next = i->next; > > + > > + if (typeSizeof(i->dType) != 8) > > + continue; > > + > > + DataType hTy; > > + switch (i->dType) { > > + case TYPE_U64: hTy = TYPE_U32; break; > > + case TYPE_S64: hTy = TYPE_S32; break; > > + default: > > + continue; > > + } > > + > > + if (i->op == OP_MAD || i->op == OP_MUL) > > + split64MulMad(bb->getFunction(), i, hTy); > > + } > > + > > + return true; > > +} > > + > > +void > > +Split64BitOpPreRA::split64MulMad(Function *fn, Instruction *i, DataType > > hTy) > > +{ > > + assert(i->op == OP_MAD || i->op == OP_MUL); > > + if (isFloatType(i->dType) || isFloatType(i->sType)) > > + return; > > + > > + bld.setPosition(i, true); > > + > > + Value *zero = bld.mkImm(0u); > > + Value *carry = bld.getSSA(1, FILE_FLAGS); > > + > > + // We want to compute `d = a * b (+ c)?`, where a, b, c and d are 64-bit > > + // values (a, b and c might be 32-bit values), using 32-bit operations. 
> > This > > + // gives the following operations: > > + // * `d.low = low(a.low * b.low) (+ c.low)?` > > + // * `d.high = low(a.high * b.low) + low(a.low * b.high) > > + // + high(a.low * b.low) (+ c.high)?` > > + // > > + // To compute the high bits, we can split in the following operations: > > + // * `tmp1 = low(a.high * b.low) (+ c.high)?` > > + // * `tmp2 = low(a.low * b.high) + tmp1` > > + // * `d.high = high(a.low * b.low) + tmp2` > > + // > > + // mkSplit put lower bits at index 0 and higher bits at index 1 > > + > > + Value *op1[2]; > > + if (i->getSrc(0)->reg.size == 8) > > + bld.mkSplit(op1, typeSizeof(hTy), i->getSrc(0)); > > + else { > > + op1[0] = i->getSrc(0); > > + op1[1] = zero; > > + } > > + Value *op2[2]; > > + if (i->getSrc(1)->reg.size == 8) > > + bld.mkSplit(op2, typeSizeof(hTy), i->getSrc(1)); > > + else { > > + op2[0] = i->getSrc(1); > > + op2[1] = zero; > > + } > > + > > + Value *op3[2] = { NULL, NULL }; > > + if (i->op == OP_MAD) { > > + if (i->getSrc(2)->reg.size == 8) > > + bld.mkSplit(op3, typeSizeof(hTy), i->getSrc(2)); > > + else { > > + op3[0] = i->getSrc(2); > > + op3[1] = zero; > > +
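For readers following the comment block in the quoted patch, the same decomposition can be sketched in plain C. This is only an illustration under stated assumptions, not the nv50 IR pass: the name mad64_via_32 is made up for this example, and the carry out of the low-word addition is written out explicitly because the quoted comment leaves it implicit (the patch allocates a `carry` flag value, presumably for this purpose).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes a * b + c modulo 2^64 while only using
 * 32-bit adds, 32-bit low multiplies and one 32x32 -> 64 multiply (which
 * stands in for a mul-low / mul-high pair). */
static uint64_t mad64_via_32(uint64_t a, uint64_t b, uint64_t c)
{
   uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
   uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);
   uint32_t c_lo = (uint32_t)c, c_hi = (uint32_t)(c >> 32);

   /* low(a.low * b.low) and high(a.low * b.low) */
   uint64_t lo_prod = (uint64_t)a_lo * b_lo;
   uint32_t mul_lo = (uint32_t)lo_prod;
   uint32_t mul_hi = (uint32_t)(lo_prod >> 32);

   /* d.low = low(a.low * b.low) + c.low, remembering the carry out */
   uint32_t d_lo = mul_lo + c_lo;
   uint32_t carry = d_lo < mul_lo;

   /* tmp1 = low(a.high * b.low) + c.high */
   uint32_t tmp1 = a_hi * b_lo + c_hi;
   /* tmp2 = low(a.low * b.high) + tmp1 */
   uint32_t tmp2 = a_lo * b_hi + tmp1;
   /* d.high = high(a.low * b.low) + tmp2 (+ carry from the low word) */
   uint32_t d_hi = mul_hi + tmp2 + carry;

   return ((uint64_t)d_hi << 32) | d_lo;
}
```

Because two's complement multiplication and addition behave the same for signed and unsigned operands modulo 2^64, the same routine covers the signed test cases once the operands are cast to uint64_t.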
[Mesa-dev] [PATCH] nv50/ir: silent TGSI_PROPERTY_FS_DEPTH_LAYOUT
Found that information message while replaying a trace from Metro 2033 Redux. Mark that property as useless for now.

Signed-off-by: Samuel Pitoiset
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index db03281..0c98744 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -1093,6 +1093,7 @@ void Source::scanProperty(const struct tgsi_full_property *prop)
       break;
    case TGSI_PROPERTY_FS_COORD_ORIGIN:
    case TGSI_PROPERTY_FS_COORD_PIXEL_CENTER:
+   case TGSI_PROPERTY_FS_DEPTH_LAYOUT:
       // we don't care
       break;
    case TGSI_PROPERTY_VS_PROHIBIT_UCPS:
-- 
2.10.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: disable alpha-test, alpha-to-coverage, alpha-to-one for integer FBs
Reviewed-by: Brian PaulOn 10/18/2016 11:48 AM, Marek Olšák wrote: From: Marek Olšák v2: rebased --- src/mesa/state_tracker/st_atom_blend.c | 3 ++- src/mesa/state_tracker/st_atom_depth.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/state_tracker/st_atom_blend.c b/src/mesa/state_tracker/st_atom_blend.c index 76d6a644..b8d65bd 100644 --- a/src/mesa/state_tracker/st_atom_blend.c +++ b/src/mesa/state_tracker/st_atom_blend.c @@ -259,21 +259,22 @@ update_blend( struct st_context *st ) blend->rt[i].colormask |= PIPE_MASK_G; if (ctx->Color.ColorMask[i][2]) blend->rt[i].colormask |= PIPE_MASK_B; if (ctx->Color.ColorMask[i][3]) blend->rt[i].colormask |= PIPE_MASK_A; } blend->dither = ctx->Color.DitherFlag; if (ctx->Multisample.Enabled && - ctx->DrawBuffer->Visual.sampleBuffers > 0) { + ctx->DrawBuffer->Visual.sampleBuffers > 0 && + !(ctx->DrawBuffer->_IntegerBuffers & 0x1)) { /* Unlike in gallium/d3d10 these operations are only performed * if both msaa is enabled and we have a multisample buffer. */ blend->alpha_to_coverage = ctx->Multisample.SampleAlphaToCoverage; blend->alpha_to_one = ctx->Multisample.SampleAlphaToOne; } cso_set_blend(st->cso_context, blend); { diff --git a/src/mesa/state_tracker/st_atom_depth.c b/src/mesa/state_tracker/st_atom_depth.c index 267b42c..7092c3f 100644 --- a/src/mesa/state_tracker/st_atom_depth.c +++ b/src/mesa/state_tracker/st_atom_depth.c @@ -142,21 +142,22 @@ update_depth_stencil_alpha(struct st_context *st) else { /* This should be unnecessary. 
Drivers must not expect this to * contain valid data, except the enabled bit */ dsa->stencil[1] = dsa->stencil[0]; dsa->stencil[1].enabled = 0; sr.ref_value[1] = sr.ref_value[0]; } } - if (ctx->Color.AlphaEnabled) { + if (ctx->Color.AlphaEnabled && + !(ctx->DrawBuffer->_IntegerBuffers & 0x1)) { dsa->alpha.enabled = 1; dsa->alpha.func = st_compare_func_to_pipe(ctx->Color.AlphaFunc); dsa->alpha.ref_value = ctx->Color.AlphaRefUnclamped; } cso_set_depth_stencil_alpha(st->cso_context, dsa); cso_set_stencil_ref(st->cso_context, ); } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] st/mesa: disable alpha-test, alpha-to-coverage, alpha-to-one for integer FBs
From: Marek Olšák

v2: rebased
---
 src/mesa/state_tracker/st_atom_blend.c | 3 ++-
 src/mesa/state_tracker/st_atom_depth.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_blend.c b/src/mesa/state_tracker/st_atom_blend.c
index 76d6a644..b8d65bd 100644
--- a/src/mesa/state_tracker/st_atom_blend.c
+++ b/src/mesa/state_tracker/st_atom_blend.c
@@ -259,21 +259,22 @@ update_blend( struct st_context *st )
             blend->rt[i].colormask |= PIPE_MASK_G;
          if (ctx->Color.ColorMask[i][2])
             blend->rt[i].colormask |= PIPE_MASK_B;
          if (ctx->Color.ColorMask[i][3])
             blend->rt[i].colormask |= PIPE_MASK_A;
       }

    blend->dither = ctx->Color.DitherFlag;

    if (ctx->Multisample.Enabled &&
-       ctx->DrawBuffer->Visual.sampleBuffers > 0) {
+       ctx->DrawBuffer->Visual.sampleBuffers > 0 &&
+       !(ctx->DrawBuffer->_IntegerBuffers & 0x1)) {
       /* Unlike in gallium/d3d10 these operations are only performed
        * if both msaa is enabled and we have a multisample buffer.
        */
       blend->alpha_to_coverage = ctx->Multisample.SampleAlphaToCoverage;
       blend->alpha_to_one = ctx->Multisample.SampleAlphaToOne;
    }

    cso_set_blend(st->cso_context, blend);

    {
diff --git a/src/mesa/state_tracker/st_atom_depth.c b/src/mesa/state_tracker/st_atom_depth.c
index 267b42c..7092c3f 100644
--- a/src/mesa/state_tracker/st_atom_depth.c
+++ b/src/mesa/state_tracker/st_atom_depth.c
@@ -142,21 +142,22 @@ update_depth_stencil_alpha(struct st_context *st)
       else {
          /* This should be unnecessary. Drivers must not expect this to
           * contain valid data, except the enabled bit
           */
          dsa->stencil[1] = dsa->stencil[0];
          dsa->stencil[1].enabled = 0;
          sr.ref_value[1] = sr.ref_value[0];
       }
    }

-   if (ctx->Color.AlphaEnabled) {
+   if (ctx->Color.AlphaEnabled &&
+       !(ctx->DrawBuffer->_IntegerBuffers & 0x1)) {
       dsa->alpha.enabled = 1;
       dsa->alpha.func = st_compare_func_to_pipe(ctx->Color.AlphaFunc);
       dsa->alpha.ref_value = ctx->Color.AlphaRefUnclamped;
    }

    cso_set_depth_stencil_alpha(st->cso_context, dsa);
    cso_set_stencil_ref(st->cso_context, &sr);
 }
-- 
2.7.4
Re: [Mesa-dev] [PATCH] egl/surfaceless: Fix segfault in eglSwapBuffers
On Tue, Oct 18, 2016 at 9:43 AM, Chad Versace wrote:
> Since commit 63c5d5c6c46c8472ee7a8241a0f80f13d79cb8cd, the surfaceless
> platform has allowed creation of pbuffer surfaces. But the vtable entry
> for eglSwapBuffers has remained NULL.
>
> Discovered by running a little pbuffer test.
>
> Cc: Gurchetan Singh
> ---
>  src/egl/drivers/dri2/platform_surfaceless.c | 12
>  1 file changed, 12 insertions(+)
>
> diff --git a/src/egl/drivers/dri2/platform_surfaceless.c
> b/src/egl/drivers/dri2/platform_surfaceless.c
> index fcf7d69..a55c5f1 100644
> --- a/src/egl/drivers/dri2/platform_surfaceless.c
> +++ b/src/egl/drivers/dri2/platform_surfaceless.c
> @@ -178,6 +178,17 @@ dri2_surfaceless_create_pbuffer_surface(_EGLDriver *drv,
> _EGLDisplay *disp,
>  }
>
>  static EGLBoolean
> +surfaceless_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface
> *surf)
> +{
> +   assert(!surf || surf->Type == EGL_PBUFFER_BIT);
> +
> +   /* From the EGL 1.5 spec:
> +    *    If surface is a [...] pbuffer surface, eglSwapBuffers has no effect.
> +    */
> +   return EGL_TRUE;
> +}
> +
> +static EGLBoolean
>  surfaceless_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay *dpy)
>  {
>     struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
> @@ -223,6 +234,7 @@ static struct dri2_egl_display_vtbl
> dri2_surfaceless_display_vtbl = {
>     .destroy_surface = surfaceless_destroy_surface,
>     .create_image = dri2_create_image_khr,
>     .swap_interval = dri2_fallback_swap_interval,
> +   .swap_buffers = surfaceless_swap_buffers,
>     .swap_buffers_with_damage = dri2_fallback_swap_buffers_with_damage,
>     .swap_buffers_region = dri2_fallback_swap_buffers_region,
>     .post_sub_buffer = dri2_fallback_post_sub_buffer,
> --
> 2.10.0

Reviewed-by: Anuj Phogat
Re: [Mesa-dev] [PATCH 07/25] mesa/i965: eliminate gl_tess_ctrl_program and use new shared shader_info
On Tuesday, October 18, 2016 8:28:02 AM PDT Jason Ekstrand wrote:
> On Tue, Oct 18, 2016 at 8:14 AM, Jason Ekstrand wrote:
>
> > I want to make a few comments on how this series is structured. This is
> > not the way I would have done it and I think the way you structured it
> > makes it substantially less rebasable than it could be and a bit harder to
> > review. The way *I* would have done this would be something like the
> > following:
> >
> > 1) Move shader_info to common code (patches 1-2)
> > 2) Add a shader_info pointer to gl_program (patch 6), break the fill
> > shader_info stuff from glsl_to_nir into its own function, and call it from
> > somewhere such that it always gets filled out.
> > 3) Add new fields to shader_info *and* make sure they get filled out from
> > other GLSL information
> > 4) Convert i965 over to the new shader_info
> > 5) Convert gallium over to the new shader_info
> > 6) Make GLSL fill out shader_info directly and nuke the old shader
> > metadata.
> > 7) Delete the shader_info fill-out function.
> >
> > Something along these lines would go a long way towards avoiding the "mega
> > patch" problem where each patch touches 4 or 5 different components. It
> > also makes it clearer to review because you don't add fields and then the
> > reviewer goes "Wait, where does this get set? Oh, in another patch". I'm
> > not necessarily saying that you have to go back and change your patches.
> > It's more a suggestion for if you end up doing a v3 or another refactor
> > along these lines in the future.
> >
>
> On the review side, splitting out as I described above would make it much
> easier to review since it would be more-or-less one type of refactor per
> patch. In this patch, we have several different kinds of refactors:
>
> 1) Move consumers over to reading shader_info
> 2) Remove gl_tess_ctrl_program and related refactors
> 3) Move producer over to writing shader_info
>
> Normally, when reviewing, I would just skim (2) and give (1) and (3) more
> effort. Having them mixed together means I have to pay constant attention
> to what's going on. Also, having (2) mixed in makes it harder to verify
> (3) because there's a lot of code motion only some of which matters.

I agree with Jason. This could be structured a lot more cleanly, and it would make it much easier to review.

For example, patches 3-5 add a bunch of new structure fields. But they aren't populated by anything. The CS local_size_variable field finally gets populated in patch 12 (a whole 7 patches later!)...and by the end of the series...I don't see a single consumer of that field. So, the field is useless. But I had to use 'git log -p' on a branch and search through your entire series to determine that. There's far too much context to keep in my head while reading, and it means I have to abandon my usual read-emails-mostly-in-order review process.

I actually added TES shader info a little while back (but hadn't sent them out yet as Vulkan tessellation isn't quite ready yet). Here's what my patches looked like:

1) Convert spacing from GLenums to a TESS_SPACING_* enum
   https://cgit.freedesktop.org/~kwg/mesa/commit/?h=vktess&id=8b49a8485dd37eb405efcaaecd55244a8f63f213
   (simple cleanup I did across the whole codebase)

2) Introduce nir_shader_info fields and populate them in glsl_to_nir and spirv_to_nir.
   https://cgit.freedesktop.org/~kwg/mesa/commit/?h=vktess&id=3142efa913965324ad21c3cefc792ab83e1a1390
   (fields are at least populated in all frontends, but may be useless)

3) Convert i965 over to use nir_shader_info for fields
   https://cgit.freedesktop.org/~kwg/mesa/commit/?h=vktess&id=a518388acc7a6db88c7e21829e7a15b15b9304ad
   (now the fields are used. admittedly I did some bonus code motion in this patch...if I'm being pedantic, I should have made that a fourth patch to make the prog_data fields be populated in Vulkan paths)

The first patch stands alone, and patches 2-3 stand together. All are very small. You need no additional context to answer questions, and can say "those look good" and move on rather quickly.

With that in mind, I'd like to ask you to please try and rework this series along the lines that Jason suggested. I know it's a bunch of work, but being disciplined in how we organize our code is a really useful skill that pays off in the long run. When reviewers can look at your code and quickly give a thumbs up, you get to land your patches a lot more quickly, and that extra effort ultimately saves you (and others) a whole lot of time.

Sorry, Tim :(

--Ken
Re: [Mesa-dev] [PATCH 07/22] anv/wsi/x11: abstract WSI interface from internals.
On 17 October 2016 at 05:24, Dave Airlie wrote:
> From: Dave Airlie
>
> This allows the API and the internals to be split, and the
> internals shared.
> ---
>  src/intel/vulkan/anv_wsi_x11.c | 33 -
>  1 file changed, 24 insertions(+), 9 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_wsi_x11.c b/src/intel/vulkan/anv_wsi_x11.c
> index ccaabea..6eb06c3 100644
> --- a/src/intel/vulkan/anv_wsi_x11.c
> +++ b/src/intel/vulkan/anv_wsi_x11.c
> @@ -233,16 +233,15 @@ visual_has_alpha(xcb_visualtype_t *visual, unsigned
> depth)
>     return (all_mask & ~rgb_mask) != 0;
>  }
>
> -VkBool32 anv_GetPhysicalDeviceXcbPresentationSupportKHR(
> -    VkPhysicalDevice physicalDevice,
> +static VkBool32 anv_get_physical_device_xcb_presentation_support(
> +    struct anv_wsi_device *wsi_device,
> +    VkAllocationCallbacks *alloc,

Nit: indentation (here and below) seems off.

-Emil
Re: [Mesa-dev] [PATCH 07/25] mesa/i965: eliminate gl_tess_ctrl_program and use new shared shader_info
On Tue, Oct 18, 2016 at 8:28 AM, Jason Ekstrand wrote:

> On Tue, Oct 18, 2016 at 8:14 AM, Jason Ekstrand wrote:
>
>> I want to make a few comments on how this series is structured. This is
>> not the way I would have done it and I think the way you structured it
>> makes it substantially less rebasable than it could be and a bit harder to
>> review. The way *I* would have done this would be something like the
>> following:
>>
>> 1) Move shader_info to common code (patches 1-2)
>> 2) Add a shader_info pointer to gl_program (patch 6), break the fill
>> shader_info stuff from glsl_to_nir into its own function, and call it from
>> somewhere such that it always gets filled out.
>> 3) Add new fields to shader_info *and* make sure they get filled out
>> from other GLSL information
>> 4) Convert i965 over to the new shader_info
>> 5) Convert gallium over to the new shader_info
>> 6) Make GLSL fill out shader_info directly and nuke the old shader
>> metadata.
>> 7) Delete the shader_info fill-out function.
>>
>

Oh, and one more step:

8) Refactor to get rid of all of the gl_foo_program stuff. (Maybe multiple patches?)

>
>> Something along these lines would go a long way towards avoiding the
>> "mega patch" problem where each patch touches 4 or 5 different components.
>> It also makes it clearer to review because you don't add fields and then
>> the reviewer goes "Wait, where does this get set? Oh, in another patch".
>> I'm not necessarily saying that you have to go back and change your
>> patches. It's more a suggestion for if you end up doing a v3 or another
>> refactor along these lines in the future.
>>
>
> On the review side, splitting out as I described above would make it much
> easier to review since it would be more-or-less one type of refactor per
> patch. In this patch, we have several different kinds of refactors:
>
> 1) Move consumers over to reading shader_info
> 2) Remove gl_tess_ctrl_program and related refactors
> 3) Move producer over to writing shader_info
>
> Normally, when reviewing, I would just skim (2) and give (1) and (3) more
> effort. Having them mixed together means I have to pay constant attention
> to what's going on. Also, having (2) mixed in makes it harder to verify
> (3) because there's a lot of code motion only some of which matters.
>
>
>>
>>
>> On Mon, Oct 17, 2016 at 11:12 PM, Timothy Arceri <
>> timothy.arc...@collabora.com> wrote:
>>
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_context.h | 6 ++---
>>>  src/mesa/drivers/dri/i965/brw_draw.c | 2 +-
>>>  src/mesa/drivers/dri/i965/brw_program.c | 2 +-
>>>  src/mesa/drivers/dri/i965/brw_tcs.c | 32 ++-
>>>  src/mesa/drivers/dri/i965/brw_tcs_surface_state.c | 2 +-
>>>  src/mesa/drivers/dri/i965/brw_tes.c | 20 +++---
>>>  src/mesa/drivers/dri/i965/gen7_hs_state.c | 4 +--
>>>  src/mesa/main/context.c | 2 +-
>>>  src/mesa/main/mtypes.h | 12 +
>>>  src/mesa/main/shaderapi.c | 4 +--
>>>  src/mesa/main/state.c | 11
>>>  src/mesa/program/prog_statevars.c | 2 +-
>>>  src/mesa/program/program.c | 4 +--
>>>  src/mesa/program/program.h | 23
>>>  src/mesa/state_tracker/st_atom.c | 2 +-
>>>  src/mesa/state_tracker/st_atom_constbuf.c | 2 +-
>>>  src/mesa/state_tracker/st_atom_sampler.c | 2 +-
>>>  src/mesa/state_tracker/st_atom_shader.c | 2 +-
>>>  src/mesa/state_tracker/st_atom_texture.c | 2 +-
>>>  src/mesa/state_tracker/st_cb_program.c | 10 +++
>>>  src/mesa/state_tracker/st_program.c | 6 ++---
>>>  src/mesa/state_tracker/st_program.h | 6 ++---
>>>  22 files changed, 58 insertions(+), 100 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_context.h
>>> b/src/mesa/drivers/dri/i965/brw_context.h
>>> index c92bb9f..9b7e184 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_context.h
>>> +++ b/src/mesa/drivers/dri/i965/brw_context.h
>>> @@ -337,7 +337,7 @@ struct brw_vertex_program {
>>>
>>>  /** Subclass of Mesa tessellation control program */
>>>  struct brw_tess_ctrl_program {
>>> -   struct gl_tess_ctrl_program program;
>>> +   struct gl_program program;
>>>     unsigned id; /**< serial no. to identify tess ctrl progs, never
>>> re-used */
>>>  };
>>>
>>> @@ -1008,7 +1008,7 @@ struct brw_context
>>>      */
>>>     const struct gl_vertex_program *vertex_program;
>>>     const struct gl_geometry_program *geometry_program;
>>> -   const struct gl_tess_ctrl_program *tess_ctrl_program;
>>> +   const struct gl_program *tess_ctrl_program;
>>>     const struct gl_tess_eval_program *tess_eval_program;
>>>     const struct gl_fragment_program *fragment_program;
>>>
Re: [Mesa-dev] [PATCH 01/22] radv/anv/wsi: drop uneeded parameter
Typo in the summary - s/uneeded/unneeded/

-Emil
Re: [Mesa-dev] [PATCH 11/11] anv: drop pointless struct decl.
On 17 October 2016 at 03:07, Dave Airlie wrote:
> From: Dave Airlie
>
> Signed-off-by: Dave Airlie

Seems like a typo from the development stage - anv_wsi_inter_a_face

10 and 11 are independent so feel free to land whenever possible.

Reviewed-by: Emil Velikov

-Emil
Re: [Mesa-dev] [PATCH 3/3] [Bug 38970] [bisected]piglit glx/glx-pixmap-multi failed
On 09/29/2016 01:55 PM, Anutex wrote:
> I tried to debug this issue with changing the condition to check only bad
> magic and Error.
> And the test passed.
>
> Though i am not sure what is the correct behaviour if we are in this
> condition.
> May be we should make some other condition if the Hash Table have the bucket
> data.
> ---
>  src/glx/dri2_glx.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
> index af388d9..a1fd9ff 100644
> --- a/src/glx/dri2_glx.c
> +++ b/src/glx/dri2_glx.c
> @@ -411,12 +411,13 @@ dri2CreateDrawable(struct glx_screen *base, XID
> xDrawable,
>        return NULL;
>     }
>
> -   if (__glxHashInsert(pdp->dri2Hash, xDrawable, pdraw)) {
> +   if (__glxHashInsert(pdp->dri2Hash, xDrawable, pdraw) == -1) {

I'm not 100% sure the existing code is wrong. __glxHashInsert returns -1 for an error, and it returns 1 if the key is already in the hash table. In that case we'll leak the memory for the new pdraw, right? That also seems bad.

It seems like instead the code should look up xDrawable in the hash table and return the value that's already there. Maybe. I haven't looked at this code in years, so I may be forgetting some subtlety.

>        (*psc->core->destroyDrawable) (pdraw->driDrawable);
>        DRI2DestroyDrawable(psc->base.dpy, xDrawable);
>        free(pdraw);
>        return None;
>     }
> +
>

Spurious whitespace change.

>     /*
>      * Make sure server has the same swap interval we do for the new
>
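The "look up first, then insert" idea from the review comment above can be sketched with a toy table. Everything here is hypothetical and made up for illustration (toy_hash, toy_hash_insert, get_or_create_drawable); only the return convention (-1 for an error, 1 for a key that is already present) follows what the reply says about __glxHashInsert. It is not the GLX code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_SLOTS 8

/* Hypothetical fixed-size table, only meant to show the control flow. */
struct toy_hash {
   unsigned long keys[TOY_SLOTS];
   void *values[TOY_SLOTS];
   int used[TOY_SLOTS];
};

static void *toy_hash_lookup(const struct toy_hash *t, unsigned long key)
{
   for (int i = 0; i < TOY_SLOTS; i++)
      if (t->used[i] && t->keys[i] == key)
         return t->values[i];
   return NULL;
}

/* Returns 0 on success, 1 if the key already exists, -1 if the table is full. */
static int toy_hash_insert(struct toy_hash *t, unsigned long key, void *value)
{
   if (toy_hash_lookup(t, key))
      return 1;
   for (int i = 0; i < TOY_SLOTS; i++) {
      if (!t->used[i]) {
         t->used[i] = 1;
         t->keys[i] = key;
         t->values[i] = value;
         return 0;
      }
   }
   return -1;
}

struct toy_drawable {
   unsigned long xid;
};

/* Reuse the existing entry when there is one, so a second request for the
 * same id neither leaks a fresh allocation nor destroys the old object. */
static struct toy_drawable *
get_or_create_drawable(struct toy_hash *t, unsigned long xid)
{
   struct toy_drawable *existing = toy_hash_lookup(t, xid);
   if (existing)
      return existing;

   struct toy_drawable *d = calloc(1, sizeof(*d));
   if (!d)
      return NULL;
   d->xid = xid;

   if (toy_hash_insert(t, xid, d) != 0) {
      free(d);
      return NULL;
   }
   return d;
}
```

Whether returning the existing entry is the right behaviour for dri2CreateDrawable is exactly the open question in the reply; the sketch only shows that the leak and the spurious failure can both be avoided.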
Re: [Mesa-dev] [PATCH 08/11] anv: move to using vk_alloc helpers.
Hi Dave,

On 17 October 2016 at 03:07, Dave Airlie wrote:
> From: Dave Airlie
>
> This moves all the alloc/free in anv to the generic helpers.
>
> Signed-off-by: Dave Airlie
> ---
>  src/intel/vulkan/anv_batch_chain.c | 40 +++---
>  src/intel/vulkan/anv_cmd_buffer.c | 22 -
>  src/intel/vulkan/anv_descriptor_set.c | 12 -
>  src/intel/vulkan/anv_device.c | 26 ++--
>  src/intel/vulkan/anv_image.c | 14 +--
>  src/intel/vulkan/anv_intel.c | 4 +--
>  src/intel/vulkan/anv_pass.c | 10
>  src/intel/vulkan/anv_pipeline.c | 6 ++---
>  src/intel/vulkan/anv_pipeline_cache.c | 8 +++---
>  src/intel/vulkan/anv_private.h | 46 +--
>  src/intel/vulkan/anv_query.c | 6 ++---
>  src/intel/vulkan/anv_wsi.c | 2 +-
>  src/intel/vulkan/anv_wsi_wayland.c | 16 ++--
>  src/intel/vulkan/anv_wsi_x11.c | 22 -
>  src/intel/vulkan/gen7_pipeline.c | 4 +--
>  src/intel/vulkan/gen8_pipeline.c | 4 +--
>  src/intel/vulkan/genX_pipeline.c | 6 ++---
>  src/intel/vulkan/genX_state.c | 2 +-
>  18 files changed, 103 insertions(+), 147 deletions(-)
>

Wondering whether one shouldn't include the new header only where needed? A quick grep shows 33 files which include anv_private.h, of which (as per above) only ~half need vk_alloc.h.

Just an idea.

Emil
Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code
> Regarding C++ templates, the compiler doesn't use them. If u_vector
> (Dave Airlie?) provides the same functionality as your array, I
> suggest we use u_vector instead.

Let me repeat what you just wrote, because it is unbelievable: You are advising the use of non-templated collection types in C++ code.

> If you can't use u_vector, you should
> ask for approval from GLSL compiler leads (e.g. Ian Romanick or
> Kenneth Graunke) to use C++ templates.

- You are talking about coding rules some Mesa developers agreed upon and didn't bother writing down for other developers to read

- I am not willing to use u_vector in C++ code

> I'll repeat some stuff about profiling here but also explain my perspective.

So far (which may be a year or so), there is no indication that you are better at optimizing code than me.

> Never profile with -O0 or disabled function inlining.

Seriously?

> Mesa uses -g -O2
> with --enable-debug, so that's what you should use too. Don't use any
> other -O* variants.

What if I find a case where -O2 prevents me from easily seeing information necessary to optimize the source code?

> The only profiling tools reporting correct results are perf and
> sysprof.

I used perf on Metro 2033 Redux and saw do_dead_code() there.

Then I used callgrind to see some more code.

> (both use the same mechanism) If you don't enable dwarf in
> perf (also sysprof can't use dwarf), you have to build Mesa with
> -fno-omit-frame-pointer to see call trees. The only reason you would
> want to enable dwarf-based call trees is when you want to see libc
> calls. Otherwise, they won't be displayed or counted as part of call
> trees. For Mesa developers who do profiling often,
> -fno-omit-frame-pointer should be your default.

> Callgrind counts calls (that one you can trust), but the reported time
> is incorrect,

Are you nuts? You cannot seriously be assuming that I didn't know about that.

> because it uses its own virtual model of a CPU. Avoid it
> if you want to measure time spent in functions.

I will *NOT* avoid callgrind because I know how to use it to optimize code.

> Marek

As usual, I would like to notify reviewers of this patch that I am not willing to wait months to learn whether the code will be merged or rejected.

If it isn't merged by Thursday (2016-oct-20) I will mark it as rejected (rejected based on personal rather than scientific grounds).

Jan
Re: [Mesa-dev] [PATCH 2/2] st/glsl_to_tgsi: fix block copies of arrays of structs
For the series:

Reviewed-by: Marek Olšák

Marek

On Mon, Oct 17, 2016 at 7:25 PM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> Use a full writemask in this case. This is relevant e.g. when a function
> has an inout argument which is an array of structs.
> ---
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index 1662f7f..b91ebaf 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -2964,24 +2964,26 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir)
>
>        if (variable->data.location == FRAG_RESULT_DEPTH)
>           l.writemask = WRITEMASK_Z;
>        else {
>           assert(variable->data.location == FRAG_RESULT_STENCIL);
>           l.writemask = WRITEMASK_Y;
>        }
>     } else if (ir->write_mask == 0) {
>        assert(!ir->lhs->type->is_scalar() && !ir->lhs->type->is_vector());
>
> -      if (ir->lhs->type->is_array() || ir->lhs->type->is_matrix()) {
> -         unsigned num_elements =
> ir->lhs->type->without_array()->vector_elements;
> +      unsigned num_elements =
> ir->lhs->type->without_array()->vector_elements;
> +
> +      if (num_elements) {
>           l.writemask = u_bit_consecutive(0, num_elements);
>        } else {
> +         // The type is a struct or an array of (array of) structs.
>           l.writemask = WRITEMASK_XYZW;
>        }
>     } else {
>        l.writemask = ir->write_mask;
>     }
>
>     for (int i = 0; i < 4; i++) {
>        if (l.writemask & (1 << i)) {
>           first_enabled_chan = GET_SWZ(r.swizzle, i);
>           break;
> --
> 2.7.4
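The writemask selection in the quoted hunk can be tried out in isolation. The sketch below is an assumption-laden stand-in, not Mesa code: bit_consecutive is a local helper written to behave like u_bit_consecutive as it is used in the patch, block_copy_writemask is a made-up name, and the vector_elements argument is assumed to be zero for struct types, which is what the patch's `if (num_elements)` test relies on.

```c
#include <assert.h>

#define WRITEMASK_XYZW 0xfu

/* Local stand-in: a mask with `count` consecutive bits set, starting at
 * bit `start`. */
static unsigned bit_consecutive(unsigned start, unsigned count)
{
   assert(start + count <= 32);
   if (count == 32)
      return ~0u;
   return ((1u << count) - 1u) << start;
}

/* Hypothetical helper mirroring the patched branch: vectors and matrices
 * (and arrays of them) write only their own components, anything without
 * vector elements, such as a struct, gets the full XYZW mask. */
static unsigned block_copy_writemask(unsigned vector_elements)
{
   if (vector_elements)
      return bit_consecutive(0, vector_elements);
   return WRITEMASK_XYZW;
}
```

With this, a vec3 element type yields 0x7 and a struct element type yields 0xf, which matches the commit message's "use a full writemask in this case".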
Re: [Mesa-dev] [PATCH v2 05/16] loader: reimplement loader_get_user_preferred_fd via libdrm
On 18.10.2016 18:01, Emil Velikov wrote:
On 18 October 2016 at 09:49, Nicolai Hähnle wrote:
On 14.10.2016 20:21, Emil Velikov wrote:
From: Emil Velikov

Currently not everyone has libudev and with follow-up patches we'll completely remove the divergent codepaths.

Use the libdrm drm device API to construct the required ID_PATH_TAG-like string, to preserve the current functionality for libudev users and allow others to benefit from it as well.

v2: Drop ranty comments, pick the correct device

Cc: Axel Davy
Signed-off-by: Emil Velikov
---
 src/loader/loader.c | 247 ++--
 1 file changed, 106 insertions(+), 141 deletions(-)

diff --git a/src/loader/loader.c b/src/loader/loader.c
index ad4f946..06df05b 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c

[snip]

@@ -321,17 +232,60 @@ static char *loader_get_dri_config_device_id(void)
 }
 #endif

+static char *drm_construct_id_path_tag(drmDevicePtr device)
+{
+/* Length of "pci-_xx_xx_x\n" */
+#define PCI_ID_PATH_TAG_LENGTH 17
+   char *tag = NULL;
+
+   if (device->bustype == DRM_BUS_PCI) {
+        tag = calloc(PCI_ID_PATH_TAG_LENGTH, sizeof(char));
+        if (tag == NULL)
+            return NULL;
+
+        sprintf(tag, "pci-%04x_%02x_%02x_%1u", device->businfo.pci->domain,
+                device->businfo.pci->bus, device->businfo.pci->dev,
+                device->businfo.pci->func);

Defensive programming would suggest to use snprintf.

Correct. It's more like extra defensive in this case but will fix.

Thanks :)

[snip]

@@ -345,55 +299,66 @@ int loader_get_user_preferred_fd(int default_fd, int *different_device)
       return default_fd;
    }

-   udev = udev_new();
-   if (!udev)
-      goto prime_clean;
+   default_tag = drm_get_id_path_tag_for_fd(default_fd);
+   if (default_tag == NULL)
+      goto err;

-   default_device_id_path_tag = get_id_path_tag_from_fd(udev, default_fd);
-   if (!default_device_id_path_tag)
-      goto udev_clean;
+   num_devices = drmGetDevices(devices, MAX_DRM_DEVICES);
+   if (num_devices < 0)
+      goto err;

-   is_different_device = 1;
    /* two format are supported:
     * "1": choose any other card than the card used by default.
     * id_path_tag: (for example "pci-_02_00_0") choose the card
     * with this id_path_tag.
     */
    if (!strcmp(prime,"1")) {
-      free(prime);
-      prime = strdup(default_device_id_path_tag);
-      /* request a card with a different card than the default card */
-      another_tag = 1;
-   } else if (!strcmp(default_device_id_path_tag, prime))
-      /* we are to get a new fd (render-node) of the same device */
-      is_different_device = 0;
-
-   device_name = get_render_node_from_id_path_tag(udev,
-                                                  prime,
-                                                  another_tag);
-   if (device_name == NULL) {
-      is_different_device = 0;
-      goto default_device_clean;
+      /* Hmm... detection for 2-7 seems to be broken. Oh well ...
+       * Pick the first render device that is not our own.
+       */
+      for (i = 0; i < num_devices; i++) {
+         if (devices[i]->available_nodes & 1 << DRM_NODE_RENDER &&
+             !drm_device_matches_tag(devices[i], default_tag)) {
+
+            found = true;
+            break;
+         }
+      }
+   } else {
+      for (i = 0; i < num_devices; i++) {
+         if (devices[i]->available_nodes & 1 << DRM_NODE_RENDER &&
+             drm_device_matches_tag(devices[i], prime)) {
+
+            found = true;
+            break;
+         }
+      }

I feel like it would be helpful to have a warning here if the device was not found. This could avoid some confusion when people inevitably typo their prime setting.

Original code does not have such a message, so let's add it as follow-up ?

Fine by me.

Cheers,
Nicolai

Emil
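As a small illustration of the snprintf suggestion made in this thread, here is a hedged sketch that formats the same "pci-..." tag into a caller-supplied buffer and reports truncation. The function name build_pci_id_path_tag and its error convention are assumptions made for this example; the format string is the one from the quoted patch, and this is not the code that was eventually committed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "pci-<domain>_<bus>_<dev>_<func>" into buf.
 * Returns 0 on success and -1 if the tag did not fit (or on an encoding
 * error), instead of writing past the end of the buffer. */
static int build_pci_id_path_tag(char *buf, size_t len,
                                 unsigned domain, unsigned bus,
                                 unsigned dev, unsigned func)
{
   int n = snprintf(buf, len, "pci-%04x_%02x_%02x_%1u",
                    domain, bus, dev, func);

   if (n < 0 || (size_t)n >= len)
      return -1;
   return 0;
}
```

With a 17-byte buffer, as in the patch, a domain that needs more than four hex digits would overflow with sprintf; here it is simply reported as a failure.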
[Mesa-dev] [PATCH] egl/surfaceless: Fix segfault in eglSwapBuffers
Since commit 63c5d5c6c46c8472ee7a8241a0f80f13d79cb8cd, the surfaceless platform has allowed creation of pbuffer surfaces. But the vtable entry for eglSwapBuffers has remained NULL.

Discovered by running a little pbuffer test.

Cc: Gurchetan Singh
---
 src/egl/drivers/dri2/platform_surfaceless.c | 12
 1 file changed, 12 insertions(+)

diff --git a/src/egl/drivers/dri2/platform_surfaceless.c b/src/egl/drivers/dri2/platform_surfaceless.c
index fcf7d69..a55c5f1 100644
--- a/src/egl/drivers/dri2/platform_surfaceless.c
+++ b/src/egl/drivers/dri2/platform_surfaceless.c
@@ -178,6 +178,17 @@ dri2_surfaceless_create_pbuffer_surface(_EGLDriver *drv, _EGLDisplay *disp,
 }

 static EGLBoolean
+surfaceless_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf)
+{
+   assert(!surf || surf->Type == EGL_PBUFFER_BIT);
+
+   /* From the EGL 1.5 spec:
+    *    If surface is a [...] pbuffer surface, eglSwapBuffers has no effect.
+    */
+   return EGL_TRUE;
+}
+
+static EGLBoolean
 surfaceless_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay *dpy)
 {
    struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
@@ -223,6 +234,7 @@ static struct dri2_egl_display_vtbl dri2_surfaceless_display_vtbl = {
    .destroy_surface = surfaceless_destroy_surface,
    .create_image = dri2_create_image_khr,
    .swap_interval = dri2_fallback_swap_interval,
+   .swap_buffers = surfaceless_swap_buffers,
    .swap_buffers_with_damage = dri2_fallback_swap_buffers_with_damage,
    .swap_buffers_region = dri2_fallback_swap_buffers_region,
    .post_sub_buffer = dri2_fallback_post_sub_buffer,
-- 
2.10.0
[Mesa-dev] [PATCH] svga: minor code improvements in svga_validate_pipe_sampler_view()
Use the 'texture' local var in more places.
Rename 'pformat' to 'viewFormat'.
---
 src/gallium/drivers/svga/svga_state_sampler.c | 16
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_state_sampler.c b/src/gallium/drivers/svga/svga_state_sampler.c
index 53bb80f..445afcc 100644
--- a/src/gallium/drivers/svga/svga_state_sampler.c
+++ b/src/gallium/drivers/svga/svga_state_sampler.c
@@ -135,21 +135,21 @@ svga_validate_pipe_sampler_view(struct svga_context *svga,
       SVGA3dSurfaceFormat format;
       SVGA3dResourceType resourceDim;
       SVGA3dShaderResourceViewDesc viewDesc;
-      enum pipe_format pformat = sv->base.format;
+      enum pipe_format viewFormat = sv->base.format;

       /* vgpu10 cannot create a BGRX view for a BGRA resource, so force it to
        * create a BGRA view (and vice versa).
        */
-      if (pformat == PIPE_FORMAT_B8G8R8X8_UNORM &&
-          sv->base.texture->format == PIPE_FORMAT_B8G8R8A8_UNORM) {
-         pformat = PIPE_FORMAT_B8G8R8A8_UNORM;
+      if (viewFormat == PIPE_FORMAT_B8G8R8X8_UNORM &&
+          texture->format == PIPE_FORMAT_B8G8R8A8_UNORM) {
+         viewFormat = PIPE_FORMAT_B8G8R8A8_UNORM;
       }
-      else if (pformat == PIPE_FORMAT_B8G8R8A8_UNORM &&
-          sv->base.texture->format == PIPE_FORMAT_B8G8R8X8_UNORM) {
-         pformat = PIPE_FORMAT_B8G8R8X8_UNORM;
+      else if (viewFormat == PIPE_FORMAT_B8G8R8A8_UNORM &&
+          texture->format == PIPE_FORMAT_B8G8R8X8_UNORM) {
+         viewFormat = PIPE_FORMAT_B8G8R8X8_UNORM;
       }

-      format = svga_translate_format(ss, pformat,
+      format = svga_translate_format(ss, viewFormat,
                                      PIPE_BIND_SAMPLER_VIEW);
       assert(format != SVGA3D_FORMAT_INVALID);

-- 
1.9.1
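The BGRX/BGRA rule that the comment in this hunk describes can be read as
a small pure function. The sketch below only illustrates that rule: enum
sketch_format and choose_view_format() are hypothetical and stand in for
enum pipe_format and the inline logic in svga_validate_pipe_sampler_view().

```c
#include <assert.h>

/* Hypothetical, reduced format enum; not the gallium pipe_format enum. */
enum sketch_format {
   SKETCH_FORMAT_BGRA8,
   SKETCH_FORMAT_BGRX8,
   SKETCH_FORMAT_RGBA8,
};

/* Pick the format to create the view with: if the view and the resource
 * differ only in BGRA vs. BGRX, use the resource's variant; otherwise
 * keep the requested view format unchanged.
 */
static enum sketch_format
choose_view_format(enum sketch_format view, enum sketch_format resource)
{
   if (view == SKETCH_FORMAT_BGRX8 && resource == SKETCH_FORMAT_BGRA8)
      return SKETCH_FORMAT_BGRA8;
   if (view == SKETCH_FORMAT_BGRA8 && resource == SKETCH_FORMAT_BGRX8)
      return SKETCH_FORMAT_BGRX8;
   return view;
}
```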
Re: [Mesa-dev] [PATCH 6/6] radeonsi: rename prefixes from radeon to si
On Tue, Oct 18, 2016 at 6:28 PM, Emil Velikov wrote:
> On 17 October 2016 at 14:44, Marek Olšák wrote:
>> From: Marek Olšák
>>
>> ---
>>  src/gallium/drivers/radeonsi/si_pipe.c            |   2 +-
>>  src/gallium/drivers/radeonsi/si_shader.c          |  96 ++---
>>  src/gallium/drivers/radeonsi/si_shader_internal.h |  70 +-
>>  .../drivers/radeonsi/si_shader_tgsi_setup.c       | 150 ++---
>>  4 files changed, 159 insertions(+), 159 deletions(-)
>>
> From build POV everything is perfect thanks Marek ! For those
> Reviewed-by: Emil Velikov

Thanks.

>
> Humble suggestion - set the following for friendlier patches ;-)
> $ git config --global diff.renames true

I forgot to remove one (almost empty) file, so a rename wasn't detected
properly. I do send all my Mesa patches with git send-email -M.

Marek
Re: [Mesa-dev] [PATCH 6/6] radeonsi: rename prefixes from radeon to si
On 17 October 2016 at 14:44, Marek Olšák wrote:
> From: Marek Olšák
>
> ---
>  src/gallium/drivers/radeonsi/si_pipe.c            |   2 +-
>  src/gallium/drivers/radeonsi/si_shader.c          |  96 ++---
>  src/gallium/drivers/radeonsi/si_shader_internal.h |  70 +-
>  .../drivers/radeonsi/si_shader_tgsi_setup.c       | 150 ++---
>  4 files changed, 159 insertions(+), 159 deletions(-)
>
From build POV everything is perfect thanks Marek ! For those
Reviewed-by: Emil Velikov

Humble suggestion - set the following for friendlier patches ;-)
$ git config --global diff.renames true

Emil
[Mesa-dev] [PATCH] radeonsi: eliminate trivial constant VS outputs
From: Marek Olšák

These constant value VS PARAM exports:
- 0,0,0,0
- 0,0,0,1
- 1,1,1,0
- 1,1,1,1
can be loaded into PS inputs using the DEFAULT_VAL field, and the VS exports
can be removed from the IR to save export & parameter memory.

After LLVM optimizations, analyze the IR to see which exports are equal to
the ones listed above (or undef) and remove them if they are.

Targeted use cases:
- All DX9 eON ports always clear 10 VS outputs to 0.0 even if most of them
  are unused by PS (such as Witcher 2 below).
- VS output arrays with unused elements that the GLSL compiler can't
  eliminate (such as Batman below).

The shader-db deltas are quite interesting:
(not from upstream si-report.py, it won't be upstreamed)

PERCENTAGE DELTAS        Shaders   PARAM exports (affected only)
batman_arkham_origins        589   -67.17 %
bioshock-infinite           1769    -0.47 %
dirt-showdown                548    -2.68 %
dota2                       1747    -3.36 %
f1-2015                      776    -4.94 %
left_4_dead_2               1762    -0.07 %
metro_2033_redux            2670    -0.43 %
portal                       474    -0.22 %
talos_principle              324    -3.63 %
warsow                       176    -2.20 %
witcher2                    1040   -73.78 %
All affected                 991   -65.37 %  ...  9681 ->  3353
Total                      26725   -10.82 %  ... 58490 -> 52162
---
 src/gallium/drivers/radeonsi/si_shader.c        | 154
 src/gallium/drivers/radeonsi/si_shader.h        |  11 ++
 src/gallium/drivers/radeonsi/si_state_shaders.c |  17 ++-
 3 files changed, 180 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index a361418..7fc1df4 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -6593,20 +6593,167 @@ static void si_init_shader_ctx(struct si_shader_context *ctx,
 	bld_base->op_actions[TGSI_OPCODE_EMIT].emit = si_llvm_emit_vertex;
 	bld_base->op_actions[TGSI_OPCODE_ENDPRIM].emit = si_llvm_emit_primitive;
 	bld_base->op_actions[TGSI_OPCODE_BARRIER].emit = si_llvm_emit_barrier;

 	bld_base->op_actions[TGSI_OPCODE_MAX].emit = build_tgsi_intrinsic_nomem;
 	bld_base->op_actions[TGSI_OPCODE_MAX].intr_name = "llvm.maxnum.f32";
 	bld_base->op_actions[TGSI_OPCODE_MIN].emit = build_tgsi_intrinsic_nomem;
 	bld_base->op_actions[TGSI_OPCODE_MIN].intr_name = "llvm.minnum.f32";
 }

+/* Return true if the PARAM export has been eliminated. */
+static bool si_eliminate_const_output(struct si_shader_context *ctx,
+				      LLVMValueRef inst, unsigned offset)
+{
+	struct si_shader *shader = ctx->shader;
+	unsigned num_outputs = shader->selector->info.num_outputs;
+	double v[4];
+	unsigned i, default_val; /* SPI_PS_INPUT_CNTL_i.DEFAULT_VAL */
+
+	for (i = 0; i < 4; i++) {
+		LLVMBool loses_info;
+		LLVMValueRef p = LLVMGetOperand(inst, 5 + i);
+		if (!LLVMIsConstant(p))
+			return false;
+
+		/* It's a constant expression. Undef outputs are eliminated too. */
+		if (LLVMIsUndef(p))
+			v[i] = 0;
+		else
+			v[i] = LLVMConstRealGetDouble(p, &loses_info);
+
+		if (v[i] != 0 && v[i] != 1)
+			return false;
+	}
+
+	/* Only certain combinations of 0 and 1 can be eliminated. */
+	if (v[0] == 0 && v[1] == 0 && v[2] == 0)
+		default_val = v[3] == 0 ? 0 : 1;
+	else if (v[0] == 1 && v[1] == 1 && v[2] == 1)
+		default_val = v[3] == 0 ? 2 : 3;
+	else
+		return false;
+
+	/* The PARAM export can be represented as DEFAULT_VAL. Kill it. */
+	LLVMInstructionEraseFromParent(inst);
+
+	/* Change OFFSET to DEFAULT_VAL. */
+	for (i = 0; i < num_outputs; i++) {
+		if (shader->info.vs_output_param_offset[i] == offset) {
+			shader->info.vs_output_param_offset[i] =
+				EXP_PARAM_DEFAULT_VAL_ + default_val;
+			break;
+		}
+	}
+	return true;
+}
+
+struct si_vs_exports {
+	unsigned num;
+	unsigned offset[SI_MAX_VS_OUTPUTS];
+	LLVMValueRef inst[SI_MAX_VS_OUTPUTS];
+};
+
+static void si_eliminate_const_vs_outputs(struct si_shader_context *ctx)
+{
+	struct si_shader *shader = ctx->shader;
+	struct tgsi_shader_info *info = &shader->selector->info;
+	LLVMBasicBlockRef bb;
+	struct si_vs_exports exports;
+	bool removed_any = false;
+
+	exports.num = 0;
+
+	if ((ctx->type == PIPE_SHADER_VERTEX &&
+	     (shader->key.vs.as_es || shader->key.vs.as_ls)) ||
+	    (ctx->type == PIPE_SHADER_TESS_EVAL && shader->key.tes.as_es))
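The classification step in si_eliminate_const_output() can be looked at
without any LLVM involved. The sketch below takes the four components as
plain doubles and returns the same 0..3 code; const_output_default_val()
is a hypothetical helper name, and the step that extracts the constants
from the export instruction is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>

/* Map a constant output vector to a DEFAULT_VAL code:
 *   0 -> (0,0,0,0), 1 -> (0,0,0,1), 2 -> (1,1,1,0), 3 -> (1,1,1,1)
 * Returns false if the vector is not one of those four, in which case
 * *default_val is left untouched.
 */
static bool
const_output_default_val(const double v[4], unsigned *default_val)
{
   /* Every component has to be exactly 0 or 1. */
   for (int i = 0; i < 4; i++) {
      if (v[i] != 0 && v[i] != 1)
         return false;
   }

   /* x, y and z must agree; w selects between the two variants. */
   if (v[0] == 0 && v[1] == 0 && v[2] == 0)
      *default_val = v[3] == 0 ? 0 : 1;
   else if (v[0] == 1 && v[1] == 1 && v[2] == 1)
      *default_val = v[3] == 0 ? 2 : 3;
   else
      return false;

   return true;
}
```

For example, (1,1,1,0) yields code 2, while (1,0,1,1) is rejected because
the first three components differ, so that export has to stay.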
Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code
On Tue, Oct 18, 2016 at 3:55 PM, Eero Tamminen wrote:
> Hi,
>
> On 18.10.2016 16:25, Jan Ziak wrote:
>>
>> On Tue, Oct 18, 2016 at 3:12 PM, Nicolai Hähnle
>> wrote:
>>>
>>> On 18.10.2016 15:07, Jan Ziak wrote:
>>>>
>>>> On Tue Oct 18 09:29:59 UTC 2016, Eero Tamminen wrote:
>>>>>
>>>>> On 18.10.2016 01:07, Jan Ziak wrote:
>>>>>>
>>>>>> - The total number of executed instructions goes down from 64.184 to
>>>>>> 63.797
>>>>>> giga-instructions when Mesa is compiled with "gcc -O0 ..."
>>>>>
>>>>>
>>>>> Please don't do performance related decisions based on data from
>>>>> compiling code with optimizations disabled. Use -O2 or -O3 (or even
>>>>> better, check both).
>>>>
>>>> Options -O2 and -O3 interfere with profiling tools. I will try using
>>>> -Og the next time.
>>>
>>>
>>> Just stop and use proper profiling tools like perf that can work with
>>> optimized tools.
>
>
> Valgrind/callgrind/cachegrind works also fine with optimized binaries.
>
> All profiling tools lie, at least a bit. It's better to know their strengths
> and weaknesses so that one knows which ones complement each other. Perf is
> e.g. good at finding hotspots, Valgrind (callgrind) is more reliable in
> telling how they get called.
>
> One may also need a GCC version from this decade. Really old GCC versions
> didn't include all debug info needed for debugging optimized binaries.

Regarding C++ templates, the compiler doesn't use them. If u_vector
(Dave Airlie?) provides the same functionality as your array, I suggest
we use u_vector instead. If you can't use u_vector, you should ask for
approval from GLSL compiler leads (e.g. Ian Romanick or Kenneth Graunke)
to use C++ templates.

I'll repeat some stuff about profiling here but also explain my perspective.

Never profile with -O0 or disabled function inlining. Mesa uses -g -O2
with --enable-debug, so that's what you should use too. Don't use any
other -O* variants.

The only profiling tools reporting correct results are perf and
sysprof (both use the same mechanism). If you don't enable dwarf in
perf (also sysprof can't use dwarf), you have to build Mesa with
-fno-omit-frame-pointer to see call trees. The only reason you would
want to enable dwarf-based call trees is when you want to see libc
calls. Otherwise, they won't be displayed or counted as part of call
trees. For Mesa developers who do profiling often,
-fno-omit-frame-pointer should be your default.

Callgrind counts calls (that one you can trust), but the reported time
is incorrect, because it uses its own virtual model of a CPU. Avoid it
if you want to measure time spent in functions.

Marek