Re: [Mesa-dev] [PATCH] GL_OES_texture_float and GL_OES_texture_half_float support
I'll post a corrected patch shortly, but first some answers to the questions:

> Have you found an application that wants these extensions? That might be useful to describe in the commit message.

There are occasional bits in Qt that use HALF_FLOAT_OES for GLES2 (though I stress it is not an often-hit code path). Many forms of GPU font rendering also require floating-point textures (I confess to implementing one in a public open source project). As for specific applications: I do not have a list.

> For which driver would that be useful?

The truth is that right now, for the current Mesa DRI drivers, I think none. The reason is that Mesa drivers -mostly- target either newer hardware or, if older, -desktop- hardware. The extensions were all specified separately to let an implementation expose more by requiring less. As for hardware out there in the wild, I -suspect- not anyway.

-Kevin
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Relax accumulator dependency scheduling on Gen < 6
On Tue, 2014-05-06 at 11:45 -0700, Matt Turner wrote:
> Nice work.
>
> On Tue, May 6, 2014 at 1:16 AM, Iago Toral Quiroga ito...@igalia.com wrote:
> > diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
> > index 6e74803..37d3eab 100644
> > --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> > @@ -676,6 +676,13 @@ backend_instruction::reads_accumulator_implicitly() const
> >  }
> >
> >  bool
> > +backend_instruction::writes_accumulator_implicitly(int gen) const
> > +{
> > +   return writes_accumulator ||
> > +          (gen < 6 && opcode >= BRW_OPCODE_ADD && opcode != BRW_OPCODE_NOP);
>
> Since our virtual instruction opcodes are > BRW_OPCODE_NOP, they'll also be classified as writing the accumulator, whereas before they weren't. I think the only ones (that are used on gen < 6) that generate hardware instructions that write the accumulator are
>
> FS_OPCODE_DDX
> FS_OPCODE_DDY
> FS_OPCODE_PIXEL_X
> FS_OPCODE_PIXEL_Y
> FS_OPCODE_CINTERP

After a quick look it looks like FS_OPCODE_CINTERP is implemented as a MOV operation according to code in brw_fs_generator.cpp:

   case FS_OPCODE_CINTERP:
      brw_MOV(p, dst, src[0]);
      break;

so I guess this one will not write the accumulator in any case. If I am missing something here, please let me know.

> FS_OPCODE_LINTERP
>
> If you update this function with these and it still passes piglit on gen < 6, then this patch is

I'll update the patch to include these (minus FS_OPCODE_CINTERP) and check the piglit test again.

Thanks for the review!
Iago

> Reviewed-by: Matt Turner matts...@gmail.com
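The classification discussed in this exchange can be sketched in plain C. This is only an illustration under stated assumptions: the opcode names and numeric values below are hypothetical stand-ins rather than the real i965 definitions, and the sketch assumes, as the review implies, that hardware opcodes sort at or below the NOP opcode while virtual opcodes sort above it. The list of virtual opcodes comes from the review, with FS_OPCODE_CINTERP left out because it is emitted as a MOV.

```c
#include <stdbool.h>

/* Hypothetical opcode numbering, for illustration only. */
enum opcode {
   OP_MOV = 1,          /* move: leaves the accumulator alone */
   OP_ADD = 64,         /* first arithmetic hardware opcode */
   OP_MUL = 65,
   OP_NOP = 126,        /* last hardware opcode */
   /* Virtual opcodes sort after OP_NOP. */
   OP_FS_DDX = 200,
   OP_FS_DDY,
   OP_FS_PIXEL_X,
   OP_FS_PIXEL_Y,
   OP_FS_CINTERP,
   OP_FS_LINTERP,
   OP_FS_OTHER
};

struct instruction {
   enum opcode opcode;
   bool writes_accumulator;   /* explicit accumulator write flag */
};

/* Returns true if the instruction updates the accumulator, either
 * explicitly or as a side effect on hardware older than gen 6.
 */
static bool
writes_accumulator_implicitly(const struct instruction *inst, int gen)
{
   if (inst->writes_accumulator)
      return true;
   if (gen >= 6)
      return false;

   /* Hardware arithmetic opcodes: ADD up to, but excluding, NOP. */
   if (inst->opcode >= OP_ADD && inst->opcode < OP_NOP)
      return true;

   /* Virtual opcodes that expand to accumulator-writing hardware
    * instructions. CINTERP is a plain MOV, so it is not listed.
    */
   switch (inst->opcode) {
   case OP_FS_DDX:
   case OP_FS_DDY:
   case OP_FS_PIXEL_X:
   case OP_FS_PIXEL_Y:
   case OP_FS_LINTERP:
      return true;
   default:
      return false;
   }
}
```

With this shape the scheduler only needs a full dependency on the previous accumulator write for instructions where the predicate is true, instead of treating every instruction as a barrier.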
Re: [Mesa-dev] [PATCH] gallium: remove enum numbers from shader cap queries
On 05.05.2014 22:39, Brian Paul wrote:
> On 05/03/2014 07:43 AM, Michel Dänzer wrote:
> > On 03.05.2014 22:29, Brian Paul wrote:
> > > The enum numbers were just cruft.
> >
> > I disagree. Nothing's changed about the reason I added them in the first place: When a driver is queried for a cap it doesn't know about, it prints an error message containing only the numeric value of the cap. These explicit numbers make it easy to find out which cap the driver is complaining about.
>
> Hi Michel,
>
> In the past when someone added a new enum and softpipe, llvmpipe or svga complained at runtime about an unhandled enum, it's been pretty easy to spot the new one and fix it.
>
> Actually, what you have in the radeon/si drivers is better: switch statements w/out default cases. So the compiler will warn about the missing enum case by name (not number). That's a more effective way of catching unhandled enums earlier.
>
> I should change softpipe, llvmpipe and svga to do the same. How does that sound? Are there other drivers you're concerned about?

Not really, sounds good to me then.

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
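The "switch without a default case" approach mentioned in this thread can be sketched as follows. The enum, its members, and the returned limits are made up for the example and are not Gallium's real shader caps. The sketch assumes a compiler with a -Wswitch style warning (GCC and Clang enable it with -Wall), which names any enumerator the switch does not handle.

```c
#include <stdio.h>

/* Hypothetical caps, not the real Gallium enum. */
enum shader_cap {
   CAP_MAX_INSTRUCTIONS,
   CAP_MAX_TEMPS,
   CAP_MAX_INPUTS
};

static int
get_shader_cap(enum shader_cap cap)
{
   /* No default case on purpose: if a new enumerator is added to
    * enum shader_cap and not handled here, the compiler warns about
    * it by name at build time.
    */
   switch (cap) {
   case CAP_MAX_INSTRUCTIONS:
      return 4096;
   case CAP_MAX_TEMPS:
      return 256;
   case CAP_MAX_INPUTS:
      return 32;
   }

   /* Only reached for values outside the enum. */
   fprintf(stderr, "unknown shader cap %d\n", (int)cap);
   return 0;
}
```

The runtime message still prints only a number, but it becomes a last resort: a missing case is normally reported by the compiler with the enumerator's name before the driver ever runs.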
[Mesa-dev] [PATCH v2] i965: Relax accumulator dependency scheduling on Gen < 6
Many instructions implicitly update the accumulator on Gen < 6. The instruction scheduling code just calls add_barrier_deps() for each accumulator access on these platforms, but a large class of operations don't actually update the accumulator -- mostly move and logical instructions. Teaching the scheduling code about this would allow more flexibility to schedule instructions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77740
---
This version properly identifies virtual instructions that write to the accumulator on Gen < 6, as indicated by Matt. FS_OPCODE_CINTERP is excluded because it seems to be implemented as a MOV. Passes piglit tests on IronLake.

 .../drivers/dri/i965/brw_schedule_instructions.cpp | 84 +++---
 src/mesa/drivers/dri/i965/brw_shader.cpp | 10 +++
 src/mesa/drivers/dri/i965/brw_shader.h | 1 +
 3 files changed, 36 insertions(+), 59 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 8cc6908..6f8f405 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -742,8 +742,6 @@ fs_instruction_scheduler::is_compressed(fs_inst *inst)
 void
 fs_instruction_scheduler::calculate_deps()
 {
-   const bool gen6plus = v->brw->gen >= 6;
-
    /* Pre-register-allocation, this tracks the last write per VGRF (so
     * different reg_offsets within it can interfere when they shouldn't).
* After register allocation, reg_offsets are gone and we track individual @@ -803,7 +801,7 @@ fs_instruction_scheduler::calculate_deps() } else { add_dep(last_fixed_grf_write, n); } - } else if (inst-src[i].is_accumulator() gen6plus) { + } else if (inst-src[i].is_accumulator()) { add_dep(last_accumulator_write, n); } else if (inst-src[i].file != BAD_FILE inst-src[i].file != IMM @@ -828,11 +826,7 @@ fs_instruction_scheduler::calculate_deps() } if (inst-reads_accumulator_implicitly()) { - if (gen6plus) { -add_dep(last_accumulator_write, n); - } else { -add_barrier_deps(n); - } + add_dep(last_accumulator_write, n); } /* write-after-write deps. */ @@ -867,7 +861,7 @@ fs_instruction_scheduler::calculate_deps() } else { last_fixed_grf_write = n; } - } else if (inst-dst.is_accumulator() gen6plus) { + } else if (inst-dst.is_accumulator()) { add_dep(last_accumulator_write, n); last_accumulator_write = n; } else if (inst-dst.file != BAD_FILE @@ -887,13 +881,10 @@ fs_instruction_scheduler::calculate_deps() last_conditional_mod[inst-flag_subreg] = n; } - if (inst-writes_accumulator) { - if (gen6plus) { -add_dep(last_accumulator_write, n); -last_accumulator_write = n; - } else { -add_barrier_deps(n); - } + if (inst-writes_accumulator_implicitly(v-brw-gen) + !inst-dst.is_accumulator()) { + add_dep(last_accumulator_write, n); + last_accumulator_write = n; } } @@ -933,7 +924,7 @@ fs_instruction_scheduler::calculate_deps() } else { add_dep(n, last_fixed_grf_write); } - } else if (inst-src[i].is_accumulator() gen6plus) { + } else if (inst-src[i].is_accumulator()) { add_dep(n, last_accumulator_write); } else if (inst-src[i].file != BAD_FILE inst-src[i].file != IMM @@ -958,11 +949,7 @@ fs_instruction_scheduler::calculate_deps() } if (inst-reads_accumulator_implicitly()) { - if (gen6plus) { -add_dep(n, last_accumulator_write); - } else { -add_barrier_deps(n); - } + add_dep(n, last_accumulator_write); } /* Update the things this instruction wrote, so earlier reads @@ -996,7 +983,7 @@ 
fs_instruction_scheduler::calculate_deps() } else { last_fixed_grf_write = n; } - } else if (inst-dst.is_accumulator() gen6plus) { + } else if (inst-dst.is_accumulator()) { last_accumulator_write = n; } else if (inst-dst.file != BAD_FILE !inst-dst.is_null()) { @@ -1013,12 +1000,8 @@ fs_instruction_scheduler::calculate_deps() last_conditional_mod[inst-flag_subreg] = n; } - if (inst-writes_accumulator) { - if (gen6plus) { -last_accumulator_write = n; - } else { -add_barrier_deps(n); - } + if (inst-writes_accumulator_implicitly(v-brw-gen)) { + last_accumulator_write = n; } } } @@ -1026,8 +1009,6 @@ fs_instruction_scheduler::calculate_deps() void vec4_instruction_scheduler::calculate_deps() { - const
Re: [Mesa-dev] [PATCH 0/2] Varying packing support for arrays_of_arrays
As I'm sending this extension bit by bit, I thought I'd better list the outstanding items. They are listed in the order I'm planning to implement them.

- update backends (gallium/i965) to support in/out arrays of arrays
- write some piglit tests to double check that the geometry shader front end is correct. I haven't fully checked this yet.
- implement support for uniforms and write/update execution piglit tests for this
- add support for partial marking of arrays of arrays in ir_set_program_inouts.cpp

I also thought I'd point out that, in the test I linked to, the arrays of arrays tests for any type of vec4 currently fail, since a vec4 doesn't require packing and the backend support is not yet implemented.

On Tue, 2014-05-06 at 13:50 +1000, Timothy Arceri wrote:
> Patch 1 is untested as the i965 backend supports packing but it should be easy to review.
>
> Patch 2 has been tested with an updated version [1] of the simple packing piglit test.
>
> [1] http://lists.freedesktop.org/archives/piglit/2014-May/010639.html
Re: [Mesa-dev] [PATCH 2/2] glsl: support packing of arrays of arrays
Just to make reviewing a quick task, I thought I'd point out why updating this check is all that is needed to support packed varyings: lower_arraylike() and lower_rvalue() already call each other recursively, in inner-array to outer-array order, so everything just works. I also removed the assert for geometry shaders, as it is already checked in lower_rvalue().

On Tue, 2014-05-06 at 13:50 +1000, Timothy Arceri wrote:
> Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
> ---
>  src/glsl/lower_packed_varyings.cpp | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/src/glsl/lower_packed_varyings.cpp b/src/glsl/lower_packed_varyings.cpp
> index e865474..dd2e22e 100644
> --- a/src/glsl/lower_packed_varyings.cpp
> +++ b/src/glsl/lower_packed_varyings.cpp
> @@ -591,12 +591,9 @@ lower_packed_varyings_visitor::needs_lowering(ir_variable *var)
>        return false;
>
>     const glsl_type *type = var->type;
> -   if (this->gs_input_vertices != 0) {
> -      assert(type->is_array());
> -      type = type->element_type();
> -   }
> -   if (type->is_array())
> +   while (type->is_array()) {
>        type = type->fields.array;
> +   }
>     if (type->vector_elements == 4)
>        return false;
>     return true;
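The effect of replacing the single "if" with a "while" can be shown with a small stand-alone sketch. The struct below is a hypothetical, heavily simplified stand-in for glsl_type (only an element pointer and a vector width), and the function leaves out the earlier checks that the real needs_lowering() performs before it looks at the type.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a GLSL type: either an array of some
 * element type, or a vector with 1..4 components.
 */
struct type {
   const struct type *array_element;   /* NULL unless this is an array */
   unsigned vector_elements;           /* meaningful for non-arrays */
};

static bool
is_array(const struct type *t)
{
   return t->array_element != NULL;
}

/* A varying needs packing unless its innermost element is a full vec4. */
static bool
needs_lowering(const struct type *type)
{
   /* Strip every array level, not just the outermost one, so that
    * arrays of arrays are judged by their innermost element type.
    */
   while (is_array(type))
      type = type->array_element;

   return type->vector_elements != 4;
}
```

With a single "if", a vec2[3][2] would stop at the inner array type; the loop reaches the vec2 and correctly reports that packing is needed, while a vec4[2][2] is still left alone.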
[Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and GL_OES_texture_half_float.
Add support for GLES2 extensions for floating point and half floating point textures (GL_OES_texture_float, GL_OES_texture_half_float, GL_OES_texture_float_linear and GL_OES_texture_half_float_linear). --- src/mesa/main/extensions.c | 12 +- src/mesa/main/glformats.c | 25 src/mesa/main/pack.c | 17 + src/mesa/main/teximage.c | 59 ++ 4 files changed, 112 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index c2ff7e3..e39f65e 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -301,7 +301,17 @@ static const struct extension extension_table[] = { { GL_OES_texture_mirrored_repeat, o(dummy_true), ES1, 2005 }, { GL_OES_texture_npot, o(ARB_texture_non_power_of_two), ES1 | ES2, 2005 }, { GL_OES_vertex_array_object, o(dummy_true), ES1 | ES2, 2010 }, - + /* +* TODO: +* - rather than have an all or nothing approach for floating point textures, +*allow for driver to specify what parts of floating point texture functionality +*is supported: float/half-float and filtering for each. 
+*/ + { GL_OES_texture_float, o(ARB_texture_float), ES2,2005 }, + { GL_OES_texture_half_float, o(ARB_texture_float), ES2,2005 }, + { GL_OES_texture_float_linear,o(ARB_texture_float), ES2,2005 }, + { GL_OES_texture_half_float_linear, o(ARB_texture_float), ES2,2005 }, + /* KHR extensions */ { GL_KHR_debug, o(dummy_true), GL, 2012 }, diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c index 9bb341c..093fd59 100644 --- a/src/mesa/main/glformats.c +++ b/src/mesa/main/glformats.c @@ -93,6 +93,7 @@ _mesa_sizeof_type(GLenum type) case GL_DOUBLE: return sizeof(GLdouble); case GL_HALF_FLOAT_ARB: + case GL_HALF_FLOAT_OES: return sizeof(GLhalfARB); case GL_FIXED: return sizeof(GLfixed); @@ -125,6 +126,7 @@ _mesa_sizeof_packed_type(GLenum type) case GL_INT: return sizeof(GLint); case GL_HALF_FLOAT_ARB: + case GL_HALF_FLOAT_OES: return sizeof(GLhalfARB); case GL_FLOAT: return sizeof(GLfloat); @@ -243,6 +245,7 @@ _mesa_bytes_per_pixel(GLenum format, GLenum type) case GL_FLOAT: return comps * sizeof(GLfloat); case GL_HALF_FLOAT_ARB: + case GL_HALF_FLOAT_OES: return comps * sizeof(GLhalfARB); case GL_UNSIGNED_BYTE_3_3_2: case GL_UNSIGNED_BYTE_2_3_3_REV: @@ -1365,6 +1368,11 @@ _mesa_error_check_format_and_type(const struct gl_context *ctx, case GL_FLOAT: case GL_HALF_FLOAT: return GL_NO_ERROR; +case GL_HALF_FLOAT_OES: + return (format == GL_LUMINANCE || + format == GL_LUMINANCE_ALPHA || + format == GL_ALPHA) + ? GL_NO_ERROR: GL_INVALID_ENUM; default: return GL_INVALID_ENUM; } @@ -1401,6 +1409,9 @@ _mesa_error_check_format_and_type(const struct gl_context *ctx, case GL_UNSIGNED_SHORT_5_6_5_REV: case GL_HALF_FLOAT: return GL_NO_ERROR; +case GL_HALF_FLOAT_OES: + return (format == GL_RGB) + ? 
GL_NO_ERROR: GL_INVALID_ENUM; case GL_UNSIGNED_INT_2_10_10_10_REV: /* OK by GL_EXT_texture_type_2_10_10_10_REV */ return (ctx-API == API_OPENGLES2) @@ -1454,6 +1465,9 @@ _mesa_error_check_format_and_type(const struct gl_context *ctx, case GL_UNSIGNED_INT_2_10_10_10_REV: case GL_HALF_FLOAT: return GL_NO_ERROR; +case GL_HALF_FLOAT_OES: + return (format == GL_RGBA) + ? GL_NO_ERROR: GL_INVALID_ENUM; default: return GL_INVALID_ENUM; } @@ -1676,6 +1690,17 @@ GLenum _mesa_es3_error_check_format_and_type(GLenum format, GLenum type, GLenum internalFormat) { + /* special case checking for support the GLES2 extension +* GL_OES_texture_float and GL_OES_texture_half_float +*/ + if(format == internalFormat + (type == GL_HALF_FLOAT_OES || type == GL_FLOAT) + (format == GL_RGBA || format == GL_RGB || + format == GL_LUMINANCE || format == GL_ALPHA || + format == GL_LUMINANCE_ALPHA) ) { + return GL_NO_ERROR; + } + switch (format) { case GL_RGBA: switch (type) { diff --git a/src/mesa/main/pack.c
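The special case this patch adds at the top of _mesa_es3_error_check_format_and_type() can be restated as a small predicate. This is a sketch for illustration, not the Mesa function: the helper name is made up, the enum values are copied from the GL headers so that the block is self-contained, and a false result here only means "not covered by the special case" (the real function would go on to its regular format/type tables).

```c
#include <stdbool.h>

/* Values as defined in the OpenGL / OpenGL ES headers. */
#define GL_FLOAT            0x1406
#define GL_HALF_FLOAT       0x140B
#define GL_ALPHA            0x1906
#define GL_RGB              0x1907
#define GL_RGBA             0x1908
#define GL_LUMINANCE        0x1909
#define GL_LUMINANCE_ALPHA  0x190A
#define GL_HALF_FLOAT_OES   0x8D61

/* Hypothetical helper: true when a (format, type, internalFormat)
 * triple is accepted through the GL_OES_texture_float /
 * GL_OES_texture_half_float special case, i.e. an unsized internal
 * format equal to the format, with a float or OES half-float type.
 */
static bool
matches_oes_float_special_case(unsigned format, unsigned type,
                               unsigned internal_format)
{
   if (format != internal_format)
      return false;

   if (type != GL_FLOAT && type != GL_HALF_FLOAT_OES)
      return false;

   switch (format) {
   case GL_RGBA:
   case GL_RGB:
   case GL_LUMINANCE:
   case GL_ALPHA:
   case GL_LUMINANCE_ALPHA:
      return true;
   default:
      return false;
   }
}
```

Note that GL_HALF_FLOAT_OES and GL_HALF_FLOAT are different enum values, which is why the patch has to list the OES token explicitly wherever the desktop token was already handled.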
Re: [Mesa-dev] [PATCH 0/2] i965: Simulate MAD opcode with gen<6
On Tue, May 6, 2014 at 7:32 PM, Eric Anholt e...@anholt.net wrote:
> Juha-Pekka Heikkila juhapekka.heikk...@gmail.com writes:
>
> > These patches allow MAD opcode to be used with pre gen6 hardware. Instead of failing on emitting MAD there will be emitted MUL and ADD to simulate MAD.
> >
> > I tried this with piglit on ILK (gen5) and did not see regression.
>
> This hides the MUL and ADD from instruction scheduling, which I expect to make performance worse. What was the motivation for this?

When we were talking about LRP with MAC, there was a brief comment about doing something similar for MAD. Now that you mention it, I realize this first try was not so useful. At first I tried using the accumulator, in an attempt to save one register here, but that turned out to be crash'n'burn, literally. I'll look at Matt's version with LINE next for this.

/Juha-Pekka
Re: [Mesa-dev] [PATCH 1/2] i965/fs: Simulate MAD opcode with gen<6
On Tue, May 6, 2014 at 7:37 PM, Matt Turner matts...@gmail.com wrote:
> On Tue, May 6, 2014 at 3:53 AM, Juha-Pekka Heikkila juhapekka.heikk...@gmail.com wrote:
> > Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 15 ++-
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > index d2dc5fa..22ca528 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > @@ -293,10 +293,6 @@ fs_visitor::try_emit_saturate(ir_expression *ir)
> >  bool
> >  fs_visitor::try_emit_mad(ir_expression *ir)
> >  {
> > -   /* 3-src instructions were introduced in gen6. */
> > -   if (brw->gen < 6)
> > -      return false;
> > -
> >     /* MAD can only handle floating-point data. */
> >     if (ir->type != glsl_type::float_type)
> >        return false;
> > @@ -327,7 +323,16 @@ fs_visitor::try_emit_mad(ir_expression *ir)
> >     fs_reg src2 = this->result;
> >
> >     this->result = fs_reg(this, ir->type);
> > -   emit(BRW_OPCODE_MAD, this->result, src0, src1, src2);
> > +
> > +   /* 3-src instructions were introduced in gen6. */
> > +   if (brw->gen < 6) {
> > +      fs_reg temp = fs_reg(this, glsl_type::float_type);
> > +
> > +      emit(MUL(temp, src1, src2));
> > +      emit(ADD(this->result, src0, temp));
> > +   } else {
> > +      emit(BRW_OPCODE_MAD, this->result, src0, src1, src2);
> > +   }
> >
> >     return true;
> >  }
> > --
> > 1.8.1.2
>
> try_emit_mad is called every time we visit an add-expression, and on platforms that don't have MAD it fails and the compiler generates standard code for the expression tree. So, if your expression tree was a multiply-add the compiler will generate a multiply and an add instruction. Adding code to make try_emit_mad do that doesn't actually change anything.
>
> I've made a branch that uses the LINE instruction to perform multiply-adds when the arguments are immediates. Minus the shader size explosion in unigine tropics, it seems to be a pretty nice improvement. But the problem with unigine will have to be sorted out before it can be committed.
>
> Maybe you'd be interested in taking a look at that? See
>
> https://bugs.freedesktop.org/show_bug.cgi?id=77544

I will take a look at your branch, thanks Matt.

/Juha-Pekka
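Matt's point, that a failing try_emit_mad() already leads to a MUL followed by an ADD, can be illustrated with a toy emitter. Everything here is hypothetical scaffolding (the struct, the opcode enum, the visit function); it only mirrors the control flow described in the review, not the real fs_visitor.

```c
#include <stdbool.h>

enum op { OP_MUL, OP_ADD, OP_MAD };

/* Toy instruction stream for one shader. */
struct program {
   int gen;          /* hardware generation being targeted */
   enum op ops[8];   /* emitted opcodes, in order */
   int count;
};

static void
emit(struct program *p, enum op op)
{
   p->ops[p->count++] = op;
}

/* Mirrors the original behaviour: refuse on hardware without
 * three-source instructions, otherwise emit a single MAD.
 */
static bool
try_emit_mad(struct program *p)
{
   if (p->gen < 6)
      return false;

   emit(p, OP_MAD);
   return true;
}

/* Visiting "a * b + c": when the MAD attempt fails, the generic path
 * emits the multiply and the add as two separate instructions.
 */
static void
visit_multiply_add(struct program *p)
{
   if (try_emit_mad(p))
      return;

   emit(p, OP_MUL);
   emit(p, OP_ADD);
}
```

On a gen 5 target the fallback path already yields MUL then ADD, so moving the same two emits inside try_emit_mad() produces the same instructions.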
[Mesa-dev] [PATCH 1/2] st/wgl: Honour request of 3.1 contexts through core profile where available.
From: José Fonseca jfons...@vmware.com

Port 5f493eed69f6fb11239c04119d602f1c23a68cbd from GLX.
---
 src/gallium/state_trackers/wgl/stw_context.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/wgl/stw_context.c b/src/gallium/state_trackers/wgl/stw_context.c
index 3a93091..43186fa 100644
--- a/src/gallium/state_trackers/wgl/stw_context.c
+++ b/src/gallium/state_trackers/wgl/stw_context.c
@@ -205,10 +205,23 @@ stw_create_context_attribs(
     *
     *     The default value for WGL_CONTEXT_PROFILE_MASK_ARB is
     *     WGL_CONTEXT_CORE_PROFILE_BIT_ARB.
+    *
+    * The spec also says:
+    *
+    *     If version 3.1 is requested, the context returned may implement
+    *     any of the following versions:
+    *
+    *       * Version 3.1. The GL_ARB_compatibility extension may or may not
+    *         be implemented, as determined by the implementation.
+    *       * The core profile of version 3.2 or greater.
+    *
+    * and because Mesa doesn't support GL_ARB_compatibility, the only chance to
+    * honour a 3.1 context is through core profile.
     */
    attribs.profile = ST_PROFILE_DEFAULT;
-   if ((majorVersion > 3 || (majorVersion == 3 && minorVersion >= 2))
-       && ((profileMask & WGL_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB) == 0))
+   if (((majorVersion > 3 || (majorVersion == 3 && minorVersion >= 2))
+        && ((profileMask & WGL_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB) == 0)) ||
+       (majorVersion == 3 && minorVersion == 1))
       attribs.profile = ST_PROFILE_OPENGL_CORE;

    ctx->st = stw_dev->stapi->create_context(stw_dev->stapi,
-- 
1.9.1
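The profile selection rule in this patch can be written as a small pure function, which makes the 3.1 special case easier to see. The function and enum names are invented for this sketch; the two bit values are the ones defined by WGL_ARB_create_context_profile.

```c
/* Bit values from the WGL_ARB_create_context_profile extension. */
#define WGL_CONTEXT_CORE_PROFILE_BIT_ARB           0x00000001
#define WGL_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB  0x00000002

/* Hypothetical stand-ins for ST_PROFILE_DEFAULT / ST_PROFILE_OPENGL_CORE. */
enum profile {
   PROFILE_DEFAULT,
   PROFILE_OPENGL_CORE
};

static enum profile
choose_profile(int major, int minor, unsigned profile_mask)
{
   /* 3.2 and later: core unless compatibility was explicitly requested. */
   if ((major > 3 || (major == 3 && minor >= 2)) &&
       (profile_mask & WGL_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB) == 0)
      return PROFILE_OPENGL_CORE;

   /* 3.1: without GL_ARB_compatibility, the request can only be
    * honoured through the core profile, whatever the mask says.
    */
   if (major == 3 && minor == 1)
      return PROFILE_OPENGL_CORE;

   return PROFILE_DEFAULT;
}
```

For example, a 3.1 request maps to the core profile even when the compatibility bit is set, while a 3.0 request keeps the default profile.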
[Mesa-dev] [PATCH 2/2] st/wgl: Advertise WGL_ARB_create_context(_profile).
From: José Fonseca jfons...@vmware.com

We added wglCreateContextAttribsARB but not the extension strings. This allows creation of GL 3.x contexts.
---
 src/gallium/state_trackers/wgl/stw_ext_extensionsstring.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/state_trackers/wgl/stw_ext_extensionsstring.c b/src/gallium/state_trackers/wgl/stw_ext_extensionsstring.c
index 566f78c..06a152b 100644
--- a/src/gallium/state_trackers/wgl/stw_ext_extensionsstring.c
+++ b/src/gallium/state_trackers/wgl/stw_ext_extensionsstring.c
@@ -35,6 +35,8 @@
 static const char *stw_extension_string =
+   "WGL_ARB_create_context "
+   "WGL_ARB_create_context_profile "
    "WGL_ARB_extensions_string "
    "WGL_ARB_multisample "
    "WGL_ARB_pbuffer "
-- 
1.9.1
[Mesa-dev] [PATCH 0/8] Radeon various patches
This patch set mostly contains cosmetic changes that I made while adding support for sample shading.

Marek Olšák (8):
  r600g: simplify framebuffer state size computation
  radeonsi: use DRAW_PREAMBLE on CIK
  radeonsi: remove unused variable exports_ps in si_pipe_shader_ps
  radeonsi: only count CS space for state atoms if we're going to draw
  radeonsi: add and use a helper function for loading constants
  radeon/llvm: add support for non-scalar system values
  radeonsi: simplify depth/stencil export code
  radeonsi: prepare depth export registers at compile time

Please review.

Marek
[Mesa-dev] [PATCH 1/8] r600g: simplify framebuffer state size computation
From: Marek Olšák marek.ol...@amd.com

Take the upper bound. The number doesn't have to be absolutely correct, only safe.
---
 src/gallium/drivers/r600/evergreen_state.c | 30 --
 1 file changed, 4 insertions(+), 26 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index f7a63a8..7b1a44b 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -1394,32 +1394,10 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
 	rctx->framebuffer.atom.num_dw = 4; /* SCISSOR */

 	/* MSAA. */
-	if (rctx->b.chip_class == EVERGREEN) {
-		switch (rctx->framebuffer.nr_samples) {
-		case 2:
-		case 4:
-			rctx->framebuffer.atom.num_dw += 6;
-			break;
-		case 8:
-			rctx->framebuffer.atom.num_dw += 10;
-			break;
-		}
-		rctx->framebuffer.atom.num_dw += 4;
-	} else {
-		switch (rctx->framebuffer.nr_samples) {
-		case 2:
-		case 4:
-			rctx->framebuffer.atom.num_dw += 12;
-			break;
-		case 8:
-			rctx->framebuffer.atom.num_dw += 16;
-			break;
-		case 16:
-			rctx->framebuffer.atom.num_dw += 18;
-			break;
-		}
-		rctx->framebuffer.atom.num_dw += 7;
-	}
+	if (rctx->b.chip_class == EVERGREEN)
+		rctx->framebuffer.atom.num_dw += 14; /* Evergreen */
+	else
+		rctx->framebuffer.atom.num_dw += 25; /* Cayman */

 	/* Colorbuffers. */
 	rctx->framebuffer.atom.num_dw += state->nr_cbufs * 23;
-- 
1.9.1
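The constants 14 and 25 in this patch are the worst cases of the code they replace: 10 + 4 for Evergreen at 8 samples and 18 + 7 for Cayman at 16 samples. The sketch below restates both versions side by side so the bound can be checked; the function names are made up, and a plain bool stands in for the chip_class comparison.

```c
#include <stdbool.h>

/* Per-sample-count dword cost, as computed by the removed code. */
static unsigned
msaa_dw_exact(bool is_evergreen, unsigned nr_samples)
{
   unsigned dw = 0;

   if (is_evergreen) {
      switch (nr_samples) {
      case 2:
      case 4:
         dw += 6;
         break;
      case 8:
         dw += 10;
         break;
      }
      dw += 4;
   } else {
      switch (nr_samples) {
      case 2:
      case 4:
         dw += 12;
         break;
      case 8:
         dw += 16;
         break;
      case 16:
         dw += 18;
         break;
      }
      dw += 7;
   }
   return dw;
}

/* Constant upper bound used by the patch. */
static unsigned
msaa_dw_upper_bound(bool is_evergreen)
{
   return is_evergreen ? 14 : 25;
}
```

Since num_dw is only used to reserve command stream space, over-reserving by a few dwords for lower sample counts is harmless, which is the sense in which the number only has to be safe.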
[Mesa-dev] [PATCH 6/8] radeon/llvm: add support for non-scalar system values
From: Marek Olšák marek.ol...@amd.com

The sample position is one of them.
---
 src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 60ade78..f8be0df 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -218,7 +218,13 @@ static LLVMValueRef fetch_system_value(
 	unsigned swizzle)
 {
 	struct radeon_llvm_context * ctx = radeon_llvm_context(bld_base);
+	struct gallivm_state *gallivm = bld_base->base.gallivm;
+
 	LLVMValueRef cval = ctx->system_values[reg->Register.Index];
+	if (LLVMGetTypeKind(LLVMTypeOf(cval)) == LLVMVectorTypeKind) {
+		cval = LLVMBuildExtractElement(gallivm->builder, cval,
+					       lp_build_const_int32(gallivm, swizzle), "");
+	}
 	return bitcast(bld_base, type, cval);
 }
-- 
1.9.1
[Mesa-dev] [PATCH 2/8] radeonsi: use DRAW_PREAMBLE on CIK
From: Marek Olšák marek.ol...@amd.com It's the same as setting the 3 regs separately, but shorter, and it also seems to be required on GFX7.2 and later. This doesn't fix Hawaii. --- src/gallium/drivers/radeonsi/si_state_draw.c | 13 - src/gallium/drivers/radeonsi/sid.h | 1 + 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index bc69c94..315998c 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -426,15 +426,18 @@ static bool si_update_draw_info_state(struct si_context *sctx, /* If the WD switch is false, the IA switch must be false too. */ bool ia_switch_on_eop = wd_switch_on_eop; - si_pm4_set_reg(pm4, R_028AA8_IA_MULTI_VGT_PARAM, + si_pm4_set_reg(pm4, R_028B74_VGT_DISPATCH_DRAW_INDEX, + ib-index_size == 4 ? 0xFC00 : 0xFC00); + + si_pm4_cmd_begin(pm4, PKT3_DRAW_PREAMBLE); + si_pm4_cmd_add(pm4, prim); /* VGT_PRIMITIVE_TYPE */ + si_pm4_cmd_add(pm4, /* IA_MULTI_VGT_PARAM */ S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) | S_028AA8_PARTIAL_VS_WAVE_ON(1) | S_028AA8_PRIMGROUP_SIZE(63) | S_028AA8_WD_SWITCH_ON_EOP(wd_switch_on_eop)); - si_pm4_set_reg(pm4, R_028B74_VGT_DISPATCH_DRAW_INDEX, - ib-index_size == 4 ? 
0xFC00 : 0xFC00); - - si_pm4_set_reg(pm4, R_030908_VGT_PRIMITIVE_TYPE, prim); + si_pm4_cmd_add(pm4, 0); /* VGT_LS_HS_CONFIG */ + si_pm4_cmd_end(pm4, false); } else { si_pm4_set_reg(pm4, R_008958_VGT_PRIMITIVE_TYPE, prim); } diff --git a/src/gallium/drivers/radeonsi/sid.h b/src/gallium/drivers/radeonsi/sid.h index 5d6da1f..e42d804 100644 --- a/src/gallium/drivers/radeonsi/sid.h +++ b/src/gallium/drivers/radeonsi/sid.h @@ -93,6 +93,7 @@ #define PKT3_INDIRECT_BUFFER 0x32 #define PKT3_STRMOUT_BUFFER_UPDATE 0x34 #define PKT3_DRAW_INDEX_OFFSET_2 0x35 +#define PKT3_DRAW_PREAMBLE 0x36 /* new on CIK, required on GFX7.2 and later */ #define PKT3_WRITE_DATA0x37 #define PKT3_WRITE_DATA_DST_SEL(x) ((x) 8) #define PKT3_WRITE_DATA_DST_SEL_REG0 -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/8] radeonsi: add and use a helper function for loading constants
From: Marek Olšák marek.ol...@amd.com --- src/gallium/drivers/radeonsi/si_shader.c | 38 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 6c0cba7..0ecd317 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -531,6 +531,15 @@ static void declare_input_fs( } } +static LLVMValueRef load_const(LLVMBuilderRef builder, LLVMValueRef resource, + LLVMValueRef offset, LLVMTypeRef return_type) +{ + LLVMValueRef args[2] = {resource, offset}; + + return build_intrinsic(builder, llvm.SI.load.const, return_type, args, 2, + LLVMReadNoneAttribute | LLVMNoUnwindAttribute); +} + static void declare_system_value( struct radeon_llvm_context * radeon_bld, unsigned index, @@ -570,7 +579,6 @@ static LLVMValueRef fetch_constant( const struct tgsi_ind_register *ireg = reg-Indirect; unsigned buf, idx; - LLVMValueRef args[2]; LLVMValueRef addr; LLVMValueRef result; @@ -589,15 +597,14 @@ static LLVMValueRef fetch_constant( if (!reg-Register.Indirect) return bitcast(bld_base, type, si_shader_ctx-constants[buf][idx]); - args[0] = si_shader_ctx-const_resource[buf]; - args[1] = lp_build_const_int32(base-gallivm, idx * 4); addr = si_shader_ctx-radeon_bld.soa.addr[ireg-Index][ireg-Swizzle]; addr = LLVMBuildLoad(base-gallivm-builder, addr, load addr reg); addr = lp_build_mul_imm(bld_base-uint_bld, addr, 16); - args[1] = lp_build_add(bld_base-uint_bld, addr, args[1]); + addr = lp_build_add(bld_base-uint_bld, addr, + lp_build_const_int32(base-gallivm, idx * 4)); - result = build_intrinsic(base-gallivm-builder, llvm.SI.load.const, base-elem_type, - args, 2, LLVMReadNoneAttribute | LLVMNoUnwindAttribute); + result = load_const(base-gallivm-builder, si_shader_ctx-const_resource[buf], + addr, base-elem_type); return bitcast(bld_base, type, result); } @@ -763,15 +770,11 @@ static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base, /* 
Compute dot products of position and user clip plane vectors */ for (chan = 0; chan TGSI_NUM_CHANNELS; chan++) { for (const_chan = 0; const_chan TGSI_NUM_CHANNELS; const_chan++) { - args[0] = const_resource; args[1] = lp_build_const_int32(base-gallivm, ((reg_index * 4 + chan) * 4 + const_chan) * 4); - base_elt = build_intrinsic(base-gallivm-builder, - llvm.SI.load.const, - base-elem_type, - args, 2, - LLVMReadNoneAttribute | LLVMNoUnwindAttribute); + base_elt = load_const(base-gallivm-builder, const_resource, + args[1], base-elem_type); args[5 + chan] = lp_build_add(base, args[5 + chan], lp_build_mul(base, base_elt, @@ -2252,14 +2255,11 @@ static void preload_constants(struct si_shader_context *si_shader_ctx) /* Load the constants, we rely on the code sinking to do the rest */ for (i = 0; i num_const * 4; ++i) { - LLVMValueRef args[2] = { - si_shader_ctx-const_resource[buf], - lp_build_const_int32(gallivm, i * 4) - }; si_shader_ctx-constants[buf][i] = - build_intrinsic(gallivm-builder, llvm.SI.load.const, - bld_base-base.elem_type, args, 2, - LLVMReadNoneAttribute | LLVMNoUnwindAttribute); + load_const(gallivm-builder, + si_shader_ctx-const_resource[buf], + lp_build_const_int32(gallivm, i * 4), + bld_base-base.elem_type); } } } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/8] radeonsi: remove unused variable exports_ps in si_pipe_shader_ps
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/drivers/radeonsi/si_state_draw.c | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 315998c..fce799c 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -231,7 +231,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
 {
 	struct si_context *sctx = (struct si_context *)ctx;
 	struct si_pm4_state *pm4;
-	unsigned i, exports_ps, spi_ps_in_control, db_shader_control;
+	unsigned i, spi_ps_in_control, db_shader_control;
 	unsigned num_sgprs, num_user_sgprs;
 	unsigned spi_baryc_cntl = 0, spi_ps_input_ena, spi_shader_z_format;
 	uint64_t va;
@@ -273,17 +273,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
 	if (shader->shader.uses_kill || shader->key.ps.alpha_func != PIPE_FUNC_ALWAYS)
 		db_shader_control |= S_02880C_KILL_ENABLE(1);

-	exports_ps = 0;
-	for (i = 0; i < shader->shader.noutput; i++) {
-		if (shader->shader.output[i].name == TGSI_SEMANTIC_POSITION ||
-		    shader->shader.output[i].name == TGSI_SEMANTIC_STENCIL)
-			exports_ps |= 1;
-	}
-	if (!exports_ps) {
-		/* always at least export 1 component per pixel */
-		exports_ps = 2;
-	}
-
 	spi_ps_in_control = S_0286D8_NUM_INTERP(shader->shader.nparam) |
 		S_0286D8_BC_OPTIMIZE_DISABLE(1);

-- 
1.9.1
[Mesa-dev] [PATCH 7/8] radeonsi: simplify depth/stencil export code
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/drivers/radeonsi/si_shader.c | 16 +---
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 0ecd317..363218f 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1406,30 +1406,24 @@ static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context * bld_base)
 		/* Specify the target we are exporting */
 		args[3] = lp_build_const_int32(base->gallivm, V_008DFC_SQ_EXP_MRTZ);

+		args[5] = base->zero; /* R, depth */
+		args[6] = base->zero; /* G, stencil test value[0:7], stencil op value[8:15] */
+		args[7] = base->zero; /* B, sample mask */
+		args[8] = base->zero; /* A, alpha to mask */
+
 		if (depth_index >= 0) {
 			out_ptr = si_shader_ctx->radeon_bld.soa.outputs[depth_index][2];
 			args[5] = LLVMBuildLoad(base->gallivm->builder, out_ptr, "");
 			mask |= 0x1;
-
-			if (stencil_index < 0) {
-				args[6] =
-				args[7] =
-				args[8] = args[5];
-			}
 		}

 		if (stencil_index >= 0) {
 			out_ptr = si_shader_ctx->radeon_bld.soa.outputs[stencil_index][1];
-			args[7] =
-			args[8] =
 			args[6] = LLVMBuildLoad(base->gallivm->builder, out_ptr, "");
 			/* Only setting the stencil component bit (0x2) here
 			 * breaks some stencil piglit tests
 			 */
 			mask |= 0x3;
-
-			if (depth_index < 0)
-				args[5] = args[6];
 		}

 		/* Specify which components to enable */
-- 
1.9.1
[Mesa-dev] [PATCH 4/8] radeonsi: only count CS space for state atoms if we're going to draw
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/drivers/radeonsi/si_hw_context.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c b/src/gallium/drivers/radeonsi/si_hw_context.c
index 383157b..aa854d6 100644
--- a/src/gallium/drivers/radeonsi/si_hw_context.c
+++ b/src/gallium/drivers/radeonsi/si_hw_context.c
@@ -35,13 +35,13 @@ void si_need_cs_space(struct si_context *ctx, unsigned num_dw,
 	/* The number of dwords we already used in the CS so far. */
 	num_dw += ctx->b.rings.gfx.cs->cdw;
 
-	for (i = 0; i < SI_NUM_ATOMS(ctx); i++) {
-		if (ctx->atoms.array[i]->dirty) {
-			num_dw += ctx->atoms.array[i]->num_dw;
+	if (count_draw_in) {
+		for (i = 0; i < SI_NUM_ATOMS(ctx); i++) {
+			if (ctx->atoms.array[i]->dirty) {
+				num_dw += ctx->atoms.array[i]->num_dw;
+			}
 		}
-	}
 
-	if (count_draw_in) {
 		/* The number of dwords all the dirty states would take. */
 		num_dw += ctx->pm4_dirty_cdwords;
 
-- 
1.9.1
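The accounting change is easier to see in isolation. The sketch below is an illustration only, covering just the part of si_need_cs_space visible in the hunk; the struct atom type and the estimate_dwords function are hypothetical names, not Mesa's. It assumes that dirty atoms and dirty PM4 state only need space when a draw is about to be emitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified state atom: only the fields the loop reads. */
struct atom {
	bool dirty;
	unsigned num_dw;
};

/* Dword estimate for the portion of the function shown in the hunk. */
static unsigned estimate_dwords(unsigned requested, unsigned already_used,
				const struct atom *atoms, size_t num_atoms,
				unsigned pm4_dirty_dwords, bool count_draw_in)
{
	unsigned num_dw = requested + already_used;
	size_t i;

	if (count_draw_in) {
		/* Dirty atoms are emitted at draw time, so they are only
		 * counted when a draw is coming. */
		for (i = 0; i < num_atoms; i++) {
			if (atoms[i].dirty)
				num_dw += atoms[i].num_dw;
		}
		num_dw += pm4_dirty_dwords;
	}
	return num_dw;
}
```

Before the patch, the atom loop ran unconditionally, so callers that were not about to draw reserved space for atoms they would not emit.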
[Mesa-dev] [Bug 78393] New: Black zebra like lines while playing games on open source drivers
https://bugs.freedesktop.org/show_bug.cgi?id=78393

          Priority: medium
            Bug ID: 78393
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: Black zebra like lines while playing games on open source drivers
          Severity: normal
    Classification: Unclassified
                OS: All
          Reporter: incarnated...@gmail.com
          Hardware: Other
            Status: NEW
           Version: unspecified
         Component: Mesa core
           Product: Mesa

Created attachment 98627
  --> https://bugs.freedesktop.org/attachment.cgi?id=98627&action=edit
Here is what I am talking about

When I play Guild Wars 2 on the open source drivers, everything works fine until I load up the world. The entire map then looks like a zebra (especially in the snowy areas). When I switch to proprietary drivers such as NVIDIA's, the lines go away. This happens on an NVIDIA GTX 460, a Radeon 5750, and Intel HD 3000 graphics.

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [PATCH 5/8] radeonsi: implement SAMPLEPOS fragment shader input
From: Marek Olšák marek.ol...@amd.com

The sample positions are read from a constant buffer.
---
 src/gallium/drivers/radeon/cayman_msaa.c      | 17 +++++++++++++++++
 src/gallium/drivers/radeon/r600_pipe_common.c |  1 +
 src/gallium/drivers/radeon/r600_pipe_common.h | 10 ++++++++++
 src/gallium/drivers/radeonsi/si_shader.c      | 23 +++++++++++++++++++++++
 src/gallium/drivers/radeonsi/si_state.c       | 25 +++++++++++++++++++++++++
 5 files changed, 76 insertions(+)

diff --git a/src/gallium/drivers/radeon/cayman_msaa.c b/src/gallium/drivers/radeon/cayman_msaa.c
index 8727f3e..47fc5c4 100644
--- a/src/gallium/drivers/radeon/cayman_msaa.c
+++ b/src/gallium/drivers/radeon/cayman_msaa.c
@@ -123,6 +123,23 @@ void cayman_get_sample_position(struct pipe_context *ctx, unsigned sample_count,
 	}
 }
 
+void cayman_init_msaa(struct pipe_context *ctx)
+{
+	struct r600_common_context *rctx = (struct r600_common_context*)ctx;
+	int i;
+
+	cayman_get_sample_position(ctx, 1, 0, rctx->sample_locations_1x[0]);
+
+	for (i = 0; i < 2; i++)
+		cayman_get_sample_position(ctx, 2, i, rctx->sample_locations_2x[i]);
+	for (i = 0; i < 4; i++)
+		cayman_get_sample_position(ctx, 4, i, rctx->sample_locations_4x[i]);
+	for (i = 0; i < 8; i++)
+		cayman_get_sample_position(ctx, 8, i, rctx->sample_locations_8x[i]);
+	for (i = 0; i < 16; i++)
+		cayman_get_sample_position(ctx, 16, i, rctx->sample_locations_16x[i]);
+}
+
 void cayman_emit_msaa_sample_locs(struct radeon_winsys_cs *cs, int nr_samples)
 {
 	switch (nr_samples) {
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
index 70c4d1a..4c6cf0e 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -154,6 +154,7 @@ bool r600_common_context_init(struct r600_common_context *rctx,
 	r600_init_context_texture_functions(rctx);
 	r600_streamout_init(rctx);
 	r600_query_init(rctx);
+	cayman_init_msaa(&rctx->b);
 
 	rctx->allocator_so_filled_size = u_suballocator_create(&rctx->b, 4096, 4,
 							       0, PIPE_USAGE_DEFAULT, TRUE);
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h
index 4c3e9ce..8862d31 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -357,6 +357,15 @@ struct r600_common_context {
 	boolean				saved_render_cond_cond;
 	unsigned			saved_render_cond_mode;
 
+	/* MSAA sample locations.
+	 * The first index is the sample index.
+	 * The second index is the coordinate: X, Y. */
+	float				sample_locations_1x[1][2];
+	float				sample_locations_2x[2][2];
+	float				sample_locations_4x[4][2];
+	float				sample_locations_8x[8][2];
+	float				sample_locations_16x[16][2];
+
 	/* Copy one resource to another using async DMA. */
 	void (*dma_copy)(struct pipe_context *ctx,
 			 struct pipe_resource *dst,
@@ -472,6 +481,7 @@ extern const uint32_t eg_sample_locs_4x[4];
 extern const unsigned eg_max_dist_4x;
 void cayman_get_sample_position(struct pipe_context *ctx, unsigned sample_count,
 				unsigned sample_index, float *out_value);
+void cayman_init_msaa(struct pipe_context *ctx);
 void cayman_emit_msaa_sample_locs(struct radeon_winsys_cs *cs, int nr_samples);
 void cayman_emit_msaa_config(struct radeon_winsys_cs *cs, int nr_samples,
 			     int ps_iter_samples);
diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 0195e54..d8b7b9c 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -559,6 +559,8 @@ static void declare_system_value(
 {
 	struct si_shader_context *si_shader_ctx =
 		si_shader_context(&radeon_bld->soa.bld_base);
+	struct lp_build_context *uint_bld = &radeon_bld->soa.bld_base.uint_bld;
+	struct gallivm_state *gallivm = &radeon_bld->gallivm;
 	LLVMValueRef value = 0;
 
 	switch (decl->Semantic.Name) {
@@ -576,6 +578,27 @@ static void declare_system_value(
 		value = get_sample_id(radeon_bld);
 		break;
 
+	case TGSI_SEMANTIC_SAMPLEPOS:
+	{
+		LLVMBuilderRef builder = gallivm->builder;
+		LLVMValueRef desc = LLVMGetParam(si_shader_ctx->radeon_bld.main_fn, SI_PARAM_CONST);
+		LLVMValueRef buf_index = lp_build_const_int32(gallivm, NUM_PIPE_CONST_BUFFERS);
+		LLVMValueRef resource = build_indexed_load(si_shader_ctx, desc, buf_index);
+
+
[Mesa-dev] [PATCH 6/8] radeonsi: interpolate varyings at sample when full sample shading is enabled
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/drivers/radeonsi/si_shader.c | 24 ++++++++++++------------
 src/gallium/drivers/radeonsi/si_shader.h |  1 +
 src/gallium/drivers/radeonsi/si_state.c  |  2 ++
 3 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index d8b7b9c..5bd3057 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -423,27 +423,27 @@ static void declare_input_fs(
 					   shader->input[input_index].param_offset);
 
 	switch (decl->Interp.Interpolate) {
-	case TGSI_INTERPOLATE_COLOR:
-		if (si_shader_ctx->shader->key.ps.flatshade) {
-			interp_param = 0;
-		} else {
-			if (decl->Interp.Centroid)
-				interp_param = LLVMGetParam(main_fn, SI_PARAM_PERSP_CENTROID);
-			else
-				interp_param = LLVMGetParam(main_fn, SI_PARAM_PERSP_CENTER);
-		}
-		break;
 	case TGSI_INTERPOLATE_CONSTANT:
 		interp_param = 0;
 		break;
 	case TGSI_INTERPOLATE_LINEAR:
-		if (decl->Interp.Centroid)
+		if (si_shader_ctx->shader->key.ps.interp_at_sample)
+			interp_param = LLVMGetParam(main_fn, SI_PARAM_LINEAR_SAMPLE);
+		else if (decl->Interp.Centroid)
 			interp_param = LLVMGetParam(main_fn, SI_PARAM_LINEAR_CENTROID);
 		else
 			interp_param = LLVMGetParam(main_fn, SI_PARAM_LINEAR_CENTER);
 		break;
+	case TGSI_INTERPOLATE_COLOR:
+		if (si_shader_ctx->shader->key.ps.flatshade) {
+			interp_param = 0;
+			break;
+		}
+		/* fall through to perspective */
 	case TGSI_INTERPOLATE_PERSPECTIVE:
-		if (decl->Interp.Centroid)
+		if (si_shader_ctx->shader->key.ps.interp_at_sample)
+			interp_param = LLVMGetParam(main_fn, SI_PARAM_PERSP_SAMPLE);
+		else if (decl->Interp.Centroid)
 			interp_param = LLVMGetParam(main_fn, SI_PARAM_PERSP_CENTROID);
 		else
 			interp_param = LLVMGetParam(main_fn, SI_PARAM_PERSP_CENTER);
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index 82382ec..14b5187 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -160,6 +160,7 @@ union si_shader_key {
 		unsigned	color_two_side:1;
 		unsigned	alpha_func:3;
 		unsigned	flatshade:1;
+		unsigned	interp_at_sample:1;
 		unsigned	alpha_to_one:1;
 	} ps;
 	struct {
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 3c1af06..09bf947 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2115,6 +2115,8 @@ static INLINE void si_shader_selector_key(struct pipe_context *ctx,
 		if (sctx->queued.named.rasterizer) {
 			key->ps.color_two_side = sctx->queued.named.rasterizer->two_side;
 			key->ps.flatshade = sctx->queued.named.rasterizer->flatshade;
+			key->ps.interp_at_sample = sctx->framebuffer.nr_samples > 1 &&
+						   sctx->ps_iter_samples == sctx->framebuffer.nr_samples;
 
 			if (sctx->queued.named.blend) {
 				key->ps.alpha_to_one = sctx->queued.named.blend->alpha_to_one &&
-- 
1.9.1
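The two decisions this patch makes can be written as small pure functions. This is a rough sketch, not driver code: interp_loc, key_interp_at_sample and pick_interp_loc are names invented here. It assumes only what the diff shows, namely that the key bit is set when the framebuffer is multisampled and every sample is shaded, and that the key bit takes priority over the centroid qualifier.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three barycentric inputs a varying can use. */
enum interp_loc {
	INTERP_CENTER,
	INTERP_CENTROID,
	INTERP_SAMPLE
};

/* Mirrors the shader key computation added to si_shader_selector_key(). */
static bool key_interp_at_sample(unsigned nr_samples, unsigned ps_iter_samples)
{
	return nr_samples > 1 && ps_iter_samples == nr_samples;
}

/* Mirrors the if / else if / else chain in declare_input_fs(). */
static enum interp_loc pick_interp_loc(bool interp_at_sample, bool centroid)
{
	if (interp_at_sample)
		return INTERP_SAMPLE;
	return centroid ? INTERP_CENTROID : INTERP_CENTER;
}
```

For example, a 4x framebuffer with ps_iter_samples == 4 selects per-sample interpolation even for a varying declared centroid, while ps_iter_samples == 2 leaves the previous centroid/center choice untouched.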
[Mesa-dev] [PATCH 2/8] radeon: add basic register setup for per-sample shading
From: Marek Olšák marek.ol...@amd.com

Only for Cayman, SI, CIK.
---
 src/gallium/drivers/r600/evergreen_state.c | 6 ++----
 src/gallium/drivers/radeon/cayman_msaa.c   | 7 ++++++-
 src/gallium/drivers/radeon/r600d_common.h  | 3 +++
 src/gallium/drivers/radeonsi/si_state.c    | 6 ++----
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 03dade0..3a3194b 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -1397,7 +1397,7 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
 	if (rctx->b.chip_class == EVERGREEN)
 		rctx->framebuffer.atom.num_dw += 14; /* Evergreen */
 	else
-		rctx->framebuffer.atom.num_dw += 25; /* Cayman */
+		rctx->framebuffer.atom.num_dw += 28; /* Cayman */
 
 	/* Colorbuffers. */
 	rctx->framebuffer.atom.num_dw += state->nr_cbufs * 23;
@@ -1670,7 +1670,7 @@ static void evergreen_emit_framebuffer_state(struct r600_context *rctx, struct r
 		evergreen_emit_msaa_state(rctx, rctx->framebuffer.nr_samples);
 	} else {
 		cayman_emit_msaa_sample_locs(cs, rctx->framebuffer.nr_samples);
-		cayman_emit_msaa_config(cs, rctx->framebuffer.nr_samples);
+		cayman_emit_msaa_config(cs, rctx->framebuffer.nr_samples, 1);
 	}
 }
 
@@ -2172,8 +2172,6 @@ void cayman_init_common_regs(struct r600_command_buffer *cb,
 
 	r600_store_config_reg(cb, R_008D8C_SQ_DYN_GPR_CNTL_PS_FLUSH_REQ, (1 << 8));
 
-	r600_store_context_reg(cb, R_028A4C_PA_SC_MODE_CNTL_1, 0);
-
 	r600_store_context_reg_seq(cb, R_028350_SX_MISC, 2);
 	r600_store_value(cb, 0);
 	r600_store_value(cb, S_028354_SURFACE_SYNC_MASK(0xf));
diff --git a/src/gallium/drivers/radeon/cayman_msaa.c b/src/gallium/drivers/radeon/cayman_msaa.c
index fa7deb6..8727f3e 100644
--- a/src/gallium/drivers/radeon/cayman_msaa.c
+++ b/src/gallium/drivers/radeon/cayman_msaa.c
@@ -191,6 +191,8 @@ void cayman_emit_msaa_config(struct radeon_winsys_cs *cs, int nr_samples,
 		};
 
 		unsigned log_samples = util_logbase2(nr_samples);
+		unsigned log_ps_iter_samples =
+			util_logbase2(util_next_power_of_two(ps_iter_samples));
 
 		r600_write_context_reg_seq(cs, CM_R_028BDC_PA_SC_LINE_CNTL, 2);
 		radeon_emit(cs, S_028BDC_LAST_PIXEL(1) |
@@ -201,11 +203,13 @@ void cayman_emit_msaa_config(struct radeon_winsys_cs *cs, int nr_samples,
 
 		r600_write_context_reg(cs, CM_R_028804_DB_EQAA,
 				       S_028804_MAX_ANCHOR_SAMPLES(log_samples) |
-				       S_028804_PS_ITER_SAMPLES(log_samples) |
+				       S_028804_PS_ITER_SAMPLES(log_ps_iter_samples) |
 				       S_028804_MASK_EXPORT_NUM_SAMPLES(log_samples) |
 				       S_028804_ALPHA_TO_MASK_NUM_SAMPLES(log_samples) |
 				       S_028804_HIGH_QUALITY_INTERSECTIONS(1) |
 				       S_028804_STATIC_ANCHOR_ASSOCIATIONS(1));
+		r600_write_context_reg(cs, EG_R_028A4C_PA_SC_MODE_CNTL_1,
+				       EG_S_028A4C_PS_ITER_SAMPLE(ps_iter_samples > 1));
 	} else {
 		r600_write_context_reg_seq(cs, CM_R_028BDC_PA_SC_LINE_CNTL, 2);
 		radeon_emit(cs, S_028BDC_LAST_PIXEL(1)); /* CM_R_028BDC_PA_SC_LINE_CNTL */
@@ -214,5 +218,6 @@ void cayman_emit_msaa_config(struct radeon_winsys_cs *cs, int nr_samples,
 		r600_write_context_reg(cs, CM_R_028804_DB_EQAA,
 				       S_028804_HIGH_QUALITY_INTERSECTIONS(1) |
 				       S_028804_STATIC_ANCHOR_ASSOCIATIONS(1));
+		r600_write_context_reg(cs, EG_R_028A4C_PA_SC_MODE_CNTL_1, 0);
 	}
 }
diff --git a/src/gallium/drivers/radeon/r600d_common.h b/src/gallium/drivers/radeon/r600d_common.h
index 1172af0..fa6131f 100644
--- a/src/gallium/drivers/radeon/r600d_common.h
+++ b/src/gallium/drivers/radeon/r600d_common.h
@@ -160,6 +160,9 @@
 #define   G_028B98_STREAM_3_BUFFER_EN(x)		(((x) >> 12) & 0x0F)
 #define   C_028B98_STREAM_3_BUFFER_EN			0xFFFF0FFF
 
+#define EG_R_028A4C_PA_SC_MODE_CNTL_1			0x028A4C
+#define   EG_S_028A4C_PS_ITER_SAMPLE(x)			(((x) & 0x1) << 16)
+
 #define CM_R_028804_DB_EQAA				0x00028804
 #define   S_028804_MAX_ANCHOR_SAMPLES(x)		(((x) & 0x7) << 0)
 #define   S_028804_PS_ITER_SAMPLES(x)			(((x) & 0x7) << 4)
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 4cca2cc..38a2acc 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++
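For a feel of what ends up in the two register fields, here is a small stand-alone sketch. The helpers next_pow2 and log2_floor are simplified stand-ins written for this note; they are assumed to behave like Mesa's util_next_power_of_two() and util_logbase2() for the small sample counts involved, and are not the Mesa implementations.

```c
#include <assert.h>

/* Stand-in for util_next_power_of_two(); valid for x <= 2^31. */
static unsigned next_pow2(unsigned x)
{
	unsigned p = 1;

	while (p < x)
		p <<= 1;
	return p;
}

/* Stand-in for util_logbase2(): floor(log2(x)), with 0 mapped to 0. */
static unsigned log2_floor(unsigned x)
{
	unsigned l = 0;

	while (x > 1) {
		x >>= 1;
		l++;
	}
	return l;
}

/* Value fed to S_028804_PS_ITER_SAMPLES() in the patch. */
static unsigned log_ps_iter_samples(unsigned ps_iter_samples)
{
	return log2_floor(next_pow2(ps_iter_samples));
}

/* Value fed to EG_S_028A4C_PS_ITER_SAMPLE() in the patch. */
static unsigned ps_iter_sample_enable(unsigned ps_iter_samples)
{
	return ps_iter_samples > 1;
}
```

Rounding up to a power of two first means a request for 3 samples is encoded like a request for 4, rather than being rounded down to 2.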
[Mesa-dev] [PATCH 0/8] Almost-done ARB_sample_shading
This series adds support for ARB_sample_shading.

The problem is that enabling the SAMPLEMASK fragment shader output by setting DB_SHADER_CONTROL.MASK_EXPORT_ENABLE hangs the GPU. The output is enabled in the same way as the depth and stencil outputs. If anybody has an idea about what I'm doing wrong, please let me know.

Everything else works.

Marek Olšák (8):
      radeon: split cayman_emit_msaa_state into 2 functions
      radeon: add basic register setup for per-sample shading
      radeonsi: implement set_min_samples
      radeonsi: implement SAMPLEID fragment shader input
      radeonsi: implement SAMPLEPOS fragment shader input
      radeonsi: interpolate varyings at sample when full sample shading is enabled
      radeonsi: implement SAMPLEMASK fragment shader output
      radeonsi: enable ARB_sample_shading

Please review.

Marek
[Mesa-dev] [PATCH 3/8] radeonsi: implement set_min_samples
From: Marek Olšák marek.ol...@amd.com

This is how per-sample shading is enabled.
---
 src/gallium/drivers/radeonsi/si_pipe.c  |  2 ++
 src/gallium/drivers/radeonsi/si_pipe.h  |  4 ++++
 src/gallium/drivers/radeonsi/si_state.c | 30 ++++++++++++++++++++++++++++--
 src/gallium/drivers/radeonsi/si_state.h |  1 +
 4 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index 22cd5b9..9fc1ea6 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -106,6 +106,8 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen, void *
 	/* Initialize cache_flush. */
 	sctx->cache_flush = si_atom_cache_flush;
 	sctx->atoms.cache_flush = &sctx->cache_flush;
+	sctx->msaa_config = si_atom_msaa_config;
+	sctx->atoms.msaa_config = &sctx->msaa_config;
 
 	sctx->atoms.streamout_begin = &sctx->b.streamout.begin_atom;
 	sctx->atoms.streamout_enable = &sctx->b.streamout.enable_atom;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index a74bbcf..c3b7fb6 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -110,6 +110,7 @@ struct si_context {
 			struct r600_atom *streamout_begin;
 			struct r600_atom *streamout_enable; /* must be after streamout_begin */
 			struct r600_atom *framebuffer;
+			struct r600_atom *msaa_config;
 		};
 		struct r600_atom *array[0];
 	} atoms;
@@ -132,6 +133,9 @@ struct si_context {
 	struct r600_resource		*border_color_table;
 	unsigned			border_color_offset;
 
+	struct r600_atom		msaa_config;
+	int				ps_iter_samples;
+
 	unsigned default_ps_gprs, default_vs_gprs;
 
 	/* Below are variables from the old r600_context.
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 38a2acc..fd9de05 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1921,8 +1921,9 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
 	sctx->framebuffer.atom.num_dw = state->nr_cbufs*15 + (8 - state->nr_cbufs)*3;
 	sctx->framebuffer.atom.num_dw += state->zsbuf ? 23 : 4;
 	sctx->framebuffer.atom.num_dw += 3; /* WINDOW_SCISSOR_BR */
-	sctx->framebuffer.atom.num_dw += 28; /* MSAA */
+	sctx->framebuffer.atom.num_dw += 18; /* MSAA sample locations */
 	sctx->framebuffer.atom.dirty = true;
+	sctx->msaa_config.dirty = true;
 }
 
 static void si_emit_framebuffer_state(struct si_context *sctx, struct r600_atom *atom)
@@ -2026,7 +2027,30 @@ static void si_emit_framebuffer_state(struct si_context *sctx, struct r600_atom
 		       S_028208_BR_X(state->width) | S_028208_BR_Y(state->height));
 
 	cayman_emit_msaa_sample_locs(cs, sctx->framebuffer.nr_samples);
-	cayman_emit_msaa_config(cs, sctx->framebuffer.nr_samples, 1);
+}
+
+static void si_emit_msaa_config(struct r600_common_context *rctx, struct r600_atom *atom)
+{
+	struct si_context *sctx = (struct si_context *)rctx;
+	struct radeon_winsys_cs *cs = sctx->b.rings.gfx.cs;
+
+	cayman_emit_msaa_config(cs, sctx->framebuffer.nr_samples,
+				sctx->ps_iter_samples);
+}
+
+const struct r600_atom si_atom_msaa_config = { si_emit_msaa_config, 10 }; /* number of CS dwords */
+
+static void si_set_min_samples(struct pipe_context *ctx, unsigned min_samples)
+{
+	struct si_context *sctx = (struct si_context *)ctx;
+
+	if (sctx->ps_iter_samples == min_samples)
+		return;
+
+	sctx->ps_iter_samples = min_samples;
+
+	if (sctx->framebuffer.nr_samples > 1)
+		sctx->msaa_config.dirty = true;
 }
 
 /*
@@ -3026,6 +3050,8 @@ void si_init_state_functions(struct si_context *sctx)
 	sctx->b.b.texture_barrier = si_texture_barrier;
 	sctx->b.b.set_polygon_stipple = si_set_polygon_stipple;
 
+	sctx->b.b.set_min_samples = si_set_min_samples;
+
 	sctx->b.dma_copy = si_dma_copy;
 	sctx->b.set_occlusion_query_state = si_set_occlusion_query_state;
 	sctx->b.need_gfx_cs_space = si_need_gfx_cs_space;
diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h
index 0e0e480..e0fd2ff 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -239,6 +239,7 @@ unsigned si_tile_mode_index(struct r600_texture *rtex, unsigned level, bool sten
 
 /* si_state_draw.c */
 extern const struct r600_atom si_atom_cache_flush;
+extern const struct r600_atom si_atom_msaa_config;
 void si_emit_cache_flush(struct r600_common_context *sctx, struct r600_atom *atom);
 void si_draw_vbo(struct
[Mesa-dev] [PATCH 8/8] radeonsi: enable ARB_sample_shading
From: Marek Olšák marek.ol...@amd.com

---
 docs/GL3.txt                           | 2 +-
 docs/relnotes/10.3.html                | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index ea16e3f..0ed26bf 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -114,7 +114,7 @@ GL 4.0:
       - Interpolation functions                        started
       - New overload resolution rules                  not started
   GL_ARB_gpu_shader_fp64                               not started
-  GL_ARB_sample_shading                                DONE (i965, nv50, nvc0)
+  GL_ARB_sample_shading                                DONE (i965, nv50, nvc0, radeonsi)
   GL_ARB_shader_subroutine                             not started
   GL_ARB_tessellation_shader                           not started
   GL_ARB_texture_buffer_object_rgb32                   DONE (i965, nvc0, r600, radeonsi, softpipe)
diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html
index 48b2052..bd7e409 100644
--- a/docs/relnotes/10.3.html
+++ b/docs/relnotes/10.3.html
@@ -46,6 +46,7 @@ Note: some of the new features are only available with certain drivers.
 <ul>
 <li>GL_ARB_draw_indirect on radeonsi</li>
 <li>GL_ARB_multi_draw_indirect on radeonsi</li>
+<li>GL_ARB_sample_shading on radeonsi</li>
 <li>GL_ARB_stencil_texturing on nv50, nvc0, r600, and radeonsi</li>
 <li>GL_ARB_texture_cube_map_array on radeonsi</li>
 </ul>
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index 9fc1ea6..fd2329d 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -213,6 +213,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
 	case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
 	case PIPE_CAP_CUBE_MAP_ARRAY:
 	case PIPE_CAP_DRAW_INDIRECT:
+	case PIPE_CAP_SAMPLE_SHADING:
 		return 1;
 
 	case PIPE_CAP_TEXTURE_MULTISAMPLE:
@@ -246,7 +247,6 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
 	case PIPE_CAP_TGSI_TEXCOORD:
 	case PIPE_CAP_FAKE_SW_MSAA:
 	case PIPE_CAP_TEXTURE_QUERY_LOD:
-	case PIPE_CAP_SAMPLE_SHADING:
 		return 0;
 
 	case PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK:
-- 
1.9.1
[Mesa-dev] [PATCH 4/8] radeonsi: implement SAMPLEID fragment shader input
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/drivers/radeonsi/si_shader.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 440bcba..0195e54 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -531,6 +531,18 @@ static void declare_input_fs(
 	}
 }
 
+static LLVMValueRef get_sample_id(struct radeon_llvm_context *radeon_bld)
+{
+	struct gallivm_state *gallivm = &radeon_bld->gallivm;
+	LLVMValueRef value = LLVMGetParam(radeon_bld->main_fn,
+					  SI_PARAM_ANCILLARY);
+	value = LLVMBuildLShr(gallivm->builder, value,
+			      lp_build_const_int32(gallivm, 8), "");
+	value = LLVMBuildAnd(gallivm->builder, value,
+			     lp_build_const_int32(gallivm, 0xf), "");
+	return value;
+}
+
 static LLVMValueRef load_const(LLVMBuilderRef builder, LLVMValueRef resource,
 			       LLVMValueRef offset, LLVMTypeRef return_type)
 {
@@ -560,6 +572,10 @@ static void declare_system_value(
 				     si_shader_ctx->param_vertex_id);
 		break;
 
+	case TGSI_SEMANTIC_SAMPLEID:
+		value = get_sample_id(radeon_bld);
+		break;
+
 	default:
 		assert(!"unknown system value");
 		return;
@@ -2189,7 +2205,7 @@ static void create_function(struct si_shader_context *si_shader_ctx)
 		params[SI_PARAM_POS_Z_FLOAT] = f32;
 		params[SI_PARAM_POS_W_FLOAT] = f32;
 		params[SI_PARAM_FRONT_FACE] = f32;
-		params[SI_PARAM_ANCILLARY] = f32;
+		params[SI_PARAM_ANCILLARY] = i32;
 		params[SI_PARAM_SAMPLE_COVERAGE] = f32;
 		params[SI_PARAM_POS_FIXED_PT] = f32;
 		num_params = SI_PARAM_POS_FIXED_PT+1;
-- 
1.9.1
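The get_sample_id() helper in this patch shifts the ANCILLARY input right by 8 and masks the result with 0xf. A CPU-side sketch of the same bit extraction, assuming (as the shift and mask imply) that the sample index sits in bits 8..11 of that input; sample_id_from_ancillary is a hypothetical name used only here:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code: mirrors the LShr-by-8 followed by
 * And-with-0xf sequence that the patch builds with LLVM. */
static uint32_t sample_id_from_ancillary(uint32_t ancillary)
{
	return (ancillary >> 8) & 0xf;
}
```

This also suggests why the patch changes the parameter type from f32 to i32: the shift and mask are integer operations on the raw bits.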
[Mesa-dev] [PATCH 1/8] radeon: split cayman_emit_msaa_state into 2 functions
From: Marek Olšák marek.ol...@amd.com

The other function will be split up from the framebuffer state.
---
 src/gallium/drivers/r600/evergreen_state.c    |  3 ++-
 src/gallium/drivers/radeon/cayman_msaa.c      | 26 +++++++++++++++-----------
 src/gallium/drivers/radeon/r600_pipe_common.h |  4 +++-
 src/gallium/drivers/radeonsi/si_state.c       |  3 ++-
 4 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 7b1a44b..03dade0 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -1669,7 +1669,8 @@ static void evergreen_emit_framebuffer_state(struct r600_context *rctx, struct r
 	if (rctx->b.chip_class == EVERGREEN) {
 		evergreen_emit_msaa_state(rctx, rctx->framebuffer.nr_samples);
 	} else {
-		cayman_emit_msaa_state(cs, rctx->framebuffer.nr_samples);
+		cayman_emit_msaa_sample_locs(cs, rctx->framebuffer.nr_samples);
+		cayman_emit_msaa_config(cs, rctx->framebuffer.nr_samples);
 	}
 }
 
diff --git a/src/gallium/drivers/radeon/cayman_msaa.c b/src/gallium/drivers/radeon/cayman_msaa.c
index 9e6ceda..fa7deb6 100644
--- a/src/gallium/drivers/radeon/cayman_msaa.c
+++ b/src/gallium/drivers/radeon/cayman_msaa.c
@@ -123,27 +123,20 @@ void cayman_get_sample_position(struct pipe_context *ctx, unsigned sample_count,
 	}
 }
 
-void cayman_emit_msaa_state(struct radeon_winsys_cs *cs, int nr_samples)
+void cayman_emit_msaa_sample_locs(struct radeon_winsys_cs *cs, int nr_samples)
 {
-	unsigned max_dist = 0;
-
 	switch (nr_samples) {
-	default:
-		nr_samples = 0;
-		break;
 	case 2:
 		r600_write_context_reg(cs, CM_R_028BF8_PA_SC_AA_SAMPLE_LOCS_PIXEL_X0Y0_0, eg_sample_locs_2x[0]);
 		r600_write_context_reg(cs, CM_R_028C08_PA_SC_AA_SAMPLE_LOCS_PIXEL_X1Y0_0, eg_sample_locs_2x[1]);
 		r600_write_context_reg(cs, CM_R_028C18_PA_SC_AA_SAMPLE_LOCS_PIXEL_X0Y1_0, eg_sample_locs_2x[2]);
 		r600_write_context_reg(cs, CM_R_028C28_PA_SC_AA_SAMPLE_LOCS_PIXEL_X1Y1_0, eg_sample_locs_2x[3]);
-		max_dist = eg_max_dist_2x;
 		break;
 	case 4:
 		r600_write_context_reg(cs, CM_R_028BF8_PA_SC_AA_SAMPLE_LOCS_PIXEL_X0Y0_0, eg_sample_locs_4x[0]);
 		r600_write_context_reg(cs, CM_R_028C08_PA_SC_AA_SAMPLE_LOCS_PIXEL_X1Y0_0, eg_sample_locs_4x[1]);
 		r600_write_context_reg(cs, CM_R_028C18_PA_SC_AA_SAMPLE_LOCS_PIXEL_X0Y1_0, eg_sample_locs_4x[2]);
 		r600_write_context_reg(cs, CM_R_028C28_PA_SC_AA_SAMPLE_LOCS_PIXEL_X1Y1_0, eg_sample_locs_4x[3]);
-		max_dist = eg_max_dist_4x;
 		break;
 	case 8:
 		r600_write_context_reg_seq(cs, CM_R_028BF8_PA_SC_AA_SAMPLE_LOCS_PIXEL_X0Y0_0, 14);
@@ -161,7 +154,6 @@ void cayman_emit_msaa_state(struct radeon_winsys_cs *cs, int nr_samples)
 		radeon_emit(cs, 0);
 		radeon_emit(cs, cm_sample_locs_8x[3]);
 		radeon_emit(cs, cm_sample_locs_8x[7]);
-		max_dist = cm_max_dist_8x;
 		break;
 	case 16:
 		r600_write_context_reg_seq(cs, CM_R_028BF8_PA_SC_AA_SAMPLE_LOCS_PIXEL_X0Y0_0, 16);
@@ -181,18 +173,30 @@ void cayman_emit_msaa_state(struct radeon_winsys_cs *cs, int nr_samples)
 		radeon_emit(cs, cm_sample_locs_16x[7]);
 		radeon_emit(cs, cm_sample_locs_16x[11]);
 		radeon_emit(cs, cm_sample_locs_16x[15]);
-		max_dist = cm_max_dist_16x;
 		break;
 	}
+}
 
+void cayman_emit_msaa_config(struct radeon_winsys_cs *cs, int nr_samples,
+			     int ps_iter_samples)
+{
 	if (nr_samples > 1) {
+		/* indexed by log2(nr_samples) */
+		unsigned max_dist[] = {
+			0,
+			eg_max_dist_2x,
+			eg_max_dist_4x,
+			cm_max_dist_8x,
+			cm_max_dist_16x
+		};
+
 		unsigned log_samples = util_logbase2(nr_samples);
 
 		r600_write_context_reg_seq(cs, CM_R_028BDC_PA_SC_LINE_CNTL, 2);
 		radeon_emit(cs, S_028BDC_LAST_PIXEL(1) |
 				S_028BDC_EXPAND_LINE_WIDTH(1)); /* CM_R_028BDC_PA_SC_LINE_CNTL */
 		radeon_emit(cs, S_028BE0_MSAA_NUM_SAMPLES(log_samples) |
-				S_028BE0_MAX_SAMPLE_DIST(max_dist) |
+				S_028BE0_MAX_SAMPLE_DIST(max_dist[log_samples]) |
 				S_028BE0_MSAA_EXPOSED_SAMPLES(log_samples)); /* CM_R_028BE0_PA_SC_AA_CONFIG */
 
 		r600_write_context_reg(cs, CM_R_028804_DB_EQAA,
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h
index ada124f..4c3e9ce
[Mesa-dev] [PATCH 7/8] radeonsi: implement SAMPLEMASK fragment shader output
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/drivers/radeonsi/si_shader.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 5bd3057..d0d9837 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1348,7 +1348,7 @@ static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context * bld_base)
 	LLVMValueRef args[9];
 	LLVMValueRef last_args[9] = { 0 };
 	unsigned semantic_name;
-	int depth_index = -1, stencil_index = -1;
+	int depth_index = -1, stencil_index = -1, samplemask_index = -1;
 	int i;
 
 	while (!tgsi_parse_end_of_tokens(parse)) {
@@ -1381,6 +1381,9 @@ static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context * bld_base)
 			case TGSI_SEMANTIC_STENCIL:
 				stencil_index = index;
 				continue;
+			case TGSI_SEMANTIC_SAMPLEMASK:
+				samplemask_index = index;
+				continue;
 			case TGSI_SEMANTIC_COLOR:
 				target = V_008DFC_SQ_EXP_MRT + d->Semantic.Index;
 				if (si_shader_ctx->shader->key.ps.alpha_to_one)
@@ -1438,7 +1441,7 @@ static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context * bld_base)
 		}
 	}
 
-	if (depth_index >= 0 || stencil_index >= 0) {
+	if (depth_index >= 0 || stencil_index >= 0 || samplemask_index >= 0) {
 		LLVMValueRef out_ptr;
 		unsigned mask = 0;
 
@@ -1468,7 +1471,16 @@ static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context * bld_base)
 				S_02880C_STENCIL_TEST_VAL_EXPORT_ENABLE(1);
 		}
 
-		if (stencil_index >= 0)
+		if (samplemask_index >= 0) {
+			out_ptr = si_shader_ctx->radeon_bld.soa.outputs[samplemask_index][0];
+			args[7] = LLVMBuildLoad(base->gallivm->builder, out_ptr, "");
+			mask |= 0xf; /* Set all components. */
+			si_shader_ctx->shader->db_shader_control |= S_02880C_MASK_EXPORT_ENABLE(1);
+		}
+
+		if (samplemask_index >= 0)
+			si_shader_ctx->shader->spi_shader_z_format = V_028710_SPI_SHADER_32_ABGR;
+		else if (stencil_index >= 0)
 			si_shader_ctx->shader->spi_shader_z_format = V_028710_SPI_SHADER_32_GR;
 		else
 			si_shader_ctx->shader->spi_shader_z_format = V_028710_SPI_SHADER_32_R;
-- 
1.9.1
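The export format choice at the end of this patch is a simple priority chain. A stand-alone sketch, with made-up enumerators (Z_EXPORT_32_R and friends are labels for this note; their numeric values are not the hardware encodings of V_028710_SPI_SHADER_*):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three Z export formats named in the diff. */
enum z_export_format {
	Z_EXPORT_32_R,     /* depth only */
	Z_EXPORT_32_GR,    /* depth + stencil */
	Z_EXPORT_32_ABGR   /* all four channels, needed for the sample mask */
};

/* Mirrors the if / else if / else chain that sets spi_shader_z_format. */
static enum z_export_format pick_z_export_format(bool has_stencil,
						 bool has_samplemask)
{
	if (has_samplemask)
		return Z_EXPORT_32_ABGR;
	if (has_stencil)
		return Z_EXPORT_32_GR;
	return Z_EXPORT_32_R;
}
```

The sample mask wins because it lives in channel B (args[7]), which only the four-channel format carries.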
[Mesa-dev] [PATCH 8/8] radeonsi: prepare depth export registers at compile time
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/drivers/radeonsi/si_shader.c     |  8 ++++++++
 src/gallium/drivers/radeonsi/si_shader.h     |  2 ++
 src/gallium/drivers/radeonsi/si_state_draw.c | 18 ++++--------------
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 363218f..440bcba 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1415,6 +1415,7 @@ static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context * bld_base)
 			out_ptr = si_shader_ctx->radeon_bld.soa.outputs[depth_index][2];
 			args[5] = LLVMBuildLoad(base->gallivm->builder, out_ptr, "");
 			mask |= 0x1;
+			si_shader_ctx->shader->db_shader_control |= S_02880C_Z_EXPORT_ENABLE(1);
 		}
 
 		if (stencil_index >= 0) {
@@ -1424,8 +1425,15 @@ static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context * bld_base)
 			 * breaks some stencil piglit tests
 			 */
 			mask |= 0x3;
+			si_shader_ctx->shader->db_shader_control |=
+				S_02880C_STENCIL_TEST_VAL_EXPORT_ENABLE(1);
 		}
 
+		if (stencil_index >= 0)
+			si_shader_ctx->shader->spi_shader_z_format = V_028710_SPI_SHADER_32_GR;
+		else
+			si_shader_ctx->shader->spi_shader_z_format = V_028710_SPI_SHADER_32_R;
+
 		/* Specify which components to enable */
 		args[0] = lp_build_const_int32(base->gallivm, mask);
 
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index eb056ac..82382ec 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -181,6 +181,8 @@ struct si_pipe_shader {
 	unsigned		lds_size;
 	unsigned		spi_ps_input_ena;
 	unsigned		spi_shader_col_format;
+	unsigned		spi_shader_z_format;
+	unsigned		db_shader_control;
 	unsigned		cb_shader_mask;
 	bool			cb0_is_integer;
 	unsigned		sprite_coord_enable;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index fce799c..8b27588 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -233,7 +233,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
 	struct si_pm4_state *pm4;
 	unsigned i, spi_ps_in_control, db_shader_control;
 	unsigned num_sgprs, num_user_sgprs;
-	unsigned spi_baryc_cntl = 0, spi_ps_input_ena, spi_shader_z_format;
+	unsigned spi_baryc_cntl = 0, spi_ps_input_ena;
 	uint64_t va;
 
 	si_pm4_delete_state(sctx, ps, shader->pm4);
@@ -264,12 +264,8 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
 		}
 	}
 
-	for (i = 0; i < shader->shader.noutput; i++) {
-		if (shader->shader.output[i].name == TGSI_SEMANTIC_POSITION)
-			db_shader_control |= S_02880C_Z_EXPORT_ENABLE(1);
-		if (shader->shader.output[i].name == TGSI_SEMANTIC_STENCIL)
-			db_shader_control |= S_02880C_STENCIL_TEST_VAL_EXPORT_ENABLE(1);
-	}
+	db_shader_control |= shader->db_shader_control;
+
 	if (shader->shader.uses_kill || shader->key.ps.alpha_func != PIPE_FUNC_ALWAYS)
 		db_shader_control |= S_02880C_KILL_ENABLE(1);
 
@@ -292,13 +288,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
 	si_pm4_set_reg(pm4, R_0286D0_SPI_PS_INPUT_ADDR, spi_ps_input_ena);
 	si_pm4_set_reg(pm4, R_0286D8_SPI_PS_IN_CONTROL, spi_ps_in_control);
 
-	if (G_02880C_STENCIL_TEST_VAL_EXPORT_ENABLE(db_shader_control))
-		spi_shader_z_format = V_028710_SPI_SHADER_32_GR;
-	else if (G_02880C_Z_EXPORT_ENABLE(db_shader_control))
-		spi_shader_z_format = V_028710_SPI_SHADER_32_R;
-	else
-		spi_shader_z_format = 0;
-	si_pm4_set_reg(pm4, R_028710_SPI_SHADER_Z_FORMAT, spi_shader_z_format);
+	si_pm4_set_reg(pm4, R_028710_SPI_SHADER_Z_FORMAT, shader->spi_shader_z_format);
 	si_pm4_set_reg(pm4, R_028714_SPI_SHADER_COL_FORMAT, shader->spi_shader_col_format);
 	si_pm4_set_reg(pm4, R_02823C_CB_SHADER_MASK, shader->cb_shader_mask);
 
-- 
1.9.1
Re: [Mesa-dev] [PATCH 3/8] radeonsi: implement set_min_samples
On Wed, May 7, 2014 at 10:00 AM, Marek Olšák mar...@gmail.com wrote:
> +static void si_set_min_samples(struct pipe_context *ctx, unsigned min_samples)
> +{
> +	struct si_context *sctx = (struct si_context *)ctx;
> +
> +	if (sctx->ps_iter_samples == min_samples)
> +		return;
> +
> +	sctx->ps_iter_samples = min_samples;
> +
> +	if (sctx->framebuffer.nr_samples > 1)
> +		sctx->msaa_config.dirty = true;
>  }

I don't know your HW, but keep in mind that there's nothing preventing one from using gl_SamplePosition/etc without a MS framebuffer. With the piglit tests, that's passing '1' or '0' as the arguments for the number of samples.

  -ilia
[Mesa-dev] [Bug 78393] Black zebra like lines while playing games on open source drivers
https://bugs.freedesktop.org/show_bug.cgi?id=78393

Andreas Boll andreas.boll@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #98627|text/plain                  |image/png
          mime type|                            |

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] Mixing of hardware and software renderers
On 06.05.2014 06:51, Patrick McMunn wrote:
> I'm using some older hardware - an ATI Radeon 9200 - which can only handle up to OpenGL 1.2. I was wondering if it's possible to use the hardware renderer generally and have the driver hand off the handling of functions which my video card can't handle (such as functions from a higher OpenGL version) to the software renderer and then have the software renderer hand control back to the hardware renderer once it's finished. If this isn't currently possible, is this perhaps a feature which might appear in the future?

No, this is neither useful nor practical (see Dave's answer). IIRC, though, the card should support GL 1.3, and everything required by GL 1.4 except ARB_shadow/ARB_depth_texture.

Roland
Re: [Mesa-dev] [PATCH 1/2] st/wgl: Honour request of 3.1 contexts through core profile where available.
On 05/07/2014 07:11 AM, jfons...@vmware.com wrote:
> From: José Fonseca jfons...@vmware.com
>
> Port 5f493eed69f6fb11239c04119d602f1c23a68cbd from GLX.
> ---
>  src/gallium/state_trackers/wgl/stw_context.c | 17 +++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/state_trackers/wgl/stw_context.c b/src/gallium/state_trackers/wgl/stw_context.c
> index 3a93091..43186fa 100644
> --- a/src/gallium/state_trackers/wgl/stw_context.c
> +++ b/src/gallium/state_trackers/wgl/stw_context.c
> @@ -205,10 +205,23 @@ stw_create_context_attribs(
>      *
>      *     The default value for WGL_CONTEXT_PROFILE_MASK_ARB is
>      *     WGL_CONTEXT_CORE_PROFILE_BIT_ARB.
> +    *
> +    * The spec also says:
> +    *
> +    *     If version 3.1 is requested, the context returned may implement
> +    *     any of the following versions:
> +    *
> +    *       * Version 3.1. The GL_ARB_compatibility extension may or may not
> +    *         be implemented, as determined by the implementation.
> +    *       * The core profile of version 3.2 or greater.
> +    *
> +    * and because Mesa doesn't support GL_ARB_compatibility, the only chance to
> +    * honour a 3.1 context is through core profile.
>      */
>     attribs.profile = ST_PROFILE_DEFAULT;
> -   if ((majorVersion > 3 || (majorVersion == 3 && minorVersion >= 2))
> -       && ((profileMask & WGL_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB) == 0))
> +   if (((majorVersion > 3 || (majorVersion == 3 && minorVersion >= 2))
> +        && ((profileMask & WGL_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB) == 0)) ||
> +       (majorVersion == 3 && minorVersion == 1))
>        attribs.profile = ST_PROFILE_OPENGL_CORE;
>
>     ctx->st = stw_dev->stapi->create_context(stw_dev->stapi,

For both:
Reviewed-by: Brian Paul bri...@vmware.com
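The condition in the patch above is easier to follow when split into its two cases. What follows is only a minimal sketch of that decision, not the state tracker code: `choose_profile`, `enum profile` and `COMPAT_PROFILE_BIT` are hypothetical stand-ins for the real WGL and st_api names.

```c
#include <stdbool.h>

/* Hypothetical stand-in for WGL_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB. */
#define COMPAT_PROFILE_BIT 0x2u

/* Hypothetical stand-ins for ST_PROFILE_DEFAULT / ST_PROFILE_OPENGL_CORE. */
enum profile { PROFILE_DEFAULT, PROFILE_CORE };

/* Pick a profile from the requested version and profile mask. */
static enum profile
choose_profile(int major, int minor, unsigned profile_mask)
{
   bool is_3_2_or_later = major > 3 || (major == 3 && minor >= 2);
   bool wants_compat = (profile_mask & COMPAT_PROFILE_BIT) != 0;

   /* 3.2 and later: core unless the compatibility bit was requested. */
   if (is_3_2_or_later && !wants_compat)
      return PROFILE_CORE;

   /* 3.1: without GL_ARB_compatibility, core is the only way to honour it,
    * whatever the profile mask says. */
   if (major == 3 && minor == 1)
      return PROFILE_CORE;

   return PROFILE_DEFAULT;
}
```

Note that in this sketch a 3.1 request ignores the profile mask entirely, which matches the shape of the condition in the patch.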
[Mesa-dev] [Bug 78393] Black zebra like lines while playing games on open source drivers
https://bugs.freedesktop.org/show_bug.cgi?id=78393

incarnated...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Hardware|Other                       |x86-64 (AMD64)
                 OS|All                         |Linux (All)
           Priority|medium                      |high

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 3/8] radeonsi: implement set_min_samples
Yeah, the sample positions in the fragment shader are always set correctly even if there is no multisample buffer. It's implemented in the patch which adds support for SAMPLEPOS.

Marek

On Wed, May 7, 2014 at 4:12 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
> On Wed, May 7, 2014 at 10:00 AM, Marek Olšák mar...@gmail.com wrote:
>> +static void si_set_min_samples(struct pipe_context *ctx, unsigned min_samples)
>> +{
>> +	struct si_context *sctx = (struct si_context *)ctx;
>> +
>> +	if (sctx->ps_iter_samples == min_samples)
>> +		return;
>> +
>> +	sctx->ps_iter_samples = min_samples;
>> +
>> +	if (sctx->framebuffer.nr_samples > 1)
>> +		sctx->msaa_config.dirty = true;
>>  }
>
> I don't know your HW, but keep in mind that there's nothing preventing one from using gl_SamplePosition/etc without a MS framebuffer. With the piglit tests, that's passing '1' or '0' as the arguments for the number of samples.
>
>   -ilia
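For readers following the set_min_samples discussion, here is a rough standalone sketch of the pattern in the quoted function: skip redundant state changes, and only flag the MSAA config as dirty when the framebuffer is actually multisampled. `struct fake_context` and its fields are hypothetical simplifications, not the radeonsi structures.

```c
#include <stdbool.h>

/* Hypothetical, flattened stand-in for the driver context. */
struct fake_context {
   unsigned ps_iter_samples;   /* current per-sample shading rate */
   unsigned fb_nr_samples;     /* sample count of the bound framebuffer */
   bool msaa_config_dirty;     /* re-emit MSAA state on next draw */
};

static void
set_min_samples(struct fake_context *c, unsigned min_samples)
{
   /* Nothing to do if the value did not change. */
   if (c->ps_iter_samples == min_samples)
      return;

   c->ps_iter_samples = min_samples;

   /* Only a multisampled framebuffer needs the MSAA config re-emitted. */
   if (c->fb_nr_samples > 1)
      c->msaa_config_dirty = true;
}
```

With a single-sample framebuffer the new value is still recorded but nothing is marked dirty, which is the case Ilia's question is about; per Marek's reply the sample positions seen by the shader are handled separately.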
Re: [Mesa-dev] [PATCH 5/8] radeonsi: implement SAMPLEPOS fragment shader input
On Wed, May 7, 2014 at 10:00 AM, Marek Olšák mar...@gmail.com wrote:
> From: Marek Olšák marek.ol...@amd.com
>
> The sample positions are read from a constant buffer.

FWIW I had to do the same thing in both nv50 and nvc0. I had proposed that this should actually be done directly by the state tracker, but I think claims were made to the effect that there might be hw which can look up the positions directly. I figured radeon was such hw, but I guess not.

Anyways, seems like a nice refactor would be to have mesa/st (or mesa/main) supply the constbufs and rip out the extra logic from nv50/nvc0/radeonsi. [Certainly don't have to do it now, esp now that you've done it this way.] The sample count is already handled this way by mesa core.

  -ilia

> ---
>  src/gallium/drivers/radeon/cayman_msaa.c      | 17 +
>  src/gallium/drivers/radeon/r600_pipe_common.c |  1 +
>  src/gallium/drivers/radeon/r600_pipe_common.h | 10 ++
>  src/gallium/drivers/radeonsi/si_shader.c      | 23 +++
>  src/gallium/drivers/radeonsi/si_state.c       | 25 +
>  5 files changed, 76 insertions(+)
>
> diff --git a/src/gallium/drivers/radeon/cayman_msaa.c b/src/gallium/drivers/radeon/cayman_msaa.c
> index 8727f3e..47fc5c4 100644
> --- a/src/gallium/drivers/radeon/cayman_msaa.c
> +++ b/src/gallium/drivers/radeon/cayman_msaa.c
> @@ -123,6 +123,23 @@ void cayman_get_sample_position(struct pipe_context *ctx, unsigned sample_count,
>  	}
>  }
>  
> +void cayman_init_msaa(struct pipe_context *ctx)
> +{
> +	struct r600_common_context *rctx = (struct r600_common_context*)ctx;
> +	int i;
> +
> +	cayman_get_sample_position(ctx, 1, 0, rctx->sample_locations_1x[0]);
> +
> +	for (i = 0; i < 2; i++)
> +		cayman_get_sample_position(ctx, 2, i, rctx->sample_locations_2x[i]);
> +	for (i = 0; i < 4; i++)
> +		cayman_get_sample_position(ctx, 4, i, rctx->sample_locations_4x[i]);
> +	for (i = 0; i < 8; i++)
> +		cayman_get_sample_position(ctx, 8, i, rctx->sample_locations_8x[i]);
> +	for (i = 0; i < 16; i++)
> +		cayman_get_sample_position(ctx, 16, i, rctx->sample_locations_16x[i]);
> +}
> +
>  void cayman_emit_msaa_sample_locs(struct radeon_winsys_cs *cs, int nr_samples)
>  {
>  	switch (nr_samples) {
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
> index 70c4d1a..4c6cf0e 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> @@ -154,6 +154,7 @@ bool r600_common_context_init(struct r600_common_context *rctx,
>  	r600_init_context_texture_functions(rctx);
>  	r600_streamout_init(rctx);
>  	r600_query_init(rctx);
> +	cayman_init_msaa(&rctx->b);
>  
>  	rctx->allocator_so_filled_size = u_suballocator_create(&rctx->b, 4096, 4,
>  			0, PIPE_USAGE_DEFAULT, TRUE);
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h
> index 4c3e9ce..8862d31 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.h
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.h
> @@ -357,6 +357,15 @@ struct r600_common_context {
>  	boolean				saved_render_cond_cond;
>  	unsigned			saved_render_cond_mode;
>  
> +	/* MSAA sample locations.
> +	 * The first index is the sample index.
> +	 * The second index is the coordinate: X, Y. */
> +	float				sample_locations_1x[1][2];
> +	float				sample_locations_2x[2][2];
> +	float				sample_locations_4x[4][2];
> +	float				sample_locations_8x[8][2];
> +	float				sample_locations_16x[16][2];
> +
>  	/* Copy one resource to another using async DMA. */
>  	void (*dma_copy)(struct pipe_context *ctx,
>  			 struct pipe_resource *dst,
> @@ -472,6 +481,7 @@ extern const uint32_t eg_sample_locs_4x[4];
>  extern const unsigned eg_max_dist_4x;
>  void cayman_get_sample_position(struct pipe_context *ctx, unsigned sample_count,
>  				unsigned sample_index, float *out_value);
> +void cayman_init_msaa(struct pipe_context *ctx);
>  void cayman_emit_msaa_sample_locs(struct radeon_winsys_cs *cs, int nr_samples);
>  void cayman_emit_msaa_config(struct radeon_winsys_cs *cs, int nr_samples,
>  			     int ps_iter_samples);
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
> index 0195e54..d8b7b9c 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -559,6 +559,8 @@ static void declare_system_value(
>  {
>  	struct si_shader_context *si_shader_ctx =
>  		si_shader_context(&radeon_bld->soa.bld_base);
> +
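As an illustration of the caching idea in the quoted patch (fill per-sample-count location tables once at context creation so they can later be copied into a constant buffer), here is a rough standalone sketch. `struct sample_tables`, `get_sample_position` and `init_sample_tables` are hypothetical names, and the placeholder positions are made up for the example; they are not the hardware sample patterns.

```c
/* Cached tables: first index is the sample, second is the coordinate (X, Y). */
struct sample_tables {
   float loc_1x[1][2];
   float loc_2x[2][2];
   float loc_4x[4][2];
   float loc_8x[8][2];
   float loc_16x[16][2];
};

/* Hypothetical placeholder query: spreads samples along the pixel diagonal.
 * A real driver would return its fixed hardware pattern here. */
static void
get_sample_position(unsigned sample_count, unsigned sample_index, float *out)
{
   float p = ((float)sample_index + 0.5f) / (float)sample_count;

   out[0] = p;
   out[1] = p;
}

/* Fill every table once, so later lookups are plain array reads. */
static void
init_sample_tables(struct sample_tables *t)
{
   unsigned i;

   get_sample_position(1, 0, t->loc_1x[0]);

   for (i = 0; i < 2; i++)
      get_sample_position(2, i, t->loc_2x[i]);
   for (i = 0; i < 4; i++)
      get_sample_position(4, i, t->loc_4x[i]);
   for (i = 0; i < 8; i++)
      get_sample_position(8, i, t->loc_8x[i]);
   for (i = 0; i < 16; i++)
      get_sample_position(16, i, t->loc_16x[i]);
}
```

Since the tables are plain float arrays, whichever layer ends up owning them (driver or state tracker, as discussed above) could upload them as constant data without further conversion.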
Re: [Mesa-dev] [PATCH] gallium: remove enum numbers from shader cap queries
On 05/07/2014 01:33 AM, Michel Dänzer wrote:
> On 05.05.2014 22:39, Brian Paul wrote:
>> On 05/03/2014 07:43 AM, Michel Dänzer wrote:
>>> On 03.05.2014 22:29, Brian Paul wrote:
>>>> The enum numbers were just cruft.
>>>
>>> I disagree. Nothing's changed about the reason I added them in the first place: When a driver is queried for a cap it doesn't know about, it prints an error message containing only the numeric value of the cap. These explicit numbers make it easy to find out which cap the driver is complaining about.
>>
>> Hi Michel,
>>
>> In the past when someone added a new enum and softpipe, llvmpipe or svga complained at runtime about an unhandled enum, it's been pretty easy to spot the new one and fix it.
>>
>> Actually, what you have in the radeon/si drivers is better: switch statements w/out default cases. So the compiler will warn about the missing enum case by name (not number). That's a more effective way of catching unhandled enums earlier. I should change softpipe, llvmpipe and svga to do the same.
>>
>> How does that sound? Are there other drivers you're concerned about?
>
> Not really, sounds good to me then.

Does that count as a R-b for the original patch?

-Brian
Re: [Mesa-dev] [PATCH] gallium: remove enum numbers from shader cap queries
On 08.05.2014 00:02, Brian Paul wrote:
> On 05/07/2014 01:33 AM, Michel Dänzer wrote:
>> On 05.05.2014 22:39, Brian Paul wrote:
>>> On 05/03/2014 07:43 AM, Michel Dänzer wrote:
>>>> On 03.05.2014 22:29, Brian Paul wrote:
>>>>> The enum numbers were just cruft.
>>>>
>>>> I disagree. Nothing's changed about the reason I added them in the first place: When a driver is queried for a cap it doesn't know about, it prints an error message containing only the numeric value of the cap. These explicit numbers make it easy to find out which cap the driver is complaining about.
>>>
>>> Hi Michel,
>>>
>>> In the past when someone added a new enum and softpipe, llvmpipe or svga complained at runtime about an unhandled enum, it's been pretty easy to spot the new one and fix it.
>>>
>>> Actually, what you have in the radeon/si drivers is better: switch statements w/out default cases. So the compiler will warn about the missing enum case by name (not number). That's a more effective way of catching unhandled enums earlier. I should change softpipe, llvmpipe and svga to do the same.
>>>
>>> How does that sound? Are there other drivers you're concerned about?
>>
>> Not really, sounds good to me then.
>
> Does that count as a R-b for the original patch?

Yes,

Reviewed-by: Michel Dänzer mic...@daenzer.net

Thanks Brian.

-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer
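The "switch without a default case" approach discussed in this thread can be sketched as follows. The enum, the function name and the returned values are hypothetical and far smaller than the real pipe_shader_cap list; the point is only that GCC and Clang (with -Wswitch, which -Wall enables) name any enumerator that has no case label when the switch has no default.

```c
/* Hypothetical, shortened stand-in for a cap enum. */
enum shader_cap {
   CAP_MAX_INSTRUCTIONS,
   CAP_MAX_INPUTS,
   CAP_MAX_TEMPS
};

static int
get_shader_cap(enum shader_cap cap)
{
   switch (cap) {
   case CAP_MAX_INSTRUCTIONS:
      return 4096;
   case CAP_MAX_INPUTS:
      return 16;
   case CAP_MAX_TEMPS:
      return 256;
   /* Deliberately no default: adding a new enumerator without a case
    * here makes the compiler warn about it by name. */
   }

   /* Reached only for values outside the enum. */
   return 0;
}
```

If a fourth enumerator were added to `enum shader_cap` without touching the switch, the build would flag it at compile time instead of the driver printing a bare number at runtime.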
Re: [Mesa-dev] [PATCH 5/8] radeonsi: implement SAMPLEPOS fragment shader input
I could enable interpolation of gl_FragCoord at the sample instead of centroid and do:

  gl_SamplePos = fract(gl_FragCoord.xy);
  gl_FragCoord.xy = floor(gl_FragCoord.xy) + vec2(0.5); // center

However, I wouldn't be able to get gl_FragCoord at the centroid. I'm also not sure how gl_FragCoord should be interpolated if sample shading is enabled.

Marek

On Wed, May 7, 2014 at 5:00 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
> On Wed, May 7, 2014 at 10:00 AM, Marek Olšák mar...@gmail.com wrote:
>> From: Marek Olšák marek.ol...@amd.com
>>
>> The sample positions are read from a constant buffer.
>
> FWIW I had to do the same thing in both nv50 and nvc0. I had proposed that this should actually be done directly by the state tracker, but I think claims were made to the effect that there might be hw which can look up the positions directly. I figured radeon was such hw, but I guess not.
>
> Anyways, seems like a nice refactor would be to have mesa/st (or mesa/main) supply the constbufs and rip out the extra logic from nv50/nvc0/radeonsi. [Certainly don't have to do it now, esp now that you've done it this way.] The sample count is already handled this way by mesa core.
Re: [Mesa-dev] [PATCH 5/8] radeonsi: implement SAMPLEPOS fragment shader input
On Wed, May 7, 2014 at 11:56 AM, Marek Olšák mar...@gmail.com wrote:
> I could enable interpolation of gl_FragCoord at the sample instead of centroid and do:
>
>   gl_SamplePos = fract(gl_FragCoord.xy);

Is that legal? I didn't think that sample position could be an output, at least not with ARB_sample_shading (and not even with ARB_gpu_shader5). Perhaps some other extension adds it.

>   gl_FragCoord.xy = floor(gl_FragCoord.xy) + vec2(0.5); // center
>
> However, I wouldn't be able to get gl_FragCoord at the centroid. I'm also not sure how gl_FragCoord should be interpolated if sample shading is enabled.
>
> Marek
>
> On Wed, May 7, 2014 at 5:00 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
>> On Wed, May 7, 2014 at 10:00 AM, Marek Olšák mar...@gmail.com wrote:
>>> From: Marek Olšák marek.ol...@amd.com
>>>
>>> The sample positions are read from a constant buffer.
>>
>> FWIW I had to do the same thing in both nv50 and nvc0. I had proposed that this should actually be done directly by the state tracker, but I think claims were made to the effect that there might be hw which can look up the positions directly. I figured radeon was such hw, but I guess not.
>>
>> Anyways, seems like a nice refactor would be to have mesa/st (or mesa/main) supply the constbufs and rip out the extra logic from nv50/nvc0/radeonsi. [Certainly don't have to do it now, esp now that you've done it this way.] The sample count is already handled this way by mesa core.
Re: [Mesa-dev] [PATCH 5/8] radeonsi: implement SAMPLEPOS fragment shader input
It's pseudo-code that would be inserted at the beginning of the shader by the shader compiler. gl_SamplePos is always read-only from a user's point of view.

Marek

On Wed, May 7, 2014 at 6:02 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
> On Wed, May 7, 2014 at 11:56 AM, Marek Olšák mar...@gmail.com wrote:
>> I could enable interpolation of gl_FragCoord at the sample instead of centroid and do:
>>
>>   gl_SamplePos = fract(gl_FragCoord.xy);
>
> Is that legal? I didn't think that sample position could be an output, at least not with ARB_sample_shading (and not even with ARB_gpu_shader5). Perhaps some other extension adds it.
>
>>   gl_FragCoord.xy = floor(gl_FragCoord.xy) + vec2(0.5); // center
>>
>> However, I wouldn't be able to get gl_FragCoord at the centroid. I'm also not sure how gl_FragCoord should be interpolated if sample shading is enabled.
>>
>> Marek
>>
>> On Wed, May 7, 2014 at 5:00 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
>>> On Wed, May 7, 2014 at 10:00 AM, Marek Olšák mar...@gmail.com wrote:
>>>> From: Marek Olšák marek.ol...@amd.com
>>>>
>>>> The sample positions are read from a constant buffer.
>>>
>>> FWIW I had to do the same thing in both nv50 and nvc0. I had proposed that this should actually be done directly by the state tracker, but I think claims were made to the effect that there might be hw which can look up the positions directly. I figured radeon was such hw, but I guess not.
>>>
>>> Anyways, seems like a nice refactor would be to have mesa/st (or mesa/main) supply the constbufs and rip out the extra logic from nv50/nvc0/radeonsi. [Certainly don't have to do it now, esp now that you've done it this way.] The sample count is already handled this way by mesa core.
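The fract()/floor() pseudo-code discussed in this thread can be checked with a small CPU-side sketch. This is only an illustration of the arithmetic, not shader compiler output: `struct vec2` and the helper names are hypothetical, and the floor helper assumes non-negative window coordinates so that a truncating cast is enough.

```c
struct vec2 { float x, y; };

/* floor() for non-negative values only (assumption: window coordinates
 * are never negative), so a truncating cast is sufficient. */
static float
floor_nonneg(float v)
{
   return (float)(int)v;
}

/* gl_SamplePos = fract(gl_FragCoord.xy) */
static struct vec2
sample_pos_from_frag_coord(struct vec2 fc)
{
   struct vec2 r;

   r.x = fc.x - floor_nonneg(fc.x);
   r.y = fc.y - floor_nonneg(fc.y);
   return r;
}

/* gl_FragCoord.xy = floor(gl_FragCoord.xy) + vec2(0.5), i.e. pixel center */
static struct vec2
pixel_center(struct vec2 fc)
{
   struct vec2 r;

   r.x = floor_nonneg(fc.x) + 0.5f;
   r.y = floor_nonneg(fc.y) + 0.5f;
   return r;
}
```

For a fragment coordinate interpolated at a sample, say (10.25, 7.75), the first helper yields the sample offset inside the pixel and the second recovers the pixel center; as noted in the thread, the centroid position is not recoverable this way.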
[Mesa-dev] [Bug 78258] make check link_varyings.gl_ClipDistance failure
https://bugs.freedesktop.org/show_bug.cgi?id=78258

Ian Romanick i...@freedesktop.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Ian Romanick i...@freedesktop.org ---
Fixed on master by the commit below. This patch has also been cherry picked to the 10.2 branch.

commit f7bf37cb13ff4e727d640a3bd02980aba0c0b4ce
Author: Ian Romanick ian.d.roman...@intel.com
Date:   Mon May 5 10:39:26 2014 -0700

    linker: Fix consumer_inputs_with_locations indexing

    In an earlier incarnation of populate_consumer_input_sets and
    get_matching_input, the consumer_inputs_with_locations array was indexed
    using the user-specified location. In that version, only user-defined
    varyings were included in the array. In the current incarnation, the
    Mesa location is used to index the array, and built-in varyings are
    included.

    This change fixes the unit test to expect gl_ClipDistance in the array,
    and it resizes the arrays to actually be big enough. It's just dumb
    luck that the existing piglit tests use small enough locations to not
    stomp the stack. :(

    Signed-off-by: Ian Romanick ian.d.roman...@intel.com
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78258
    Reviewed-by: Kenneth Graunke kenn...@whitecape.org
    Cc: 10.2 mesa-sta...@lists.freedesktop.org
    Cc: Vinson Lee v...@freedesktop.org

-- 
You are receiving this mail because:
You are the assignee for the bug.
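To make the indexing problem in the commit message concrete, here is a rough sketch under simplified assumptions. The slot enum, `MAX_USER_VARYINGS`, `struct input` and `record_input` are all hypothetical and are not the linker's actual types; the sketch only shows why a table indexed by a combined location (built-ins first, user varyings after) has to be sized for the whole range rather than for the user varyings alone.

```c
#include <stddef.h>

/* Hypothetical combined location space: built-ins come first, user-defined
 * varyings start at SLOT_VAR0. */
enum {
   SLOT_POS,
   SLOT_CLIP_DIST0,
   SLOT_CLIP_DIST1,
   SLOT_VAR0
};

#define MAX_USER_VARYINGS 32
/* The table must cover built-ins AND user varyings. */
#define SLOT_COUNT (SLOT_VAR0 + MAX_USER_VARYINGS)

struct input {
   const char *name;
   unsigned slot;      /* index in the combined location space */
};

/* Record an input by its combined slot; returns 0 instead of writing
 * out of bounds when the slot does not fit the table. */
static int
record_input(const struct input *table[SLOT_COUNT], const struct input *in)
{
   if (in->slot >= SLOT_COUNT)
      return 0;

   table[in->slot] = in;
   return 1;
}
```

Had the table been declared with only `MAX_USER_VARYINGS` entries, the highest user slots would have landed past the end of the array, which is the kind of stack stomping the commit message describes.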
[Mesa-dev] [PATCH] draw: do not use draw_get_option_use_llvm() inside draw execution paths
From: Roland Scheidegger srol...@vmware.com

1c73e919a4b4dd79166d0633075990056f27fd28 made it possible to not allocate the tgsi machine if llvm was used. However, draw_get_option_use_llvm() is not reliable after draw context creation, since drivers can explicitly request a non-llvm draw context even if draw_get_option_use_llvm() would return true (and softpipe does just that) which leads to crashes. Thus use draw->llvm to determine if we're using llvm or not instead (and make draw->llvm available even if HAVE_LLVM is false so we don't have to put even more ifdefs).

Cc: 10.2 mesa-sta...@lists.freedesktop.org
---
 src/gallium/auxiliary/draw/draw_context.c |  2 ++
 src/gallium/auxiliary/draw/draw_gs.c      | 10 +-
 src/gallium/auxiliary/draw/draw_private.h |  4 +---
 src/gallium/auxiliary/draw/draw_vs.c      |  4 ++--
 src/gallium/auxiliary/draw/draw_vs_exec.c |  4 ++--
 5 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
index ddc305b..d7197fd 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -1000,6 +1000,8 @@ draw_get_shader_param_no_llvm(unsigned shader, enum pipe_shader_cap param)
 /**
  * XXX: Results for PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS because there are two
  * different ways of setting textures, and drivers typically only support one.
+ * Drivers requesting a draw context explicitly without llvm must call
+ * draw_get_shader_param_no_llvm instead.
  */
 int
 draw_get_shader_param(unsigned shader, enum pipe_shader_cap param)
diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c
index 5e503ff..fc4f697 100644
--- a/src/gallium/auxiliary/draw/draw_gs.c
+++ b/src/gallium/auxiliary/draw/draw_gs.c
@@ -597,7 +597,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader *shader,
 
 
 #ifdef HAVE_LLVM
-   if (draw_get_option_use_llvm()) {
+   if (shader->draw->llvm) {
       shader->gs_output = output_verts->verts;
       if (max_out_prims > shader->max_out_prims) {
          unsigned i;
@@ -674,7 +674,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader *shader,
 void draw_geometry_shader_prepare(struct draw_geometry_shader *shader,
                                   struct draw_context *draw)
 {
-   boolean use_llvm = draw_get_option_use_llvm();
+   boolean use_llvm = draw->llvm != NULL;
    if (!use_llvm && shader && shader->machine->Tokens != shader->state.tokens) {
       tgsi_exec_machine_bind_shader(shader->machine,
                                     shader->state.tokens,
@@ -686,7 +686,7 @@ void draw_geometry_shader_prepare(struct draw_geometry_shader *shader,
 boolean
 draw_gs_init( struct draw_context *draw )
 {
-   if (!draw_get_option_use_llvm()) {
+   if (!draw->llvm) {
       draw->gs.tgsi.machine = tgsi_exec_machine_create();
       if (!draw->gs.tgsi.machine)
          return FALSE;
@@ -715,7 +715,7 @@ draw_create_geometry_shader(struct draw_context *draw,
                             const struct pipe_shader_state *state)
 {
 #ifdef HAVE_LLVM
-   boolean use_llvm = draw_get_option_use_llvm();
+   boolean use_llvm = draw->llvm != NULL;
    struct llvm_geometry_shader *llvm_gs;
 #endif
    struct draw_geometry_shader *gs;
@@ -870,7 +870,7 @@ void draw_delete_geometry_shader(struct draw_context *draw,
       return;
    }
 #ifdef HAVE_LLVM
-   if (draw_get_option_use_llvm()) {
+   if (draw->llvm) {
       struct llvm_geometry_shader *shader = llvm_geometry_shader(dgs);
       struct draw_gs_llvm_variant_list_item *li;
 
diff --git a/src/gallium/auxiliary/draw/draw_private.h b/src/gallium/auxiliary/draw/draw_private.h
index 801d009..783c3ef 100644
--- a/src/gallium/auxiliary/draw/draw_private.h
+++ b/src/gallium/auxiliary/draw/draw_private.h
@@ -47,7 +47,6 @@
 #include "tgsi/tgsi_scan.h"
 
 #ifdef HAVE_LLVM
-struct draw_llvm;
 struct gallivm_state;
 #endif
 
@@ -69,6 +68,7 @@ struct tgsi_exec_machine;
 struct tgsi_sampler;
 struct draw_pt_front_end;
 struct draw_assembler;
+struct draw_llvm;
 
 
 /**
@@ -318,9 +318,7 @@ struct draw_context
    unsigned start_instance;
    unsigned start_index;
 
-#ifdef HAVE_LLVM
    struct draw_llvm *llvm;
-#endif
 
    /** Texture sampler and sampler view state.
     * Note that we have arrays indexed by shader type.  At this time
diff --git a/src/gallium/auxiliary/draw/draw_vs.c b/src/gallium/auxiliary/draw/draw_vs.c
index eb7f4e0..dc50870 100644
--- a/src/gallium/auxiliary/draw/draw_vs.c
+++ b/src/gallium/auxiliary/draw/draw_vs.c
@@ -149,7 +149,7 @@ draw_vs_init( struct draw_context *draw )
 {
    draw->dump_vs = debug_get_option_gallium_dump_vs();
 
-   if (!draw_get_option_use_llvm()) {
+   if (!draw->llvm) {
       draw->vs.tgsi.machine = tgsi_exec_machine_create();
       if (!draw->vs.tgsi.machine)
          return FALSE;
@@ -175,7 +175,7 @@ draw_vs_destroy( struct draw_context *draw )
    if (draw->vs.emit_cache)
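The reasoning in the commit message (a process-wide option can disagree with how a particular context was actually created, so the per-context pointer is the reliable test) can be sketched like this. All names are hypothetical simplifications and do not reproduce the draw module's structures.

```c
#include <stdbool.h>
#include <stddef.h>

struct jit_state { int unused; };       /* stands in for the LLVM backend */
struct interp_machine { int unused; };  /* stands in for the TGSI interpreter */

struct ctx {
   struct jit_state *llvm;          /* NULL when created without LLVM */
   struct interp_machine *machine;  /* only needed on the non-LLVM path */
};

/* Hypothetical global default; a driver may still ask for a non-LLVM
 * context, so this says nothing about any given ctx. */
static bool
global_option_use_llvm(void)
{
   return true;
}

/* The reliable per-context test. */
static bool
ctx_uses_llvm(const struct ctx *c)
{
   return c->llvm != NULL;
}

static struct interp_machine fallback_machine;

/* Set up the interpreter whenever this context has no LLVM backend,
 * regardless of what the global option says. */
static bool
vs_init(struct ctx *c)
{
   if (!ctx_uses_llvm(c))
      c->machine = &fallback_machine;
   return true;
}
```

Keying `vs_init` on `global_option_use_llvm()` instead would leave `machine` NULL for a context created without LLVM, which is the crash scenario described for softpipe.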
Re: [Mesa-dev] [PATCH] draw: do not use draw_get_option_use_llvm() inside draw execution paths
On 05/07/2014 11:08 AM, srol...@vmware.com wrote:
> From: Roland Scheidegger srol...@vmware.com
>
> 1c73e919a4b4dd79166d0633075990056f27fd28 made it possible to not allocate the tgsi machine if llvm was used. However, draw_get_option_use_llvm() is not reliable after draw context creation, since drivers can explicitly request a non-llvm draw context even if draw_get_option_use_llvm() would return true (and softpipe does just that) which leads to crashes. Thus use draw->llvm to determine if we're using llvm or not instead (and make draw->llvm available even if HAVE_LLVM is false so we don't have to put even more ifdefs).
>
> Cc: 10.2 mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/auxiliary/draw/draw_context.c |  2 ++
>  src/gallium/auxiliary/draw/draw_gs.c      | 10 +-
>  src/gallium/auxiliary/draw/draw_private.h |  4 +---
>  src/gallium/auxiliary/draw/draw_vs.c      |  4 ++--
>  src/gallium/auxiliary/draw/draw_vs_exec.c |  4 ++--
>  5 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
> index ddc305b..d7197fd 100644
> --- a/src/gallium/auxiliary/draw/draw_context.c
> +++ b/src/gallium/auxiliary/draw/draw_context.c
> @@ -1000,6 +1000,8 @@ draw_get_shader_param_no_llvm(unsigned shader, enum pipe_shader_cap param)
>  /**
>   * XXX: Results for PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS because there are two
>   * different ways of setting textures, and drivers typically only support one.
> + * Drivers requesting a draw context explicitly without llvm must call
> + * draw_get_shader_param_no_llvm instead.
*/ int draw_get_shader_param(unsigned shader, enum pipe_shader_cap param) diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c index 5e503ff..fc4f697 100644 --- a/src/gallium/auxiliary/draw/draw_gs.c +++ b/src/gallium/auxiliary/draw/draw_gs.c @@ -597,7 +597,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader *shader, #ifdef HAVE_LLVM - if (draw_get_option_use_llvm()) { + if (shader->draw->llvm) { shader->gs_output = output_verts->verts; if (max_out_prims > shader->max_out_prims) { unsigned i; @@ -674,7 +674,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader *shader, void draw_geometry_shader_prepare(struct draw_geometry_shader *shader, struct draw_context *draw) { - boolean use_llvm = draw_get_option_use_llvm(); + boolean use_llvm = draw->llvm != NULL; if (!use_llvm && shader && shader->machine->Tokens != shader->state.tokens) { tgsi_exec_machine_bind_shader(shader->machine, shader->state.tokens, @@ -686,7 +686,7 @@ void draw_geometry_shader_prepare(struct draw_geometry_shader *shader, boolean draw_gs_init( struct draw_context *draw ) { - if (!draw_get_option_use_llvm()) { + if (!draw->llvm) { draw->gs.tgsi.machine = tgsi_exec_machine_create(); if (!draw->gs.tgsi.machine) return FALSE; @@ -715,7 +715,7 @@ draw_create_geometry_shader(struct draw_context *draw, const struct pipe_shader_state *state) { #ifdef HAVE_LLVM - boolean use_llvm = draw_get_option_use_llvm(); + boolean use_llvm = draw->llvm != NULL; struct llvm_geometry_shader *llvm_gs; #endif struct draw_geometry_shader *gs; @@ -870,7 +870,7 @@ void draw_delete_geometry_shader(struct draw_context *draw, return; } #ifdef HAVE_LLVM - if (draw_get_option_use_llvm()) { + if (draw->llvm) { struct llvm_geometry_shader *shader = llvm_geometry_shader(dgs); struct draw_gs_llvm_variant_list_item *li; diff --git a/src/gallium/auxiliary/draw/draw_private.h b/src/gallium/auxiliary/draw/draw_private.h index 801d009..783c3ef 100644 --- a/src/gallium/auxiliary/draw/draw_private.h +++ 
b/src/gallium/auxiliary/draw/draw_private.h @@ -47,7 +47,6 @@ #include "tgsi/tgsi_scan.h" #ifdef HAVE_LLVM -struct draw_llvm; struct gallivm_state; #endif @@ -69,6 +68,7 @@ struct tgsi_exec_machine; struct tgsi_sampler; struct draw_pt_front_end; struct draw_assembler; +struct draw_llvm; /** @@ -318,9 +318,7 @@ struct draw_context unsigned start_instance; unsigned start_index; -#ifdef HAVE_LLVM struct draw_llvm *llvm; -#endif /** Texture sampler and sampler view state. * Note that we have arrays indexed by shader type. At this time diff --git a/src/gallium/auxiliary/draw/draw_vs.c b/src/gallium/auxiliary/draw/draw_vs.c index eb7f4e0..dc50870 100644 --- a/src/gallium/auxiliary/draw/draw_vs.c +++ b/src/gallium/auxiliary/draw/draw_vs.c @@ -149,7 +149,7 @@ draw_vs_init( struct draw_context *draw ) { draw->dump_vs = debug_get_option_gallium_dump_vs(); - if (!draw_get_option_use_llvm()) { + if (!draw->llvm) { draw->vs.tgsi.machine = tgsi_exec_machine_create(); if (!draw->vs.tgsi.machine) return FALSE; @@ -175,7 +175,7 @@ draw_vs_destroy(
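For readers skimming the patch above, here is a minimal sketch of the underlying idea: once a context exists, decide from the context's own state rather than from a process-wide option. Every name here (toy_context, toy_option_use_llvm, and so on) is hypothetical and made up for illustration; this is not the draw module's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in Mesa. */
struct toy_llvm { int unused; };

struct toy_context {
   struct toy_llvm *llvm;   /* NULL when this context runs without llvm */
   bool has_interp_machine; /* interpreter state, allocated only if needed */
};

/* Process-wide default, similar in spirit to an environment option. */
static bool toy_option_use_llvm(void)
{
   return true;
}

/* A driver may explicitly ask for a non-llvm context (try_llvm == false)
 * even though the global option says llvm is available. */
static void toy_context_init(struct toy_context *ctx, struct toy_llvm *llvm,
                             bool try_llvm)
{
   assert(ctx != NULL);
   ctx->llvm = (try_llvm && toy_option_use_llvm()) ? llvm : NULL;
   /* Decide from the context itself, not from the global option. */
   ctx->has_interp_machine = (ctx->llvm == NULL);
}

static bool toy_context_uses_llvm(const struct toy_context *ctx)
{
   return ctx->llvm != NULL;
}
```

With this shape, a context created with try_llvm == false still gets its interpreter state even though toy_option_use_llvm() returns true, which is the situation the commit message describes for softpipe.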
Re: [Mesa-dev] [PATCH 5/8] radeonsi: implement SAMPLEPOS fragment shader input
On 07.05.2014 17:56, Marek Olšák wrote: I could enable interpolation of gl_FragCoord at the sample instead of centroid and do: gl_SamplePos = fract(gl_FragCoord.xy); gl_FragCoord.xy = floor(gl_FragCoord.xy) + vec2(0.5); // center However, I wouldn't be able to get gl_FragCoord at the centroid. I'm also not sure how gl_FragCoord should be interpolated if sample shading is enabled. Intuitively I would have said it should return the per-sample position. But on second thought this isn't all that obvious, I guess (in newer GLSL versions, is it even valid to use the normal interpolation qualifiers for this, like centroid, flat(???), sample?). Btw, I guess I'm responsible for the claims that there might be hw which could do this directly. This was simply based on d3d11 having an instruction to query this, and hw usually follows d3d11 pretty closely. But well, maybe not in this case (d3d11 also allows you to query sample positions not just for the rasterizer but also for the textures). Roland Marek On Wed, May 7, 2014 at 5:00 PM, Ilia Mirkin imir...@alum.mit.edu wrote: On Wed, May 7, 2014 at 10:00 AM, Marek Olšák mar...@gmail.com wrote: From: Marek Olšák marek.ol...@amd.com The sample positions are read from a constant buffer. FWIW I had to do the same thing in both nv50 and nvc0. I had proposed that this should actually be done directly by the state tracker, but I think claims were made to the effect that there might be hw which can look up the positions directly. I figured radeon was such hw, but I guess not. Anyways, seems like a nice refactor would be to have mesa/st (or mesa/main) supply the constbufs and rip out the extra logic from nv50/nvc0/radeonsi. [Certainly don't have to do it now, esp now that you've done it this way.] The sample count is already handled this way by mesa core. 
-ilia --- src/gallium/drivers/radeon/cayman_msaa.c | 17 + src/gallium/drivers/radeon/r600_pipe_common.c | 1 + src/gallium/drivers/radeon/r600_pipe_common.h | 10 ++ src/gallium/drivers/radeonsi/si_shader.c | 23 +++ src/gallium/drivers/radeonsi/si_state.c | 25 + 5 files changed, 76 insertions(+) diff --git a/src/gallium/drivers/radeon/cayman_msaa.c b/src/gallium/drivers/radeon/cayman_msaa.c index 8727f3e..47fc5c4 100644 --- a/src/gallium/drivers/radeon/cayman_msaa.c +++ b/src/gallium/drivers/radeon/cayman_msaa.c @@ -123,6 +123,23 @@ void cayman_get_sample_position(struct pipe_context *ctx, unsigned sample_count, } } +void cayman_init_msaa(struct pipe_context *ctx) +{ + struct r600_common_context *rctx = (struct r600_common_context*)ctx; + int i; + + cayman_get_sample_position(ctx, 1, 0, rctx->sample_locations_1x[0]); + + for (i = 0; i < 2; i++) + cayman_get_sample_position(ctx, 2, i, rctx->sample_locations_2x[i]); + for (i = 0; i < 4; i++) + cayman_get_sample_position(ctx, 4, i, rctx->sample_locations_4x[i]); + for (i = 0; i < 8; i++) + cayman_get_sample_position(ctx, 8, i, rctx->sample_locations_8x[i]); + for (i = 0; i < 16; i++) + cayman_get_sample_position(ctx, 16, i, rctx->sample_locations_16x[i]); +} + void cayman_emit_msaa_sample_locs(struct radeon_winsys_cs *cs, int nr_samples) { switch (nr_samples) { diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c index 70c4d1a..4c6cf0e 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.c +++ b/src/gallium/drivers/radeon/r600_pipe_common.c @@ -154,6 +154,7 @@ bool r600_common_context_init(struct r600_common_context *rctx, r600_init_context_texture_functions(rctx); r600_streamout_init(rctx); r600_query_init(rctx); + cayman_init_msaa(&rctx->b); rctx->allocator_so_filled_size = u_suballocator_create(&rctx->b, 4096, 4, 0, PIPE_USAGE_DEFAULT, TRUE); diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h index 4c3e9ce..8862d31 100644 
--- a/src/gallium/drivers/radeon/r600_pipe_common.h +++ b/src/gallium/drivers/radeon/r600_pipe_common.h @@ -357,6 +357,15 @@ struct r600_common_context { boolean saved_render_cond_cond; unsigned saved_render_cond_mode; + /* MSAA sample locations. + * The first index is the sample index. + * The second index is the coordinate: X, Y. */ + float sample_locations_1x[1][2]; + float sample_locations_2x[2][2]; + float sample_locations_4x[4][2]; + float sample_locations_8x[8][2]; + float
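The fract/floor idea Marek describes in this thread can be sketched on the CPU as follows. This is only an illustration of the arithmetic, assuming the fragment coordinate has been interpolated at the sample and is non-negative; the type and function names are hypothetical and nothing here is radeonsi code.

```c
#include <assert.h>

/* Hypothetical 2-component vector for illustration. */
struct toy_vec2 { float x, y; };

/* Largest integer <= v for non-negative v, without needing libm. */
static float toy_floor_nonneg(float v)
{
   assert(v >= 0.0f);
   return (float)(int)v;
}

/* Equivalent of fract(gl_FragCoord.xy): the sample's position inside
 * its pixel, in the range [0, 1). */
static struct toy_vec2 toy_sample_pos(struct toy_vec2 frag_coord)
{
   struct toy_vec2 pos;
   pos.x = frag_coord.x - toy_floor_nonneg(frag_coord.x);
   pos.y = frag_coord.y - toy_floor_nonneg(frag_coord.y);
   return pos;
}

/* Equivalent of floor(gl_FragCoord.xy) + vec2(0.5): the pixel center. */
static struct toy_vec2 toy_pixel_center(struct toy_vec2 frag_coord)
{
   struct toy_vec2 center;
   center.x = toy_floor_nonneg(frag_coord.x) + 0.5f;
   center.y = toy_floor_nonneg(frag_coord.y) + 0.5f;
   return center;
}
```

As noted in the thread, this recovers the sample position and the pixel center, but not the centroid.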
Re: [Mesa-dev] [PATCH] mesa: pass target through to driver when choosing texture format
On 05/06/2014 02:33 AM, Ilia Mirkin wrote: This only matters for TextureView, where the texObj's target has not been set yet; in all other instances, texObj->Target should be the same as the passed-in target parameter. Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- I ran into an assert in mesa/st when choosing the texture format because the target was 0. (While trying to implement texture views.) Not sure why it cares about the target, but this seems correct. src/mesa/main/teximage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index c7f301c..845ba80 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -3024,7 +3024,7 @@ _mesa_choose_texture_format(struct gl_context *ctx, } /* choose format from scratch */ - f = ctx->Driver.ChooseTextureFormat(ctx, texObj->Target, internalFormat, + f = ctx->Driver.ChooseTextureFormat(ctx, target, internalFormat, format, type); ASSERT(f != MESA_FORMAT_NONE); return f; Reviewed-by: Brian Paul bri...@vmware.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/8] Almost-done ARB_sample_shading
On Wed, May 7, 2014 at 10:00 AM, Marek Olšák mar...@gmail.com wrote: This series adds support for ARB_sample_shading. The problem is that enabling the SAMPLEMASK fragment shader output by setting DB_SHADER_CONTROL.MASK_EXPORT_ENABLE hangs the GPU. The output is enabled in the same way as the depth and stencil outputs. If anybody has an idea about what I'm doing wrong, please let me know. If it makes you feel any better, I couldn't get sample mask to work on nv50/nvc0 either. I'm sure the issues are unrelated. However one thing I noticed when I was investigating was that the SAMPLEMASK output was being listed ahead of the COLOR output, and the nouveau codegen logic wasn't really ready for that... I guess COLOR had always come first before. Of course fixing that didn't fix the samplemask issue, but thought I'd mention it. -ilia Everything else works. Marek Olšák (8): radeon: split cayman_emit_msaa_state into 2 functions radeon: add basic register setup for per-sample shading radeonsi: implement set_min_samples radeonsi: implement SAMPLEID fragment shader input radeonsi: implement SAMPLEPOS fragment shader input radeonsi: interpolate varyings at sample when full sample shading is enabled radeonsi: implement SAMPLEMASK fragment shader output radeonsi: enable ARB_sample_shading Please review. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] mesa/st: pass 4-offset TG4 without lowering if supported
On Tue, May 6, 2014 at 1:36 PM, Ilia Mirkin imir...@alum.mit.edu wrote: On Tue, May 6, 2014 at 1:29 PM, Roland Scheidegger srol...@vmware.com wrote: On 06.05.2014 17:03, Ilia Mirkin wrote: On Tue, May 6, 2014 at 10:48 AM, Roland Scheidegger srol...@vmware.com wrote: Looks good to me. Thanks! Does that mean that, if the GATHER_SM5 cap is also supported, you have to support 4 independent, non-constant offsets? Not 100% sure what you're asking... but yes, for ARB_gs5 to work, you have to support independent non-constant offsets. And if you have PIPE_CAP_TEXTURE_GATHER_OFFSETS enabled, you're making the claim that you can handle multiple independent offsets in a single texgather. Without the cap, the 4 offsets get lowered into 4 separate texgathers (with only one of the returned components used). With nvc0, the offsets are passed in via a register, so non-constant is never an issue. And with nv50, the offsets must be immediates (and there can be only 1 set of them), but it also has no hope of supporting all of ARB_gs5. Would it make sense to reorder the caps so the gather stuff is all together (now 5 cap bits just for this...)? The quantity of caps for texgather is a little ridiculous. I'm of the opinion that this should be the default behaviour, and it should be up to the driver to lower it into 4 texgathers if it can't handle them directly. Furthermore, this functionality is only available (via GL) with ARB_gs5, which in turn will require a whole bunch of stuff, so I don't know whether the GATHER_SM5 cap is really that useful. And for someone with a DX tracker, this functionality would again not be useful on its own, the rest of SM5 would have to be supported as well (I assume). But that's not what got implemented, and I don't care to modify radeon, which can only support 1 offset at a time. (Although I don't think the radeon impl got pushed...) 
I anticipate that llvmpipe doesn't care one way or another (perhaps with even a minor preference towards having it all in one instruction). If there's consensus, happy to switch this on by default and get rid of the cap :) [And also get rid of the GATHER_SM5 cap.] Well I think the point was that there's really hw which can only do simple gather (what d3d10.1 could do or arb_texture_gather would do). This hw will not be able to do other stuff from newer gl versions anyway so it should not be required to support those new features. Right. But since that hw will only ever expose ARB_texture_gather and not ARB_gpu_shader5, it will never receive a TG4 instruction with non-const offsets or multiple offsets. So the cap to indicate that non-const or quad offsets are supported isn't really necessary, since those will only appear if ARB_gs5 support is claimed, which requires more than just the texgather stuff. (The PIPE_CAP_TEXTURE_GATHER_COMPONENTS cap _is_ necessary since it indicates ARB_texture_gather support, and the value that should be returned by some GL query about what tex gather supports.) I'm not entirely sure what it's actually lowered to, but in any case llvmpipe, if it implemented this, would definitely want a non-lowered version. Right now, it'll get lowered to 4 texgathers, with only one of the returned 4 components used from each one. (And it can't use texfetch since the min/max offsets are different, and there's probably some other clever reason as well.) I think though some radeon hw could really do SM5 version but not independent offsets natively, though I'm not sure if it would really be all that complicated to handle it in the driver. Well, I think the claim was that SM5 doesn't actually support the 4 separate offsets, but GL4 does with textureGatherOffsets(). Also, I believe that radeon supports non-const natively, just not 4 offsets in one instruction. Same deal with i965 (which is why that lowering pass exists in the first place). 
Getting back on topic... what should I do? :) Check this in with the new cap? Or just make it the default behaviour and let drivers that can't handle it do the lowering in the driver? FWIW, I believe Dave Airlie was against that, but that might have been because he was implementing it for r600, which can't handle 4 separate offsets. (BTW, was that looks good to me == R-b?) -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
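The lowering discussed in this thread (one gather with 4 offsets turned into 4 single-offset gathers, keeping one component from each) can be sketched on the CPU like this. The texture type, the clamping, and all names are hypothetical. The component order and the choice of the last component (the i0_j0 texel) are my reading of how GLSL describes textureGather and textureGatherOffsets, so treat that detail as an assumption rather than as what the actual Mesa pass emits.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TEX_W 4
#define TOY_TEX_H 4

/* Hypothetical single-channel integer texture, for illustration only. */
struct toy_tex { int texel[TOY_TEX_H][TOY_TEX_W]; };
struct toy_ivec2 { int x, y; };

static int toy_clamp(int v, int lo, int hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamp-to-edge fetch of one texel. */
static int toy_fetch(const struct toy_tex *tex, int x, int y)
{
   return tex->texel[toy_clamp(y, 0, TOY_TEX_H - 1)]
                    [toy_clamp(x, 0, TOY_TEX_W - 1)];
}

/* Single-offset gather: 2x2 footprint starting at p + off, returned in the
 * assumed order (i0_j1, i1_j1, i1_j0, i0_j0). */
static void toy_gather4(const struct toy_tex *tex, struct toy_ivec2 p,
                        struct toy_ivec2 off, int out[4])
{
   int x = p.x + off.x;
   int y = p.y + off.y;

   out[0] = toy_fetch(tex, x,     y + 1);
   out[1] = toy_fetch(tex, x + 1, y + 1);
   out[2] = toy_fetch(tex, x + 1, y);
   out[3] = toy_fetch(tex, x,     y);
}

/* Lowered 4-offset gather: 4 separate gathers, keeping only the i0_j0
 * component of each result. */
static void toy_gather4_offsets_lowered(const struct toy_tex *tex,
                                        struct toy_ivec2 p,
                                        const struct toy_ivec2 offsets[4],
                                        int out[4])
{
   int i;

   assert(tex != NULL && offsets != NULL && out != NULL);
   for (i = 0; i < 4; i++) {
      int tmp[4];
      toy_gather4(tex, p, offsets[i], tmp);
      out[i] = tmp[3];
   }
}
```

The sketch makes the cost visible: each lowered gather fetches four texels and discards three, which is why hardware that can take all offsets in one instruction would rather not have the lowering applied.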
Re: [Mesa-dev] i965: Single program flow for shaders with no control flow.
Matt Turner matts...@gmail.com writes: The docs say that flipping this bit on for shaders that don't do SIMD branching (i.e., non-uniform control flow) will save us some power. An easy first step is turning this on when we don't see control flow. In the future with more infrastructure in place, we can determine if all branching conditions are uniformly constant and turn on SPF. Hopefully this saves some power and extends battery life, but I'm not sure how to accurately quantify this, short of printing i915_energy_uJ before and after some workload. Even then I don't have any expectation for how much energy the GPU would use for, say a piglit run. Is 200 ~ 300 Joules reasonable (over 220 seconds)? I tried this once myself, and found no power difference (n=17). I think when SPF actually helps you is when you have flow control and the SPF flag set: then the HW gets to do one compare of pcip instead of 8 or 16. It might still be worth merging as progress toward SPF with control flow. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
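The "easy first step" described above, enabling single program flow only when a shader contains no control flow at all, amounts to a scan over the instruction list. A minimal sketch, with a made-up opcode enum and made-up function names (the real i965 backend uses its own instruction representation, which is not reproduced here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcode set for illustration. */
enum toy_opcode {
   TOY_OP_MOV,
   TOY_OP_ADD,
   TOY_OP_MUL,
   TOY_OP_IF,
   TOY_OP_ELSE,
   TOY_OP_ENDIF,
   TOY_OP_DO,
   TOY_OP_WHILE
};

static bool toy_is_control_flow(enum toy_opcode op)
{
   switch (op) {
   case TOY_OP_IF:
   case TOY_OP_ELSE:
   case TOY_OP_ENDIF:
   case TOY_OP_DO:
   case TOY_OP_WHILE:
      return true;
   default:
      return false;
   }
}

/* Conservative check: allow SPF only if no instruction is control flow.
 * Shaders with provably uniform branches would need more analysis. */
static bool toy_can_enable_spf(const enum toy_opcode *insts, size_t count)
{
   size_t i;

   assert(insts != NULL || count == 0);
   for (i = 0; i < count; i++) {
      if (toy_is_control_flow(insts[i]))
         return false;
   }
   return true;
}
```

This covers only the case the patch targets; the reply's point is that the interesting savings would come from shaders that do branch, which this check rejects by design.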
[Mesa-dev] [Bug 78403] New: query_renderer_implementation_unittest.cpp:144:4: error: expected primary-expression before ‘.’ token
https://bugs.freedesktop.org/show_bug.cgi?id=78403 Priority: medium Bug ID: 78403 Keywords: regression CC: emil.l.veli...@gmail.com, i...@freedesktop.org Assignee: mesa-dev@lists.freedesktop.org Summary: query_renderer_implementation_unittest.cpp:144:4: error: expected primary-expression before ‘.’ token Severity: blocker Classification: Unclassified OS: Linux (All) Reporter: v...@freedesktop.org Hardware: x86-64 (AMD64) Status: NEW Version: git Component: Other Product: Mesa mesa: 9ced3fc649ec04710a5f5c855bfb582b898cff83 (master 10.3.0-devel) $ make check [...] CXX query_renderer_implementation_unittest.o query_renderer_implementation_unittest.cpp:144:4: error: expected primary-expression before ‘.’ token query_renderer_implementation_unittest.cpp:144:39: warning: extended initializer lists only available with -std=c++0x or -std=gnu++0x query_renderer_implementation_unittest.cpp:146:4: error: expected primary-expression before ‘.’ token query_renderer_implementation_unittest.cpp:147:4: error: expected primary-expression before ‘.’ token $ gcc --version gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 Copyright (C) 2011 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. commit 51e3569573a7b3f8da0df093836761003fcdc414 Author: Emil Velikov emil.l.veli...@gmail.com Date: Wed Feb 12 21:00:02 2014 + glx/tests: explicitly set __DRI2rendererQueryExtension members While we're here use the typecast'ed name and constify. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com Reviewed-by: Ian Romanick ian.d.roman...@intel.com -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 2/2] egl: Add EGL_CHROMIUM_sync_control extension.
Sarah Sharp sarah.a.sh...@linux.intel.com writes: Chromium defined a new GL extension (that isn't registered with Khronos). We need to add an EGL extension for it, so we can migrate ChromeOS on Intel systems to use EGL instead of GLX. http://git.chromium.org/gitweb/?p=chromium/src/third_party/khronos.git;a=commitdiff;h=27cbfdab35c601f70aa150581ad1448d0401f447 The EGL_CHROMIUM_sync_control extension is similar to the GLX extension OML_sync_control, but only defines one function, eglGetSyncValuesCHROMIUM, which is equivalent to glXGetSyncValuesOML. diff --git a/src/egl/drivers/dri2/egl_dri2_fallbacks.h b/src/egl/drivers/dri2/egl_dri2_fallbacks.h index a5cf344..9cba001 100644 --- a/src/egl/drivers/dri2/egl_dri2_fallbacks.h +++ b/src/egl/drivers/dri2/egl_dri2_fallbacks.h @@ -98,3 +98,11 @@ dri2_fallback_create_wayland_buffer_from_image(_EGLDriver *drv, { return NULL; } + +static inline EGLBoolean +dri2_fallback_get_sync_values(_EGLDisplay *dpy, _EGLSurface *surf, + EGLuint64KHR *ust, EGLuint64KHR *msc, + EGLuint64KHR *sbc) +{ + return EGL_FALSE; +} I've stared at the code in this file, trying to figure out when it would ever be called, and failed. But you're following an existing pattern, so it's fine. 
diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c index 219d8e6..27d0802 100644 --- a/src/egl/main/eglapi.c +++ b/src/egl/main/eglapi.c @@ -1086,6 +1086,7 @@ eglGetProcAddress(const char *procname) { "eglGetPlatformDisplayEXT", (_EGLProc) eglGetPlatformDisplayEXT }, { "eglCreatePlatformWindowSurfaceEXT", (_EGLProc) eglCreatePlatformWindowSurfaceEXT }, { "eglCreatePlatformPixmapSurfaceEXT", (_EGLProc) eglCreatePlatformPixmapSurfaceEXT }, + { "eglGetSyncValuesCHROMIUM", (_EGLProc) eglGetSyncValuesCHROMIUM }, { NULL, NULL } }; EGLint i; @@ -1751,3 +1752,25 @@ eglPostSubBufferNV(EGLDisplay dpy, EGLSurface surface, RETURN_EGL_EVAL(disp, ret); } + +EGLBoolean EGLAPIENTRY +eglGetSyncValuesCHROMIUM(EGLDisplay display, EGLSurface surface, + EGLuint64KHR *ust, EGLuint64KHR *msc, + EGLuint64KHR *sbc) +{ + _EGLDisplay *disp = _eglLockDisplay(display); + _EGLSurface *surf = _eglLookupSurface(surface, disp); + _EGLDriver *drv; + EGLBoolean ret; + + _EGL_CHECK_SURFACE(disp, surf, EGL_FALSE, drv); + if (!disp->Extensions.CHROMIUM_sync_control) + RETURN_EGL_EVAL(disp, EGL_FALSE); + + if (!ust || !msc || !sbc) + RETURN_EGL_ERROR(disp, EGL_BAD_PARAMETER, EGL_FALSE); This is the sort of thing that makes me uncomfortable merging extension support without a spec. Should we throw an error on NULL, or just not return a value there? It looks like GLX_OML_sync_control doesn't specify any particular behavior for that. Given that, throwing an error instead of crashing seems nice. + ret = drv->API.GetSyncValuesCHROMIUM(disp, surf, ust, msc, sbc); + + RETURN_EGL_EVAL(disp, ret); +} This patch is: Reviewed-by: Eric Anholt e...@anholt.net I'm a little weirded out by the custom header for what is just an EGL extension (but unspecced) in patch 1. I'll leave the question of merging that up to Ian. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
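The review question above (report an error on NULL output pointers, or silently skip them) is easier to weigh with the two checks laid out in isolation. The following is a standalone sketch with hypothetical types and status codes; it does not use the EGL API and only mirrors the order of checks in the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes and surface type, for illustration only. */
enum toy_status { TOY_OK, TOY_NOT_SUPPORTED, TOY_BAD_PARAMETER };

struct toy_surface {
   bool sync_control;      /* stands in for the extension being enabled */
   uint64_t ust, msc, sbc; /* counters that would come from the driver */
};

static enum toy_status
toy_get_sync_values(const struct toy_surface *surf,
                    uint64_t *ust, uint64_t *msc, uint64_t *sbc)
{
   assert(surf != NULL);

   if (!surf->sync_control)
      return TOY_NOT_SUPPORTED;

   /* Reject NULL outputs up front instead of crashing on a dereference;
    * nothing is written unless all three pointers are valid. */
   if (!ust || !msc || !sbc)
      return TOY_BAD_PARAMETER;

   *ust = surf->ust;
   *msc = surf->msc;
   *sbc = surf->sbc;
   return TOY_OK;
}
```

One consequence of this all-or-nothing choice is that a caller interested in a single counter still has to pass three valid pointers; whether that is desirable is exactly the kind of thing a written spec would settle.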
Re: [Mesa-dev] [PATCH 2/2] mesa/st: pass 4-offset TG4 without lowering if supported
On 07.05.2014 20:33, Ilia Mirkin wrote:
Getting back on topic... what should I do? :) Check this in with the new cap? Or just make it the default behaviour and let drivers that can't handle it do the lowering in the driver? FWIW, I believe Dave Airlie was against that, but that might have been because he was implementing it for r600, which can't handle 4 separate offsets. (BTW, was that looks good to me == R-b?) Yes, that is Reviewed-by: Roland Scheidegger srol...@vmware.com I think the code is ok as is ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] mesa: Fix MaxNumLayers for 1D array textures.
1D array targets store the number of slices in the Height field. Cc: 10.2 10.1 10.0 mesa-sta...@lists.freedesktop.org Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/main/fbobject.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index ca16ae1..97538bc 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -1058,6 +1058,8 @@ _mesa_test_framebuffer_completeness(struct gl_context *ctx, if (att->Layered) { if (att_tex_target == GL_TEXTURE_CUBE_MAP) att_layer_count = 6; + else if (att_tex_target == GL_TEXTURE_1D_ARRAY) + att_layer_count = att->Renderbuffer->Height; else att_layer_count = att->Renderbuffer->Depth; } else { -- 1.9.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] i965: Fix depth (array slices) computation for 1D_ARRAY render targets.
1D array targets store the number of slices in the Height field. Fixes Piglit's spec/!OpenGL 3.2/layered-rendering/clear-color-all-types 1d_array single_level, at least when used with Meta clears. Cc: 10.2 10.1 10.0 mesa-sta...@lists.freedesktop.org Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 2 ++ src/mesa/drivers/dri/i965/gen8_surface_state.c | 3 +++ 2 files changed, 5 insertions(+) diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c index d71a1d1..f051024 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c @@ -459,6 +459,8 @@ gen7_update_renderbuffer_surface(struct brw_context *brw, const uint8_t mocs = GEN7_MOCS_L3; GLenum gl_target = rb->TexImage ? rb->TexImage->TexObject->Target : GL_TEXTURE_2D; + if (gl_target == GL_TEXTURE_1D_ARRAY) + depth = MAX2(rb->Height, 1); uint32_t surf_index = brw->wm.prog_data->binding_table.render_target_start + unit; diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c index 564d275..f00b354 100644 --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c @@ -281,6 +281,9 @@ gen8_update_renderbuffer_surface(struct brw_context *brw, GLenum gl_target = rb->TexImage ? rb->TexImage->TexObject->Target : GL_TEXTURE_2D; + if (gl_target == GL_TEXTURE_1D_ARRAY) + depth = MAX2(rb->Height, 1); + uint32_t surf_index = brw->wm.prog_data->binding_table.render_target_start + unit; -- 1.9.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
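Both patches in this pair rest on the same rule: for 1D array targets the slice count lives in Height, not Depth. A minimal sketch of that selection, using a hypothetical enum and struct rather than Mesa's gl_renderbuffer, with a clamp to at least 1 in the spirit of the MAX2() in the i965 hunk:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical target enum and renderbuffer, for illustration only. */
enum toy_target {
   TOY_TEXTURE_1D_ARRAY,
   TOY_TEXTURE_2D_ARRAY,
   TOY_TEXTURE_3D,
   TOY_TEXTURE_CUBE_MAP
};

struct toy_renderbuffer {
   unsigned Width, Height, Depth;
};

/* Number of layers of a layered attachment. */
static unsigned toy_layer_count(enum toy_target target,
                                const struct toy_renderbuffer *rb)
{
   unsigned layers;

   assert(rb != NULL);
   switch (target) {
   case TOY_TEXTURE_CUBE_MAP:
      layers = 6;          /* one layer per cube face */
      break;
   case TOY_TEXTURE_1D_ARRAY:
      layers = rb->Height; /* slices are stored in Height */
      break;
   default:
      layers = rb->Depth;
      break;
   }
   return layers > 0 ? layers : 1;
}
```

For a 1D array with Width 64 and 5 slices, Depth would typically be 1, so reading Depth would report a single layer; that is the kind of mismatch the patches correct.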
[Mesa-dev] [Bug 78403] query_renderer_implementation_unittest.cpp:144:4: error: expected primary-expression before ‘.’ token
https://bugs.freedesktop.org/show_bug.cgi?id=78403 --- Comment #1 from Emil Velikov emil.l.veli...@gmail.com --- Created attachment 98641 --> https://bugs.freedesktop.org/attachment.cgi?id=98641&action=edit glx/tests: Partially revert commit 51e3569573a7b3f8da0df093836761003fcdc414 Vinson, did you receive the email with a test request wrt this commit/the whole series? I was hoping we could minimize the number of such annoying/trivial issues :) With that said, a quick look at the C++ standard indicates that the patch is not strictly legal and, to make things better, gcc 4.9.0 builds this code like a charm. Just revert the offending hunk. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
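Some background on why the hunk is "not strictly legal": the `.member = value` initializer syntax is a C99 feature, and C++ did not standardize anything like it until C++20, so a C++ compiler of that era is free to reject it (as the g++ 4.6 log in the report shows) or accept it as an extension. The sketch below is plain C with a hypothetical struct, not the actual __DRI2rendererQueryExtension layout; it shows the designated form next to the positional form that both languages accept.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extension table, loosely shaped like a query interface. */
struct toy_query_extension {
   int version;
   int (*query_integer)(int attrib, unsigned *value);
   int (*query_string)(int attrib, const char **value);
};

static int toy_query_integer(int attrib, unsigned *value)
{
   (void)attrib;
   *value = 42;
   return 0;
}

static int toy_query_string(int attrib, const char **value)
{
   (void)attrib;
   *value = "toy";
   return 0;
}

/* C99 designated initializers: fine in C, not portable in older C++. */
static const struct toy_query_extension by_name = {
   .version = 1,
   .query_integer = toy_query_integer,
   .query_string = toy_query_string,
};

/* Positional initializers: accepted by both C and C++, at the price of
 * depending on the member order. */
static const struct toy_query_extension by_position = {
   1, toy_query_integer, toy_query_string
};
```

Both objects end up identical; the positional form trades readability for portability, which is presumably why reverting the hunk in the .cpp test is the simplest fix.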
Re: [Mesa-dev] [PATCH 2/2] mesa/st: pass 4-offset TG4 without lowering if supported
On 8 May 2014 04:33, Ilia Mirkin imir...@alum.mit.edu wrote: On Tue, May 6, 2014 at 1:36 PM, Ilia Mirkin imir...@alum.mit.edu wrote: On Tue, May 6, 2014 at 1:29 PM, Roland Scheidegger srol...@vmware.com wrote: Am 06.05.2014 17:03, schrieb Ilia Mirkin: On Tue, May 6, 2014 at 10:48 AM, Roland Scheidegger srol...@vmware.com wrote: Looks good to me. Thanks! Does that mean if also the GATHER_SM5 cap is supported you have to support 4 independent, non-constant offsets? Not 100% sure what you're asking... but yes, for ARB_gs5 to work, you have to support independent non-constant offsets. And if you have PIPE_CAP_TEXTURE_GATHER_OFFSETS enabled, you're making the claim that you can handle multiple independent offsets in a single texgather. Without the cap, the 4 offsets get lowered into 4 separate texgathers (with only one of the returned components used). With nvc0, the offsets are passed in via a register, so non-constant is never an issue. And with nv50, the offsets must be immediates (and there can be only 1 set of them), but it also has no hope of supporting all of ARB_gs5. Would it make sense to reorder the caps so the gather stuff is all together (now 5 cap bits just for this...)? The quantity of caps for texgather is a little ridiculous. I'm of the opinion that this should be the default behaviour, and it should be up to the driver to lower it into 4 texgathers if it can't handle them directly. Furthermore, this functionality is only available (via GL) with ARB_gs5, which in turn will require a whole bunch of stuff, so I don't know whether the GATHER_SM5 cap is really that useful. And for someone with a DX tracker, this functionality would again not be useful on its own, the rest of SM5 would have to be supported as well (I assume). But that's not what got implemented, and I don't care to modify radeon, which can only support 1 offset at a time. (Although I don't think the radeon impl got pushed...) 
I anticipate that llvmpipe doesn't care one way or another (perhaps with even a minor preference towards having it all in one instruction). If there's concensus, happy to switch this on by default and get rid of the cap :) [And also get rid of the GATHER_SM5 cap.] Well I think the point was that there's really hw which can only do simple gather (what d3d10.1 could do or arb_texture_gather would do). This hw will not be able to do other stuff from newer gl versions anyway so it should not be required to support those new features. Right. But since that hw will only ever expose ARB_texture_gather and not ARB_gpu_shader5, it will never receive a TG4 instruciton with non-const offsets or multiple offsets. So the cap to indicate that non-const or quad offsets are supported isn't really necessary, since those will only appear if ARB_gs5 support is claimed, which requires more than just the texgather stuff. (The PIPE_CAP_TEXTURE_GATHER_COMPONENTS cap _is_ necessary since it indicates ARB_texture_gather support, and the value that should be returned by some GL query about what tex gather supports.) I'm not entirely sure to what it's actually lowered but in any case llvmpipe if it implemented this definitely would want a non-lowered version. Right now, it'll get lowered to 4 texgathers, with only one of the returned 4 components used from each one. (And it can't use texfetch since the min/max offsets are different, and there's probably some other clever reason as well.) I think though some radeon hw could really do SM5 version but not independent offsets natively, though I'm not sure if it would really be all that complicated to handle it in the driver. Well, I think the claim was that SM5 doesn't actually support the 4 separate offsets, but GL4 does with textureGatherOffsets(). Also, I believe that radeon supports non-const natively, just not have 4 offsets in one instruction. Same deal with i965 (which is why that lowering pass exists in the first place). 
Getting back on topic... what should I do? :) Check this in with the new cap? Or just make it the default behaviour and let drivers that can't handle it do the lowering in the driver? FWIW, I believe Dave Airlie was against that, but that might have been because he was implementing it for r600, which can't handle 4 separate offsets. (BTW, was that looks good to me == R-b?)
Re: [Mesa-dev] [PATCH 2/2] mesa/st: pass 4-offset TG4 without lowering if supported
Getting back on topic... what should I do? :) Check this in with the new cap? Or just make it the default behaviour and let drivers that can't handle it do the lowering in the driver? FWIW, I believe Dave Airlie was against that, but that might have been because he was implementing it for r600, which can't handle 4 separate offsets. (BTW, was that looks good to me == R-b?)

What you did seems fine; lowering this in r600 was quite a lot messier than doing it at the GLSL level.

Dave.
Re: [Mesa-dev] [PATCH 1/2] mesa: Fix MaxNumLayers for 1D array textures.
On 05/07/2014 03:35 PM, Kenneth Graunke wrote:

1D array targets store the number of slices in the Height field.

Cc: 10.2 10.1 10.0 mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/main/fbobject.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index ca16ae1..97538bc 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -1058,6 +1058,8 @@ _mesa_test_framebuffer_completeness(struct gl_context *ctx,
          if (att->Layered) {
             if (att_tex_target == GL_TEXTURE_CUBE_MAP)
                att_layer_count = 6;
+            else if (att_tex_target == GL_TEXTURE_1D_ARRAY)
+               att_layer_count = att->Renderbuffer->Height;
             else
                att_layer_count = att->Renderbuffer->Depth;
          } else {

Reviewed-by: Brian Paul bri...@vmware.com
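For readers skimming the thread, the rule the patch encodes can be sketched outside of Mesa as below. This is only an illustration: `enum tex_target`, `struct rb_dims` and `layer_count_for()` are made-up names, not Mesa types, and the enum values are arbitrary rather than the real GL constants.

```c
#include <assert.h>

/* Hypothetical stand-ins for the texture targets the patch distinguishes. */
enum tex_target {
   TEX_CUBE_MAP,
   TEX_1D_ARRAY,
   TEX_2D_ARRAY
};

/* Hypothetical stand-in for the renderbuffer dimensions. */
struct rb_dims {
   unsigned width;
   unsigned height;
   unsigned depth;
};

/* Layer count of a layered attachment, following the patch:
 * cube maps always have 6 faces, 1D arrays keep their slice count
 * in the height field, everything else keeps it in the depth field. */
static unsigned
layer_count_for(enum tex_target target, const struct rb_dims *rb)
{
   assert(rb != 0);

   if (target == TEX_CUBE_MAP)
      return 6;
   else if (target == TEX_1D_ARRAY)
      return rb->height;
   else
      return rb->depth;
}
```

With a 64-texel 1D array of 5 slices stored as width 64, height 5, depth 1, the pre-patch logic would have reported 1 layer; the sketch reports 5.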
Re: [Mesa-dev] [PATCH 1/2] mesa: Fix MaxNumLayers for 1D array textures.
Reviewed-by: Marek Olšák marek.ol...@amd.com

Marek
Re: [Mesa-dev] [PATCH 1/2] mesa: Fix MaxNumLayers for 1D array textures.
Series Reviewed-by: Jordan Justen jordan.l.jus...@intel.com
Re: [Mesa-dev] [Nouveau] [PATCH] nv50/ir/gk110: fix set with f32 dest
On Wed, May 7, 2014 at 9:57 AM, Ilia Mirkin imir...@alum.mit.edu wrote:

Should fix SGE/SSG instructions, which were previously getting integer 0/-1 values.

I hit the same issue on Maxwell. Soo...

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu

Reviewed-by: Ben Skeggs bske...@redhat.com

---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
index 5992c54..b8d0d3e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
@@ -915,6 +915,9 @@ CodeEmitterGK110::emitSET(const CmpInstruction *i)
          modNegAbsF32_3b(i, 1);
       }
       FTZ_(3a);
+
+      if (i->dType == TYPE_F32)
+         code[1] |= 1 << 23;
    }
    if (i->sType == TYPE_S32)
       code[1] |= 1 << 19;
--
1.8.3.2
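To make the symptom concrete: a comparison with an integer destination is expected to produce 0 / -1, while one with a float destination should produce 0.0f / 1.0f. The sketch below models only that difference on the host; `enum data_type` and `set_result()` are hypothetical names, and nothing here reflects the actual GK110 encoding beyond what the patch shows.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical destination types, mirroring the two cases in the thread. */
enum data_type { TYPE_S32, TYPE_F32 };

/* Raw 32-bit result of a SET-style comparison for a given destination type.
 * Integer destinations get 0 or 0xffffffff (-1); float destinations get the
 * bit patterns of 0.0f or 1.0f. */
static uint32_t
set_result(int condition, enum data_type dtype)
{
   if (dtype == TYPE_F32) {
      float f = condition ? 1.0f : 0.0f;
      uint32_t bits;
      memcpy(&bits, &f, sizeof bits);   /* reinterpret without aliasing issues */
      return bits;
   }
   return condition ? 0xffffffffu : 0u;
}
```

A consumer that reads 0xffffffff as a float sees a NaN rather than 1.0f, which is presumably why SGE/SSG misbehaved before the fix.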
[Mesa-dev] [PATCH] Fix R600_DEBUG=vm output (wrong base).
Signed-off-by: Darren Salt devs...@moreofthesa.me.uk
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 2 +-
 src/gallium/drivers/radeon/r600_texture.c       | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

This applies to master, 10.1, 10.2, probably older versions.

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c
index 805756f..822bb14 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -157,7 +157,7 @@ bool r600_init_resource(struct r600_common_screen *rscreen,
    util_range_set_empty(&res->valid_buffer_range);

    if (rscreen->debug_flags & DBG_VM && res->b.b.target == PIPE_BUFFER) {
-      fprintf(stderr, "VM start=0x%"PRIu64" end=0x%"PRIu64" | Buffer %u bytes\n",
+      fprintf(stderr, "VM start=0x%"PRIx64" end=0x%"PRIx64" | Buffer %u bytes\n",
          r600_resource_va(&rscreen->b, &res->b.b),
          r600_resource_va(&rscreen->b, &res->b.b) + res->buf->size,
          res->buf->size);
diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c
index e30d933..f34a5b5 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -666,7 +666,7 @@ r600_texture_create_object(struct pipe_screen *screen,
    rtex->cmask.base_address_reg = (va + rtex->cmask.offset) >> 8;

    if (rscreen->debug_flags & DBG_VM) {
-      fprintf(stderr, "VM start=0x%"PRIu64" end=0x%"PRIu64" | Texture %ix%ix%i, %i levels, %i samples, %s\n",
+      fprintf(stderr, "VM start=0x%"PRIx64" end=0x%"PRIx64" | Texture %ix%ix%i, %i levels, %i samples, %s\n",
          r600_resource_va(screen, &rtex->resource.b.b),
          r600_resource_va(screen, &rtex->resource.b.b) + rtex->resource.buf->size,
          base->width0, base->height0, util_max_layer(base, 0)+1, base->last_level+1,
--
|  _  | Darren Salt, using Debian GNU/Linux (and Android)
| ( ) |
|  X  | ASCII Ribbon campaign against HTML e-mail
| / \ | http://www.asciiribbon.org/
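The bug is easy to reproduce in isolation: a "0x" prefix followed by a decimal conversion prints digits that look like hex but are not. A small standalone sketch, with hypothetical helper names `format_vm_range_dec()` and `format_vm_range_hex()` that are not part of the radeon driver:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Pre-patch behaviour: "0x" prefix, but PRIu64 prints decimal digits. */
static int
format_vm_range_dec(char *buf, size_t len, uint64_t start, uint64_t size)
{
   return snprintf(buf, len, "VM start=0x%" PRIu64 " end=0x%" PRIu64,
                   start, start + size);
}

/* Post-patch behaviour: PRIx64 prints hexadecimal, matching the prefix. */
static int
format_vm_range_hex(char *buf, size_t len, uint64_t start, uint64_t size)
{
   return snprintf(buf, len, "VM start=0x%" PRIx64 " end=0x%" PRIx64,
                   start, start + size);
}
```

For a buffer at 0x1000 of size 0x200, the first variant prints "VM start=0x4096 end=0x4608", which reads as a completely different address range; the second prints "VM start=0x1000 end=0x1200".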
[Mesa-dev] odd translation from glsl to tgsi for ir_unop_any_nequal
So... this shader (from generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-op-eq-mat2-mat2.shader_test):

uniform mat2 arg0;
uniform mat2 arg1;
void main()
{
  bool result = (arg0 == arg1);
  gl_FragColor = vec4(result, 0.0, 0.0, 0.0);
}

Which becomes the following IR:

(
(declare (shader_out ) vec4 gl_FragColor)
(declare (temporary ) vec4 gl_FragColor)
(declare (uniform ) mat2 arg0)
(declare (uniform ) mat2 arg1)
(function main
  (signature void
    (parameters
    )
    (
      (declare (temporary ) vec4 vec_ctor)
      (assign (yzw) (var_ref vec_ctor) (constant vec3 (0.0 0.0 0.0)) )
      (declare (temporary ) bvec2 mat_cmp_bvec)
      (assign (x) (var_ref mat_cmp_bvec) (expression bool any_nequal (array_ref (var_ref arg1) (constant int (0)) ) (array_ref (var_ref arg0) (constant int (0)) ) ) )
      (assign (y) (var_ref mat_cmp_bvec) (expression bool any_nequal (array_ref (var_ref arg1) (constant int (1)) ) (array_ref (var_ref arg0) (constant int (1)) ) ) )
      (assign (x) (var_ref vec_ctor) (expression float b2f (expression bool ! (expression bool any (var_ref mat_cmp_bvec) ) ) ) )
      (assign (xyzw) (var_ref gl_FragColor) (var_ref vec_ctor) )
      (assign (xyzw) (var_ref gl_FragColor@4) (var_ref gl_FragColor) )
    ))
)
)

When converted to TGSI becomes:

FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL OUT[0], COLOR
DCL CONST[0..3]
DCL TEMP[0..2], LOCAL
IMM[0] FLT32 {0., 1., 0., 0.}
IMM[1] INT32 {0, 0, 0, 0}
  0: MOV TEMP[0].yzw, IMM[0].
  1: FSNE TEMP[1].xy, CONST[2].xyyy, CONST[0].xyyy
  2: OR TEMP[1].x, TEMP[1]., TEMP[1].
  3: FSNE TEMP[2].xy, CONST[3].xyyy, CONST[1].xyyy
  4: OR TEMP[2].x, TEMP[2]., TEMP[2].
  5: MOV TEMP[1].y, TEMP[2].
  6: DP2 TEMP[1].x, TEMP[1].xyyy, TEMP[1].xyyy
  7: USNE TEMP[1].x, TEMP[1]., IMM[1].
  8: NOT TEMP[1].x, TEMP[1].
  9: AND TEMP[0].x, TEMP[1]., IMM[0].
 10: MOV OUT[0], TEMP[0]
 11: END

Note that FSNE/OR are used, implying that the integer version of these is expected. However then it goes on to use DP2, which, as I understand, does a floating point multiply + add.

Now, this _happens_ to work out, since the integer representations of float 0 and int 0 are the same, and those are really the only possibilities we care about. However this seems really dodgy... wouldn't it be clearer to use either SNE + OR (which would still work!) + DP2, or alternatively AND them all together instead of SNE/DP2? This seems to come in via ir_unop_any_nequal. IMO the latter would be better since it keeps things in integer space, and presumably AND's are cheaper than fmul/fadd. I noticed this because nouveau's codegen logic isn't able to optimize this intelligently and I was trying to figure out why.

Thoughts?

-ilia
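The "happens to work" claim can be checked on the host with a small sketch. It assumes integer booleans are encoded as 0 / ~0 (my reading of what FSNE produces) and uses ordinary IEEE-754 host floats, so it says nothing about how a particular GPU treats NaNs or denormals. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret raw bits as a float and back, without aliasing violations. */
static float
bits_to_float(uint32_t u)
{
   float f;
   memcpy(&f, &u, sizeof f);
   return f;
}

static uint32_t
float_to_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof u);
   return u;
}

/* any() the way the generated TGSI does it: DP2 of the raw boolean bits
 * with themselves (x*x + y*y in float), then an integer != 0 test (USNE). */
static uint32_t
any_via_dp2(uint32_t a, uint32_t b)
{
   float x = bits_to_float(a);
   float y = bits_to_float(b);
   float dot = x * x + y * y;
   return float_to_bits(dot) != 0 ? 0xffffffffu : 0u;
}

/* any() kept entirely in integer space. */
static uint32_t
any_via_or(uint32_t a, uint32_t b)
{
   return a | b;
}
```

With ~0 inputs the float path goes through a NaN, whose bit pattern is nonzero, so both versions agree on all four input combinations here. That agreement depends on the NaN surviving the multiply-add, which is the "dodgy" part.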
[Mesa-dev] [PATCH] tgsi: support parsing texture offsets from text tgsi shaders
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
 src/gallium/auxiliary/tgsi/tgsi_text.c | 53 ++
 1 file changed, 48 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c b/src/gallium/auxiliary/tgsi/tgsi_text.c
index 2b2e7d5..7e50d8d 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -735,8 +735,9 @@ parse_dst_operand(
 static boolean
 parse_optional_swizzle(
    struct translate_ctx *ctx,
-   uint swizzle[4],
-   boolean *parsed_swizzle )
+   uint *swizzle,
+   boolean *parsed_swizzle,
+   int components)
 {
    const char *cur = ctx->cur;

@@ -748,7 +749,7 @@ parse_optional_swizzle(
       cur++;
       eat_opt_white( &cur );
-      for (i = 0; i < 4; i++) {
+      for (i = 0; i < components; i++) {
          if (uprcase( *cur ) == 'X')
             swizzle[i] = TGSI_SWIZZLE_X;
          else if (uprcase( *cur ) == 'Y')
@@ -816,7 +817,7 @@ parse_src_operand(

    /* Parse optional swizzle.
     */
-   if (parse_optional_swizzle( ctx, swizzle, &parsed_swizzle )) {
+   if (parse_optional_swizzle( ctx, swizzle, &parsed_swizzle, 4 )) {
       if (parsed_swizzle) {
          src->Register.SwizzleX = swizzle[0];
          src->Register.SwizzleY = swizzle[1];
@@ -839,6 +840,35 @@ parse_src_operand(
 }

 static boolean
+parse_texoffset_operand(
+   struct translate_ctx *ctx,
+   struct tgsi_texture_offset *src )
+{
+   uint file;
+   uint swizzle[3];
+   boolean parsed_swizzle;
+   struct parsed_bracket bracket;
+
+   if (!parse_register_src(ctx, &file, &bracket))
+      return FALSE;
+
+   src->File = file;
+   src->Index = bracket.index;
+
+   /* Parse optional swizzle.
+    */
+   if (parse_optional_swizzle( ctx, swizzle, &parsed_swizzle, 3 )) {
+      if (parsed_swizzle) {
+         src->SwizzleX = swizzle[0];
+         src->SwizzleY = swizzle[1];
+         src->SwizzleZ = swizzle[2];
+      }
+   }
+
+   return TRUE;
+}
+
+static boolean
 match_inst(const char **pcur,
            unsigned *saturate,
            const struct tgsi_opcode_info *info)
@@ -904,7 +934,7 @@ parse_instruction(
       if (!parse_register_1d( ctx, &file, &index ))
          return FALSE;

-      if (parse_optional_swizzle( ctx, swizzle, &parsed_swizzle )) {
+      if (parse_optional_swizzle( ctx, swizzle, &parsed_swizzle, 4 )) {
          if (parsed_swizzle) {
             inst.Predicate.SwizzleX = swizzle[0];
             inst.Predicate.SwizzleY = swizzle[1];
@@ -1003,6 +1033,19 @@ parse_instruction(

    cur = ctx->cur;
    eat_opt_white( &cur );
+   for (i = 0; info->is_tex && *cur == ','; i++) {
+      cur++;
+      eat_opt_white( &cur );
+      ctx->cur = cur;
+      if (!parse_texoffset_operand( ctx, &inst.TexOffsets[i] ))
+         return FALSE;
+      cur = ctx->cur;
+      eat_opt_white( &cur );
+   }
+   inst.Texture.NumOffsets = i;
+
+   cur = ctx->cur;
+   eat_opt_white( &cur );
    if (info->is_branch && *cur == ':') {
       uint target;

--
1.8.3.2
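The core idea of the patch, parsing a swizzle of a caller-chosen width (4 for ordinary operands, 3 for texture offsets), can be sketched independently of the TGSI parser. Everything below is a simplified illustration: `parse_swizzle()` and the `SWZ_*` values are hypothetical and do not reproduce the real `parse_optional_swizzle()` behaviour in every detail, in particular its error reporting.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical component indices, standing in for TGSI_SWIZZLE_*. */
enum { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W };

/* Parse an optional ".xyzw"-style swizzle with exactly `components` letters.
 * Returns false on a malformed swizzle. *parsed tells the caller whether a
 * swizzle was present at all; *pcur only advances when one was consumed. */
static bool
parse_swizzle(const char **pcur, unsigned *swizzle, bool *parsed,
              int components)
{
   const char *cur = *pcur;

   *parsed = false;
   while (*cur == ' ' || *cur == '\t')
      cur++;
   if (*cur != '.')
      return true;               /* no swizzle: not an error */
   cur++;

   for (int i = 0; i < components; i++) {
      switch (toupper((unsigned char)*cur)) {
      case 'X': swizzle[i] = SWZ_X; break;
      case 'Y': swizzle[i] = SWZ_Y; break;
      case 'Z': swizzle[i] = SWZ_Z; break;
      case 'W': swizzle[i] = SWZ_W; break;
      default:  return false;    /* too short or not a component letter */
      }
      cur++;
   }

   *parsed = true;
   *pcur = cur;
   return true;
}
```

Passing the width as a parameter is what lets one routine serve both the 4-component register swizzles and the 3-component texture offsets, instead of duplicating the letter-matching loop.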
[Mesa-dev] [Bug 78393] Black zebra like lines while playing games on open source drivers
https://bugs.freedesktop.org/show_bug.cgi?id=78393 --- Comment #1 from Michel Dänzer mic...@daenzer.net --- You mentioned on the xorg mailing list that this worked properly two months ago, so there seems to have been a regression. Can you try bisecting, or at least determining which versions of Mesa are broken / working?
Re: [Mesa-dev] odd translation from glsl to tgsi for ir_unop_any_nequal
On Wed, May 7, 2014 at 8:38 PM, Ilia Mirkin imir...@alum.mit.edu wrote:

[...]

Now, this _happens_ to work out, since the integer representations of float 0 and int 0 are the same, and those are really the only possibilities we care about. However this seems really dodgy... wouldn't it be clearer to use either SNE + OR (which would still work!) + DP2, or alternatively AND them all together instead of SNE/DP2? This seems to come in via ir_unop_any_nequal. IMO the latter would be better since it keeps

Erm, sorry -- the email subject and this sentence aren't _quite_ accurate. That should be ir_unop_any. ir_binop_any_nequal is what generates the FSNE/OR combos. But everything else still holds :)

things in integer space, and presumably AND's are cheaper than fmul/fadd. I noticed this because nouveau's codegen logic isn't able to optimize this intelligently and I was trying to figure out why.

Thoughts?

-ilia
[Mesa-dev] [PATCH 1/4] i965/Gen7: Set up layer constraints properly for renderbuffers
There were a few problems here, which mostly just broke layered rendering into a view:

- Render target view extent was always set to be == depth. This is benign for non-layered-rendering, but allows writes off the end of the render target for layered rendering, which ends badly.
- Layered rendering did not honor the mt_layer setting, so would not properly handle MinLayer being set on a view.

Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index d71a1d1..e365860 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -454,9 +454,11 @@ gen7_update_renderbuffer_surface(struct brw_context *brw,
    mesa_format rb_format = _mesa_get_render_format(ctx, intel_rb_format(irb));
    uint32_t surftype;
    bool is_array = false;
-   int depth = MAX2(rb->Depth, 1);
-   int min_array_element;
+   int depth = irb->layer_count;
    const uint8_t mocs = GEN7_MOCS_L3;
+
+   int min_array_element = irb->mt_layer / MAX2(mt->num_samples, 1);
+
    GLenum gl_target = rb->TexImage ?
                          rb->TexImage->TexObject->Target : GL_TEXTURE_2D;

@@ -486,20 +488,15 @@ gen7_update_renderbuffer_surface(struct brw_context *brw,
       is_array = true;
       depth *= 6;
       break;
+   case GL_TEXTURE_3D:
+      depth = rb->Depth;
+      /* fallthrough */
    default:
       surftype = translate_tex_target(gl_target);
       is_array = _mesa_tex_target_is_array(gl_target);
       break;
    }

-   if (layered) {
-      min_array_element = 0;
-   } else if (irb->mt->num_samples > 1) {
-      min_array_element = irb->mt_layer / irb->mt->num_samples;
-   } else {
-      min_array_element = irb->mt_layer;
-   }
-
    surf[0] = surftype << BRW_SURFACE_TYPE_SHIFT |
              format << BRW_SURFACE_FORMAT_SHIFT |
              (irb->mt->array_spacing_lod0 ? GEN7_SURFACE_ARYSPC_LOD0
--
1.9.2
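The new one-line computation folds the old three-way branch into a single expression. A minimal sketch of that arithmetic follows, assuming (as the removed code suggests) that for multisampled color surfaces each logical layer spans num_samples physical slices. The function name is made up, and the MAX2 macro is redefined locally so the snippet stands alone.

```c
#include <assert.h>

/* Local copy of the usual two-argument maximum macro. */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Logical minimum array element for a render target, given the physical
 * layer index and the sample count. A sample count of 0 or 1 means
 * single-sampled, in which case physical and logical layers coincide. */
static int
min_array_element_for(int mt_layer, int num_samples)
{
   assert(mt_layer >= 0);
   return mt_layer / MAX2(num_samples, 1);
}
```

Unlike the removed branch, this is applied in the layered case too, which is what lets a view's MinLayer carry through.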
[Mesa-dev] [PATCH 4/4] i965/Gen8: Set up layer constraints properly for depth buffers
Same issues as the previous commit fixed for Gen7:

- Bogus physical->logical layer conversion; depth/stencil surfaces are still IMS layout on Gen8.
- mt_layer ignored in layered rendering case, which breaks handling of views with MinLayer.
- Render target array extent not set correctly for arrays.

I'm not able to test this one since I can't get a Broadwell yet, but it's the same set of fixes as for Gen7.

Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/drivers/dri/i965/gen8_depth_state.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c b/src/mesa/drivers/dri/i965/gen8_depth_state.c
index f6031e9..aeadfef 100644
--- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
@@ -168,7 +168,7 @@ gen8_emit_depth_stencil_hiz(struct brw_context *brw,
    rb = (struct gl_renderbuffer *) irb;

    if (rb) {
-      depth = MAX2(rb->Depth, 1);
+      depth = irb->layer_count;
       if (rb->TexImage)
          gl_target = rb->TexImage->TexObject->Target;
    }
@@ -184,19 +184,16 @@ gen8_emit_depth_stencil_hiz(struct brw_context *brw,
       surftype = BRW_SURFACE_2D;
       depth *= 6;
       break;
+   case GL_TEXTURE_3D:
+      assert(rb);
+      depth = rb->Depth;
+      /* fallthrough */
    default:
       surftype = translate_tex_target(gl_target);
       break;
    }

-   if (fb->MaxNumLayers > 0 || !irb) {
-      min_array_element = 0;
-   } else if (irb->mt->num_samples > 1) {
-      /* Convert physical to logical layer. */
-      min_array_element = irb->mt_layer / irb->mt->num_samples;
-   } else {
-      min_array_element = irb->mt_layer;
-   }
+   min_array_element = irb ? irb->mt_layer : 0;

    lod = irb ? irb->mt_level - irb->mt->first_level : 0;

--
1.9.2
[Mesa-dev] [PATCH 0/4] Fixes for layered/msaa/view interactions
Here's a bunch of fixes for issues I noticed in color renderbuffer and depthbuffer setup while reviewing Ken's patch to enable views on Gen8.
[Mesa-dev] [PATCH 2/4] i965/Gen8: Set up layer constraints properly for renderbuffers
Fixing the same issues the previous commit does for Gen7. Note that I can't test this one, since I don't have a Broadwell.

Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/drivers/dri/i965/gen8_surface_state.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 564d275..6fd0ad4 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -275,8 +275,8 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
    uint32_t format = 0;
    uint32_t surf_type;
    bool is_array = false;
-   int depth = MAX2(rb->Depth, 1);
-   int min_array_element;
+   int depth = irb->layer_count;
+   int min_array_element = irb->mt_layer / MAX2(mt->num_samples, 1);

    GLenum gl_target = rb->TexImage ? rb->TexImage->TexObject->Target : GL_TEXTURE_2D;
@@ -296,20 +296,15 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
       is_array = true;
       depth *= 6;
       break;
+   case GL_TEXTURE_3D:
+      depth = rb->Depth;
+      /* fallthrough */
    default:
       surf_type = translate_tex_target(gl_target);
       is_array = _mesa_tex_target_is_array(gl_target);
       break;
    }

-   if (layered) {
-      min_array_element = 0;
-   } else if (mt->num_samples > 1) {
-      min_array_element = irb->mt_layer / mt->num_samples;
-   } else {
-      min_array_element = irb->mt_layer;
-   }
-
    /* _NEW_BUFFERS */
    mesa_format rb_format = _mesa_get_render_format(ctx, intel_rb_format(irb));
    assert(brw_render_target_supported(brw, rb));
--
1.9.2
[Mesa-dev] [PATCH 3/4] i965/Gen7: Set up layer constraints properly for depth buffers
Again, a few problems:

- Layered attachments did not honor MinLayer.
- Non-layered MSAA attachments rendered to the wrong layer due to dividing by the layer count. All depth buffers use the IMS layout, so the physical layer count == logical layer count.
- Layered attachments were not limited to irb->layer_count, so we could render off the end of the texture.

Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/drivers/dri/i965/gen7_misc_state.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_misc_state.c b/src/mesa/drivers/dri/i965/gen7_misc_state.c
index b6759f1..593a042 100644
--- a/src/mesa/drivers/dri/i965/gen7_misc_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_misc_state.c
@@ -65,7 +65,7 @@ gen7_emit_depth_stencil_hiz(struct brw_context *brw,
    rb = (struct gl_renderbuffer*) irb;

    if (rb) {
-      depth = MAX2(rb->Depth, 1);
+      depth = irb->layer_count;
       if (rb->TexImage)
          gl_target = rb->TexImage->TexObject->Target;
    }
@@ -81,19 +81,16 @@ gen7_emit_depth_stencil_hiz(struct brw_context *brw,
       surftype = BRW_SURFACE_2D;
       depth *= 6;
       break;
+   case GL_TEXTURE_3D:
+      assert(rb);
+      depth = rb->Depth;
+      /* fallthrough */
    default:
       surftype = translate_tex_target(gl_target);
       break;
    }

-   if (fb->MaxNumLayers > 0 || !irb) {
-      min_array_element = 0;
-   } else if (irb->mt->num_samples > 1) {
-      /* Convert physical layer to logical layer. */
-      min_array_element = irb->mt_layer / irb->mt->num_samples;
-   } else {
-      min_array_element = irb->mt_layer;
-   }
+   min_array_element = irb ? irb->mt_layer : 0;

    lod = irb ? irb->mt_level - irb->mt->first_level : 0;

--
1.9.2