[Mesa-dev] [Bug 55998] Pretty huge slowdown in mesa 9.0
https://bugs.freedesktop.org/show_bug.cgi?id=55998

Kenneth Graunke kenn...@whitecape.org changed:

           What    |Removed |Added
         Status    |NEW     |RESOLVED
     Resolution    |---     |NOTOURBUG

--- Comment #26 from Kenneth Graunke kenn...@whitecape.org ---
(In reply to comment #25)
> I've pushed a patch to KWin that should fix the problem.
> The patch is in both master and the 4.9 branch, so you should see it in
> 4.9.3. That release will be tagged on Thursday.

Thanks Fredrik! Based on that, I'm closing this as RESOLVED/NOTOURBUG.

> @Rune: The GLX backend in KWin uses glXGetFBConfigs(), so the ordering rules
> in the spec don't apply here. I've thought about rewriting that code to use
> glXChooseFBConfig(), but I'm reluctant to touch working code without a good
> reason. This bug might be enough of a reason to do that though.

@Rune: Feel free to double check by running glxinfo, but I believe the
ordering is correct: multisample configs are always sorted later, as required.

Using glXChooseFBConfig() does seem like a good idea...could guard against
future problems (though I don't know what), and should simplify the code a
fair bit. Also, KWin's EGL backend uses eglChooseConfig() and it never had
this problem. That would also make the GLX and EGL backends more similar.

Thanks again.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 56605] New: [regression - build error] /usr/bin/ld: ../../auxiliary//libgallium.a(u_dl.o): undefined reference to symbol 'dlopen@@GLIBC_2.1'
https://bugs.freedesktop.org/show_bug.cgi?id=56605

          Priority: medium
            Bug ID: 56605
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: [regression - build error] /usr/bin/ld:
                    ../../auxiliary//libgallium.a(u_dl.o): undefined
                    reference to symbol 'dlopen@@GLIBC_2.1'
          Severity: normal
    Classification: Unclassified
                OS: Linux (All)
          Reporter: fabio@libero.it
          Hardware: All
            Status: NEW
           Version: git
         Component: Mesa core
           Product: Mesa

Something between 0a66ced8f822b0d5478b0cd6d72c1a6ad70647a2 and
183e122bdfe27f875c3c121964484dae9587c051 broke build in Ubuntu 12.04.

Error:

g++ -Wl,-Bsymbolic-functions -Wl,-z,relro -L/usr/lib/llvm-3.1/lib -lpthread -lffi -ldl -lm lp_test_blend.o lp_test_main.o -o lp_test_blend -Wl,--start-group -L../../auxiliary/ -lgallium libllvmpipe.a -lLLVM-3.1 -lXext -lXdamage -lXfixes -lX11-xcb -lX11 -lxcb-glx -lxcb-dri2 -lxcb -lXxf86vm -ldrm -lm -lpthread -ldl -Wl,--end-group
/usr/bin/ld: ../../auxiliary//libgallium.a(u_dl.o): undefined reference to symbol 'dlopen@@GLIBC_2.1'
/usr/bin/ld: note: 'dlopen@@GLIBC_2.1' is defined in DSO /usr/lib/gcc/i686-linux-gnu/4.6/../../../i386-linux-gnu/libdl.so so try adding it to the linker command line
/usr/lib/gcc/i686-linux-gnu/4.6/../../../i386-linux-gnu/libdl.so: could not read symbols: Invalid operation

Full build log:
https://launchpadlibrarian.net/121714750/buildlog_ubuntu-precise-i386.mesa_9.1~git1210310858.183e12~gd~p_FAILEDTOBUILD.txt.gz
[Mesa-dev] [Bug 56605] [build error when building without --enable-debug] /usr/bin/ld: ../../auxiliary//libgallium.a(u_dl.o): undefined reference to symbol 'dlopen@@GLIBC_2.1'
https://bugs.freedesktop.org/show_bug.cgi?id=56605

Fabio Pedretti fabio@libero.it changed:

           What    |Summary
        Removed    |[regression - build error] /usr/bin/ld:
                   |../../auxiliary//libgallium.a(u_dl.o): undefined
                   |reference to symbol 'dlopen@@GLIBC_2.1'
          Added    |[build error when building without --enable-debug]
                   |/usr/bin/ld: ../../auxiliary//libgallium.a(u_dl.o):
                   |undefined reference to symbol 'dlopen@@GLIBC_2.1'

--- Comment #1 from Fabio Pedretti fabio@libero.it ---
Note: this only happens when building without --enable-debug and it's not a
recent regression as hinted in previous post.
Re: [Mesa-dev] R600 tiling halves the frame rate
On Tue, Oct 30, 2012 at 8:49 PM, Tzvetan Mikov tmi...@jupiter.com wrote:
> On 10/30/2012 05:20 PM, Tzvetan Mikov wrote:
>> Thanks a lot! I reproduced the same results here and I think I have
>> figured out what the problem is. The frame buffer is always created in
>> linear mode. The temporary hack included below doubles the performance
>> for me with EGL. Could you please check if it has the same result for you?
>>
>> If it does, what would be the next step to address this? I guess I could
>> try to prepare a real patch to fix this, as soon as I figure the right
>> way to do it... :-) I am new to Mesa, but I am making my way through the
>> code base.
>>
>> regards,
>> Tzvetan
>>
>> commit 10bb3497caba1655022a53a3a04c81be6e122faa
>> Author: Tzvetan Mikov tmi...@jupiter.com
>> Date:   Tue Oct 30 17:12:42 2012 -0700
>>
>>     r600_texture.c: HACK to enforce tiling in the default case
>>
>> diff --git a/src/gallium/drivers/r600/r600_texture.c b/src/gallium/drivers/r600/r600_texture.c
>> index 85e4e0c..f415de3 100644
>> --- a/src/gallium/drivers/r600/r600_texture.c
>> +++ b/src/gallium/drivers/r600/r600_texture.c
>> @@ -450,7 +450,7 @@ struct pipe_resource *r600_texture_create(struct pipe_screen *screen,
>>  {
>>      struct r600_screen *rscreen = (struct r600_screen*)screen;
>>      struct radeon_surface surface;
>> -    unsigned array_mode = 0;
>> +    unsigned array_mode = V_038000_ARRAY_1D_TILED_THIN1;
>>      int r;
>>
>>      if (!(templ->flags & R600_RESOURCE_FLAG_TRANSFER)) {
>
> I just noticed that with this hack the display doesn't look quite right,
> so while it hopefully points in the right direction, the real fix is
> likely to be much more involved. My enthusiasm may have been premature :-)
>
> regards,
> Tzvetan

For it to look right we need mesa to call into the kernel to tell the kernel
what is the bo tiling format. We should do that for scanout buffer. This will
fix your issue and you probably want 2d tiled not 1d for scanout.

Cheers,
Jerome
Re: [Mesa-dev] [PATCH] vbo: fix glVertexAttribI* functions
On 10/30/2012 04:32 PM, Marek Olšák wrote:
> The functions were broken, because they converted ints to floats.
> Now we can finally advertise OpenGL 3.0. ;)
>
> In this commit, the vbo module also tracks the type for each attrib
> in addition to the size. It can be one of FLOAT, INT, UNSIGNED_INT.
>
> The little ugliness is the vertex attribs are declared as floats even though
> there may be integer values. The code just copies integer values into them
> without any conversion.
>
> This implementation passes the glVertexAttribI piglit test which I am going
> to commit in piglit soon. The test covers vertex arrays, immediate mode and
> display lists.
>
> NOTE: This is likely a candidate for the stable branches.

Looks good. Just some minor things below.

Reviewed-by: Brian Paul bri...@vmware.com

> ---
>  docs/GL3.txt                  |    3 +-
>  src/mesa/main/macros.h        |   44 +
>  src/mesa/vbo/vbo_attrib_tmp.h |   86 ++---
>  src/mesa/vbo/vbo_context.h    |   42
>  src/mesa/vbo/vbo_exec.h       |    1 +
>  src/mesa/vbo/vbo_exec_api.c   |   29 +-
>  src/mesa/vbo/vbo_exec_draw.c  |    7 ++--
>  src/mesa/vbo/vbo_save.h       |    2 +
>  src/mesa/vbo/vbo_save_api.c   |   12 --
>  src/mesa/vbo/vbo_save_draw.c  |   21 +++---
>  10 files changed, 183 insertions(+), 64 deletions(-)
>
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index 4f44764..28f6ae6 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -34,8 +34,7 @@ sRGB framebuffer format (GL_EXT_framebuffer_sRGB) DONE (i965, r600)
>  glClearBuffer commands                               DONE
>  glGetStringi command                                 DONE
>  glTexParameterI, glGetTexParameterI commands         DONE
> -glVertexAttribI commands                             ~50% done (converts int
> -                                                     values to floats)
> +glVertexAttribI commands                             DONE
>  Depth format cube textures                           DONE
>  GLX_ARB_create_context (GLX 1.4 is required)         DONE
>
> diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
> index 7b7fd1b..f89533f 100644
> --- a/src/mesa/main/macros.h
> +++ b/src/mesa/main/macros.h
> @@ -171,6 +171,26 @@ extern GLfloat _mesa_ubyte_to_float_color_tab[256];
>         ub = ((GLubyte) F_TO_I((f) * 255.0F))
>  #endif
>
> +static inline float INT_AS_FLT(int i)
> +{
> +   union {
> +      int i;
> +      float f;
> +   } tmp;
> +   tmp.i = i;
> +   return tmp.f;
> +}

We have an fi_type union in imports.h, so:

static inline float INT_AS_FLT(int i)
{
   fi_type tmp;
   tmp.i = i;
   return tmp.f;
}

> +
> +static inline float UINT_AS_FLT(unsigned u)
> +{
> +   union {
> +      unsigned u;
> +      float f;
> +   } tmp;
> +   tmp.u = u;
> +   return tmp.f;
> +}

You could add an unsigned field to the fi_type union and use it here too, if
you want.

> +
>  /*@}*/
>
> @@ -573,6 +593,30 @@ do { \
>  /*@}*/
>
> +/** Copy \p sz elements into a homegeneous (4-element) vector, giving
> + * default values to the remaining components.
> + * The default values are chosen based on \p type. */

Closing */ on its own line, please.

> +static inline void
> +COPY_CLEAN_4V_TYPE_AS_FLOAT(GLfloat dst[4], int sz, const GLfloat src[4],
> +                            GLenum type)
> +{
> +   switch (type) {
> +   case GL_FLOAT:
> +      ASSIGN_4V(dst, 0, 0, 0, 1);
> +      break;
> +   case GL_INT:
> +      ASSIGN_4V(dst, INT_AS_FLT(0), INT_AS_FLT(0),
> +                     INT_AS_FLT(0), INT_AS_FLT(1));
> +      break;
> +   case GL_UNSIGNED_INT:
> +      ASSIGN_4V(dst, UINT_AS_FLT(0), UINT_AS_FLT(0),
> +                     UINT_AS_FLT(0), UINT_AS_FLT(1));
> +      break;
> +   default:
> +      ASSERT(0);
> +   }
> +   COPY_SZ_4V(dst, sz, src);
> +}
>
>  /** \name Linear interpolation functions */
>  /*@{*/
> diff --git a/src/mesa/vbo/vbo_attrib_tmp.h b/src/mesa/vbo/vbo_attrib_tmp.h
> index 8848445..6bc53ba 100644
> --- a/src/mesa/vbo/vbo_attrib_tmp.h
> +++ b/src/mesa/vbo/vbo_attrib_tmp.h
> @@ -26,38 +26,46 @@ USE OR OTHER DEALINGS IN THE SOFTWARE.
>
>  **/
>
>  /* float */
> -#define ATTR1FV( A, V ) ATTR( A, 1, (V)[0], 0, 0, 1 )
> -#define ATTR2FV( A, V ) ATTR( A, 2, (V)[0], (V)[1], 0, 1 )
> -#define ATTR3FV( A, V ) ATTR( A, 3, (V)[0], (V)[1], (V)[2], 1 )
> -#define ATTR4FV( A, V ) ATTR( A, 4, (V)[0], (V)[1], (V)[2], (V)[3] )
> +#define ATTR1FV( A, V ) ATTR( A, 1, GL_FLOAT, (V)[0], 0, 0, 1 )
> +#define ATTR2FV( A, V ) ATTR( A, 2, GL_FLOAT, (V)[0], (V)[1], 0, 1 )
> +#define ATTR3FV( A, V ) ATTR( A, 3, GL_FLOAT, (V)[0], (V)[1], (V)[2], 1 )
> +#define ATTR4FV( A, V ) ATTR( A, 4, GL_FLOAT, (V)[0], (V)[1], (V)[2], (V)[3] )
>
> -#define ATTR1F( A, X )          ATTR( A, 1, X, 0, 0, 1 )
> -#define ATTR2F( A, X, Y )       ATTR( A, 2, X, Y, 0, 1 )
> -#define ATTR3F( A, X, Y, Z )    ATTR( A, 3, X, Y, Z, 1 )
> -#define ATTR4F( A, X, Y, Z, W ) ATTR( A, 4, X, Y, Z, W )
> +#define ATTR1F( A, X )          ATTR( A, 1, GL_FLOAT, X, 0, 0, 1 )
> +#define
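The bit-reinterpretation idea behind INT_AS_FLT in the patch quoted above can be tried in isolation. What follows is only a minimal sketch, not Mesa code: the names int_bits_as_float and float_bits_as_int are made up for illustration, and it assumes a 32-bit int32_t and a 32-bit IEEE 754 float. The point is that the integer travels through a float-typed slot with its bits unchanged, so no int-to-float conversion ever happens.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; they mirror the union trick
 * in the quoted patch but are not part of Mesa. */
static inline float int_bits_as_float(int32_t i)
{
   union {
      int32_t i;
      float f;
   } tmp;

   /* The union members must overlay exactly for the trick to work. */
   assert(sizeof(int32_t) == sizeof(float));
   tmp.i = i;      /* store the integer bit pattern ...            */
   return tmp.f;   /* ... and read the same bits back as a float   */
}

static inline int32_t float_bits_as_int(float f)
{
   union {
      int32_t i;
      float f;
   } tmp;

   tmp.f = f;
   return tmp.i;
}
```

Reading the slot back with float_bits_as_int() recovers the original integer, which is what lets an attribute array declared as floats carry GL_INT data.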
[Mesa-dev] [PATCH] nv50,nvc0: expose ARB_map_buffer_alignment
All HW buffers (also suballocated ones) are already aligned. Just make sure
that also the initial sysram buffers have proper alignment.
---
Passes the ARB_map_buffer_alignment piglit test on nv50. Not tested on nvc0.
---
 src/gallium/drivers/nouveau/nouveau_buffer.c | 6 +++---
 src/gallium/drivers/nouveau/nouveau_mm.c     | 2 +-
 src/gallium/drivers/nv50/nv50_screen.c       | 3 ++-
 src/gallium/drivers/nvc0/nvc0_screen.c       | 3 ++-
 4 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_buffer.c b/src/gallium/drivers/nouveau/nouveau_buffer.c
index fb929d6..0ecd53a 100644
--- a/src/gallium/drivers/nouveau/nouveau_buffer.c
+++ b/src/gallium/drivers/nouveau/nouveau_buffer.c
@@ -43,7 +43,7 @@ nouveau_buffer_allocate(struct nouveau_screen *screen,
    }
    if (domain != NOUVEAU_BO_GART) {
       if (!buf->data) {
-         buf->data = MALLOC(buf->base.width0);
+         buf->data = align_malloc(buf->base.width0, 64);
          if (!buf->data)
             return FALSE;
       }
@@ -92,7 +92,7 @@ nouveau_buffer_destroy(struct pipe_screen *pscreen,
    nouveau_buffer_release_gpu_storage(res);

    if (res->data && !(res->status & NOUVEAU_BUFFER_STATUS_USER_MEMORY))
-      FREE(res->data);
+      align_free(res->data);

    nouveau_fence_ref(NULL, &res->fence);
    nouveau_fence_ref(NULL, &res->fence_wr);
@@ -457,7 +457,7 @@ nouveau_buffer_migrate(struct nouveau_context *nv,
       if (ret)
          return ret;
       memcpy((uint8_t *)buf->bo->map + buf->offset, buf->data, size);
-      FREE(buf->data);
+      align_free(buf->data);
    } else
    if (old_domain != 0 && new_domain != 0) {
       struct nouveau_mm_allocation *mm = buf->mm;
diff --git a/src/gallium/drivers/nouveau/nouveau_mm.c b/src/gallium/drivers/nouveau/nouveau_mm.c
index 4207084..6045af6 100644
--- a/src/gallium/drivers/nouveau/nouveau_mm.c
+++ b/src/gallium/drivers/nouveau/nouveau_mm.c
@@ -9,7 +9,7 @@
 #include "nouveau_screen.h"
 #include "nouveau_mm.h"

-#define MM_MIN_ORDER 7
+#define MM_MIN_ORDER 7 /* >= 6 to not violate ARB_map_buffer_alignment */
 #define MM_MAX_ORDER 20

 #define MM_NUM_BUCKETS (MM_MAX_ORDER - MM_MIN_ORDER + 1)
diff --git a/src/gallium/drivers/nv50/nv50_screen.c b/src/gallium/drivers/nv50/nv50_screen.c
index 9461af9..d0a0295 100644
--- a/src/gallium/drivers/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nv50/nv50_screen.c
@@ -170,11 +170,12 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
       return 1;
    case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
       return 256;
+   case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
+      return 64;
    case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
    case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
    case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
    case PIPE_CAP_TEXTURE_MULTISAMPLE:
-   case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
       return 0;
    default:
       NOUVEAU_ERR("unknown PIPE_CAP %d\n", param);
diff --git a/src/gallium/drivers/nvc0/nvc0_screen.c b/src/gallium/drivers/nvc0/nvc0_screen.c
index 0e0b666..3bf2191 100644
--- a/src/gallium/drivers/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nvc0/nvc0_screen.c
@@ -148,11 +148,12 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
       return 1;
    case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
       return 256;
+   case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
+      return 64;
    case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
    case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
    case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
    case PIPE_CAP_TEXTURE_MULTISAMPLE:
-   case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
       return 0;
    default:
       NOUVEAU_ERR("unknown PIPE_CAP %d\n", param);
-- 
1.7.11.7
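For readers unfamiliar with why the patch above pairs every aligned allocation with align_free() instead of FREE(): an aligned allocator commonly hands out a pointer that is not the one malloc() returned, so plain free() on it would be wrong. The sketch below illustrates one common over-allocate-and-round-up scheme. It is an assumption for illustration, not Mesa's align_malloc() implementation; sketch_align_malloc and sketch_align_free are hypothetical names, and overflow of the size computation is not checked.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not Mesa's align_malloc(): over-allocate, round the
 * address up to a power-of-two alignment, and remember the pointer that
 * malloc() really returned in the slot just below the aligned block. */
static void *sketch_align_malloc(size_t bytes, size_t alignment)
{
   void *raw;
   uintptr_t addr;

   /* alignment must be a non-zero power of two */
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

   /* keep the bookkeeping slot itself pointer-aligned */
   if (alignment < sizeof(void *))
      alignment = sizeof(void *);

   raw = malloc(bytes + alignment + sizeof(void *));
   if (!raw)
      return NULL;

   addr = (uintptr_t)raw + sizeof(void *);
   addr = (addr + alignment - 1) & ~(uintptr_t)(alignment - 1);
   ((void **)addr)[-1] = raw;
   return (void *)addr;
}

/* Must be used instead of free() for blocks from sketch_align_malloc(). */
static void sketch_align_free(void *ptr)
{
   if (ptr)
      free(((void **)ptr)[-1]);
}
```

With a 64-byte alignment, as reported for PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT in the patch, the returned address is always a multiple of 64.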
[Mesa-dev] [PATCH] AMDGPU: Don't allow using SI SGPRs 102 and 103 directly.
From: Michel Dänzer michel.daen...@amd.com

Two SGPRs are used for VCC, so it's not possible to use these and VCC
together.

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 lib/Target/AMDGPU/SIRegisterInfo.td |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/Target/AMDGPU/SIRegisterInfo.td b/lib/Target/AMDGPU/SIRegisterInfo.td
index a3d91ae..e52311a 100644
--- a/lib/Target/AMDGPU/SIRegisterInfo.td
+++ b/lib/Target/AMDGPU/SIRegisterInfo.td
@@ -65,12 +65,12 @@ def SAMPLE_COVERAGE : SIReg <"SAMPLE_COVERAGE">;
 def POS_FIXED_PT : SIReg <"POS_FIXED_PT">;

 // SGPR 32-bit registers
-foreach Index = 0-103 in {
+foreach Index = 0-101 in {
   def SGPR#Index : SGPR_32 <Index, "SGPR"#Index>;
 }

 def SGPR_32 : RegisterClass<"AMDGPU", [f32, i32], 32,
-    (add (sequence "SGPR%u", 0, 103))>;
+    (add (sequence "SGPR%u", 0, 101))>;

 // SGPR 64-bit registers
 def SGPR_64 : RegisterTuples<[low, high],
-- 
1.7.10.4
Re: [Mesa-dev] [PATCH] AMDGPU: Only allow SGPR for the first operand of SI VOP3 instructions.
On Tue, Oct 30, 2012 at 12:48 PM, Michel Dänzer mic...@daenzer.net wrote:
> From: Michel Dänzer michel.daen...@amd.com
>
> This is technically too strict: While a VOP3 instruction can only use one
> SGPR, it can be used for any operand, even for several operands at the same
> time. But for now this is a simple solution which fixes the problem (e.g.
> causing broken linear fog with radeonsi) at little extra cost (in the form
> of V_MOV_* from SGPR to VGPR).
>
> Signed-off-by: Michel Dänzer michel.daen...@amd.com

Looks good to me until we figure out a better plan for dealing with VGPRs vs.
SGPRs.

Reviewed-by: Alex Deucher alexander.deuc...@amd.com

> ---
>  lib/Target/AMDGPU/SIInstrFormats.td |    4 ++--
>  lib/Target/AMDGPU/SIInstructions.td |    4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/lib/Target/AMDGPU/SIInstrFormats.td b/lib/Target/AMDGPU/SIInstrFormats.td
> index 97d54ac..aea3b5a 100644
> --- a/lib/Target/AMDGPU/SIInstrFormats.td
> +++ b/lib/Target/AMDGPU/SIInstrFormats.td
> @@ -35,10 +35,10 @@ class VOP3_1_32 <bits<9> op, string opName, list<dag> pattern>
>    : VOP3b_2IN <op, opName, SReg_1, AllReg_32, VReg_32, pattern>;
>
>  class VOP3_32 <bits<9> op, string opName, list<dag> pattern>
> -  : VOP3 <op, (outs VReg_32:$dst), (ins AllReg_32:$src0, AllReg_32:$src1, AllReg_32:$src2, i32imm:$src3, i32imm:$src4, i32imm:$src5, i32imm:$src6), opName, pattern>;
> +  : VOP3 <op, (outs VReg_32:$dst), (ins AllReg_32:$src0, VReg_32:$src1, VReg_32:$src2, i32imm:$src3, i32imm:$src4, i32imm:$src5, i32imm:$src6), opName, pattern>;
>
>  class VOP3_64 <bits<9> op, string opName, list<dag> pattern>
> -  : VOP3 <op, (outs VReg_64:$dst), (ins AllReg_64:$src0, AllReg_64:$src1, AllReg_64:$src2, i32imm:$src3, i32imm:$src4, i32imm:$src5, i32imm:$src6), opName, pattern>;
> +  : VOP3 <op, (outs VReg_64:$dst), (ins AllReg_64:$src0, VReg_64:$src1, VReg_64:$src2, i32imm:$src3, i32imm:$src4, i32imm:$src5, i32imm:$src6), opName, pattern>;
>
>  class SOP1_32 <bits<8> op, string opName, list<dag> pattern>
> diff --git a/lib/Target/AMDGPU/SIInstructions.td b/lib/Target/AMDGPU/SIInstructions.td
> index cb94381..bdac6a4 100644
> --- a/lib/Target/AMDGPU/SIInstructions.td
> +++ b/lib/Target/AMDGPU/SIInstructions.td
> @@ -1249,8 +1249,8 @@ def : Pat <
>  /** VOP3 Patterns**/
>  /** == **/
>
> -def : Pat <(f32 (IL_mad AllReg_32:$src0, AllReg_32:$src1, AllReg_32:$src2)),
> -           (V_MAD_LEGACY_F32 AllReg_32:$src0, AllReg_32:$src1, AllReg_32:$src2,
> +def : Pat <(f32 (IL_mad AllReg_32:$src0, VReg_32:$src1, VReg_32:$src2)),
> +           (V_MAD_LEGACY_F32 AllReg_32:$src0, VReg_32:$src1, VReg_32:$src2,
>             0, 0, 0, 0)>;
>
>  } // End isSI predicate
> --
> 1.7.10.4
Re: [Mesa-dev] [PATCH] AMDGPU: Don't allow using SI SGPRs 102 and 103 directly.
On Wed, Oct 31, 2012 at 05:00:04PM +0100, Michel Dänzer wrote:
> From: Michel Dänzer michel.daen...@amd.com
>
> Two SGPRs are used for VCC, so it's not possible to use these and VCC
> together.

Reviewed-by: Tom Stellard thomas.stel...@amd.com

> Signed-off-by: Michel Dänzer michel.daen...@amd.com
> ---
>  lib/Target/AMDGPU/SIRegisterInfo.td |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/Target/AMDGPU/SIRegisterInfo.td b/lib/Target/AMDGPU/SIRegisterInfo.td
> index a3d91ae..e52311a 100644
> --- a/lib/Target/AMDGPU/SIRegisterInfo.td
> +++ b/lib/Target/AMDGPU/SIRegisterInfo.td
> @@ -65,12 +65,12 @@ def SAMPLE_COVERAGE : SIReg <"SAMPLE_COVERAGE">;
>  def POS_FIXED_PT : SIReg <"POS_FIXED_PT">;
>
>  // SGPR 32-bit registers
> -foreach Index = 0-103 in {
> +foreach Index = 0-101 in {
>    def SGPR#Index : SGPR_32 <Index, "SGPR"#Index>;
>  }
>
>  def SGPR_32 : RegisterClass<"AMDGPU", [f32, i32], 32,
> -    (add (sequence "SGPR%u", 0, 103))>;
> +    (add (sequence "SGPR%u", 0, 101))>;
>
>  // SGPR 64-bit registers
>  def SGPR_64 : RegisterTuples<[low, high],
> --
> 1.7.10.4
Re: [Mesa-dev] [PATCH 3/5] mesa: Import a copy of the open-addressing hash table code I wrote.
I look forward to seeing this improved hash table land. Great that it
includes tests. I failed to find a test that artificially constructed
collisions and verified that they were handled correctly. Maybe I'll submit
a test for that.

Comments below. Everything looked correct, so my review mostly asks for an
occasional clarifying comment.

On 10/25/2012 09:13 AM, Eric Anholt wrote:
> Mesa's chaining hash table for object names is slow, and this should be much
> faster. I namespaced the functions under _mesa_*, to avoid visibility
> troubles that we may have had before with hash_table_* functions. The
> hash_table.c file unfortunately lives in program/ still to avoid confusion
> with automake's dependency files that would otherwise require a make
> distclean across this change.

The end result is absurd. There is a .c/.h pair in a directory that don't
match up. It looks like this:

    program/hash_table.c implements main/hash_table.h
    program/hash_table.h is header for program/chaining_hash_table.c

> diff --git a/src/mesa/program/hash_table.c b/src/mesa/program/hash_table.c
> new file mode 100644
> index 0000000..ba49437
> --- /dev/null
> +++ b/src/mesa/program/hash_table.c
> @@ -0,0 +1,431 @@
> +/*
> + * Copyright © 2009,2012 Intel Corporation
> +
> +/**
> + * Implements an open-addressing, linear-reprobing hash table.
> + *
> + * For more information, see:
> + *
> + * http://cgit.freedesktop.org/~anholt/hash_table/tree/README
> + */
> +
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include "main/hash_table.h"
> +#include "ralloc.h"
> +
> +#define ARRAY_SIZE(array) (sizeof(array) / sizeof(array[0]))
> +

Static please.

> +uint32_t deleted_key_value;

> +/**
> + * Finds a hash table entry with the given key and hash of that key.
> + *
> + * Returns NULL if no entry is found. Note that the data pointer may be
> + * modified by the user.
> + */
> +struct hash_entry *
> +_mesa_hash_table_search(struct hash_table *ht, uint32_t hash,
> +                        const void *key)
> +{
> +   uint32_t hash_address;
> +
> +   hash_address = hash % ht->size;
> +   do {
> +      uint32_t double_hash;
> +
> +      struct hash_entry *entry = ht->table + hash_address;
> +
> +      if (entry_is_free(entry)) {
> +         return NULL;
> +      } else if (entry_is_present(ht, entry) && entry->hash == hash) {
> +         if (ht->key_equals_function(key, entry->key)) {
> +            return entry;
> +         }
> +      }
> +
> +      double_hash = 1 + hash % ht->rehash;
> +
> +      hash_address = (hash_address + double_hash) % ht->size;

The while condition looks mystic. A comment here would be nice explaining
that we break the loop because we've cycled around to the first probed
address. Or simply a self-documenting variable name would suffice.

> +   } while (hash_address != hash % ht->size);
> +
> +   return NULL;
> +}

> +/**
> + * Inserts the key with the given hash into the table.
> + *
> + * Note that insertion may rearrange the table on a resize or rehash,
> + * so previously found hash_entries are no longer valid after this function.
> + */
> +struct hash_entry *
> +_mesa_hash_table_insert(struct hash_table *ht, uint32_t hash,
> +                        const void *key, void *data)
> +{
> +   uint32_t hash_address;
> +
> +   if (ht->entries >= ht->max_entries) {
> +      _mesa_hash_table_rehash(ht, ht->size_index + 1);
> +   } else if (ht->deleted_entries + ht->entries >= ht->max_entries) {
> +      _mesa_hash_table_rehash(ht, ht->size_index);
> +   }
> +
> +   hash_address = hash % ht->size;
> +   do {
> +      struct hash_entry *entry = ht->table + hash_address;
> +      uint32_t double_hash;
> +
> +      if (!entry_is_present(ht, entry)) {
> +         if (entry_is_deleted(ht, entry))
> +            ht->deleted_entries--;
> +         entry->hash = hash;
> +         entry->key = key;
> +         entry->data = data;
> +         ht->entries++;
> +         return entry;
> +      }
> +
> +      /* Implement replacement when another insert happens
> +       * with a matching key. This is a relatively common
> +       * feature of hash tables, with the alternative
> +       * generally being "insert the new value as well, and
> +       * return it first when the key is searched for".
> +       *
> +       * Note that the hash table doesn't have a delete
> +       * callback. If freeing of old data pointers is
> +       * required to avoid memory leaks, perform a search
> +       * before inserting.
> +       */
> +      if (entry->hash == hash &&
> +          ht->key_equals_function(key, entry->key)) {
> +         entry->key = key;
> +         entry->data = data;
> +         return entry;
> +      }
> +
> +
> +      double_hash = 1 + hash % ht->rehash;
> +
> +      hash_address = (hash_address + double_hash) % ht->size;

Ditto as for _mesa_hash_table_search().

> +   } while (hash_address != hash % ht->size);
> +
> +   /* We could hit here if a required resize failed. An unchecked-malloc
> +    * application could ignore this result.
> +    */
> +   return NULL;
> +}

Please document here that 'predicate' is ignored if null.

> +struct hash_entry *
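The probe loop discussed in the review above, including the collision case the reviewer wanted a test for, can be sketched on a toy table. This is an illustration under stated assumptions, not the code from the patch: the table size (7) and rehash modulus (5) are arbitrary small primes, the hash value doubles as the key, and deletion markers and resizing are left out. All names (sketch_slot, sketch_probe, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SIZE   7u  /* prime table size (assumed for the sketch) */
#define SKETCH_REHASH 5u  /* smaller prime used for the probe step     */

struct sketch_slot {
   bool used;
   uint32_t hash;  /* the hash doubles as the key in this sketch */
   int value;
};

/* Walk the double-hashing probe sequence for `hash`. Returns the slot
 * already holding `hash`, or the first free slot, or NULL once the
 * sequence has cycled back to the first probed address. */
static struct sketch_slot *
sketch_probe(struct sketch_slot *table, uint32_t hash)
{
   const uint32_t start_address = hash % SKETCH_SIZE;
   const uint32_t step = 1 + hash % SKETCH_REHASH;
   uint32_t address = start_address;

   do {
      struct sketch_slot *slot = &table[address];

      if (!slot->used || slot->hash == hash)
         return slot;

      address = (address + step) % SKETCH_SIZE;
   } while (address != start_address);  /* every slot has been visited */

   return NULL;
}

static bool
sketch_insert(struct sketch_slot *table, uint32_t hash, int value)
{
   struct sketch_slot *slot = sketch_probe(table, hash);

   if (!slot)
      return false;  /* table is full */

   slot->used = true;
   slot->hash = hash;
   slot->value = value;
   return true;
}

static struct sketch_slot *
sketch_search(struct sketch_slot *table, uint32_t hash)
{
   struct sketch_slot *slot = sketch_probe(table, hash);

   return (slot && slot->used) ? slot : NULL;
}
```

Because the table size is prime and the step is never zero, the sequence visits every slot exactly once before returning to start_address, which is the termination condition the reviewer asked to have documented. Hashes 3 and 10 both start at address 3, so inserting both exercises the collision path.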
Re: [Mesa-dev] GL 3.1 on Radeon HD 4670?
DOH. I'm sorry, I read that Mesa supported GL 3.1 and somehow I generalized
that to all drivers. Thanks for that TODO list. I guess I need to start
reading about the R700 architecture...

Patrick

On Wed, Oct 31, 2012 at 1:28 PM, Alex Deucher alexdeuc...@gmail.com wrote:
> On Wed, Oct 31, 2012 at 1:11 PM, Patrick Baggett baggett.patr...@gmail.com wrote:
>> Hi all,
>>
>> I've got a really weird duck of a system: an Itanium2 system running
>> Linux 3.7.0-rc3 with the newest libdrm and mesa git from yesterday. I
>> configured it with --enable-texture-float and the radeon DRI driver.
>> When I use glxinfo, I see that it is Mesa 9.1-devel but only OpenGL 3.0.
>> Is that because my version of glxinfo doesn't create the appropriate
>> context? Is there an updated version of glxinfo that does? Or a flag
>> that I should pass to only consider core contexts?
>
> The open source r600g driver only supports GL 3.0 at the moment. See this
> document to see what's still missing:
> http://cgit.freedesktop.org/mesa/mesa/tree/docs/GL3.txt
>
> Alex
Re: [Mesa-dev] [PATCH 1/2] mesa/vbo: Fix scaling issue in 10-bit signed normalized packing.
On 10/15/2012 09:45 AM, Ian Romanick wrote:
> On 10/14/2012 12:02 PM, Kenneth Graunke wrote:
>> For the 10-bit components, the divisor was incorrect. A 10-bit signed
>> integer can represent -2^9 through 2^9 - 1, which leads to the following
>> ranges:
>>
>>     (float)value.x                -> [ -512,  511]
>>     2.0F * (float)value.x         -> [-1024, 1022]
>>     2.0F * (float)value.x + 1.0F  -> [-1023, 1023]
>>
>> So dividing by 511 would incorrectly scale it to approximately:
>> [-2.001956947, 2.001956947]. To correctly scale to [-1.0, 1.0], we need
>> to divide by 1023.
>
> This is a very annoying part of the desktop GL specification: there are
> two different ways to convert 10-bit and 2-bit normalized integers to
> float. GLES3 fixes this, I believe, by having only one conversion. Can you
> double-check these changes against the GLES3 rules?

Sigh. You're right...this code implements equation 2.2

    f = (2c + 1) / (2^b - 1)                  (2.2)

which is mandatory and correct for desktop OpenGL 4.1 and earlier. Starting
with GL 4.2 and GLES 3, the spec changed to instead require equation 2.3:

    f = max{c / (2^(b-1) - 1), -1.0}          (2.3)

It also explicitly marks this as a change in behavior.

So we'll need to conditionalize it based on the current context API and
version, I guess. Ugh.
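The two conversion rules quoted in this thread can be compared side by side with a small sketch. It is illustrative only, not the Mesa implementation; the function names snorm_to_float_eq22 and snorm_to_float_eq23 are hypothetical, c is the signed integer component and bits is its width b (10 or 2 for the packed formats discussed here).

```c
#include <assert.h>

/* Equation 2.2 as quoted above: f = (2c + 1) / (2^b - 1).
 * Maps both ends of the range exactly to -1.0 and 1.0, but never
 * produces exactly 0.0. */
static float snorm_to_float_eq22(int c, unsigned bits)
{
   assert(bits >= 2 && bits < 32);
   return (2.0f * (float)c + 1.0f) / (float)((1u << bits) - 1u);
}

/* Equation 2.3 as quoted above: f = max(c / (2^(b-1) - 1), -1.0).
 * Zero maps to exactly 0.0; the most negative value is clamped to -1.0. */
static float snorm_to_float_eq23(int c, unsigned bits)
{
   float f;

   assert(bits >= 2 && bits < 32);
   f = (float)c / (float)((1u << (bits - 1)) - 1u);
   return f < -1.0f ? -1.0f : f;
}
```

For the 10-bit case, equation 2.2 divides 2c + 1 (in [-1023, 1023]) by 1023, matching the corrected divisor in the patch, while equation 2.3 divides c by 511 and clamps -512 to -1.0.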
[Mesa-dev] [PATCH] configure.ac: Prevent build of radeon llvm backend with llvm 3.2
---
 configure.ac | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/configure.ac b/configure.ac
index 6b97a26..b916b38 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1748,15 +1748,19 @@ gallium_require_drm_loader() {
 }

 radeon_llvm_check() {
-    LLVM_VERSION_MAJOR=`echo $LLVM_VERSION | cut -d. -f1`
-    if test $LLVM_VERSION_MAJOR -lt 3 -o x$LLVM_VERSION = x3.0; then
-        AC_MSG_ERROR([LLVM 3.1 or newer is required for the r600/radeonsi llvm compiler.])
+    LLVM_REQUIRED_VERSION_MAJOR=3
+    LLVM_REQUIRED_VERSION_MINOR=2
+    LLVM_AVAILABLE_VERSION_MAJOR=`echo $LLVM_VERSION | cut -d. -f1`
+    LLVM_AVAILABLE_VERSION_MINOR=`echo $LLVM_VERSION | cut -d. -f2`
+    if test $LLVM_AVAILABLE_VERSION_MAJOR -lt $LLVM_REQUIRED_VERSION_MAJOR -o $LLVM_AVAILABLE_VERSION_MINOR -lt $LLVM_REQUIRED_VERSION_MINOR; then
+        AC_MSG_ERROR([LLVM $LLVM_REQUIRED_VERSION_MAJOR.$LLVM_REQUIRED_VERSION_MINOR or newer is required for the r600/radeonsi llvm compiler.])
     fi
-    if test $LLVM_VERSION_MAJOR -ge 3 -a x$LLVM_VERSION != x3.1 && $LLVM_CONFIG --targets-built | grep -qv '\<AMDGPU\>' ; then
-        AC_MSG_ERROR([To use the r600/radeonsi LLVM backend with LLVM 3.2 and newer, you need to fetch the LLVM source from:
+    if test true && $LLVM_CONFIG --targets-built | grep -qv '\<AMDGPU\>' ; then
+        AC_MSG_ERROR([To use the r600/radeonsi LLVM backend, you need to fetch the LLVM source from:
         git://people.freedesktop.org/~tstellar/llvm master
         and build with --enable-experimental-targets=AMDGPU])
     fi
+    AC_MSG_WARN([Please ensure you use the latest llvm tree from git://people.freedesktop.org/~tstellar/llvm master before submitting a bug])
     if test x$LLVM_VERSION = x3.2; then
         LLVM_LIBS=$LLVM_LIBS `$LLVM_CONFIG --libs amdgpu`
     fi
-- 
1.7.11.7
[Mesa-dev] [PATCH 2/3] i965: Set dirty state for brw_draw_upload.c when num_instances changes.
Otherwise, if we had a set of prims passed in with a num_instances varying
between them, we wouldn't upload enough (or too much!) from user vertex
arrays.
---
 src/mesa/drivers/dri/i965/brw_draw.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 1cfba29..22d18f9 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -474,7 +474,10 @@ static bool brw_try_draw_prims( struct gl_context *ctx,
       intel_batchbuffer_require_space(intel, estimated_max_prim_size, false);
       intel_batchbuffer_save_state(intel);

-      brw->num_instances = prim->num_instances;
+      if (brw->num_instances != prim->num_instances) {
+         brw->num_instances = prim->num_instances;
+         brw->state.dirty.brw |= BRW_NEW_VERTICES;
+      }
       if (intel->gen < 6)
          brw_set_prim(brw, &prim[i]);
       else
-- 
1.7.10.4
[Mesa-dev] Fix draw_elements_base_vertex with user vertex arrays
Note that this patch series sits on top of Ken's stride == 0 changes,
otherwise the piglit tests trip the assert.
[Mesa-dev] [PATCH 1/3] i965: Remove the vbo_rebase_prims() path.
The brw_draw_upload.c start_vertex_bias code has support for doing the rebase
without rewriting the index buffer by applying a basevertex. It looks like
vbo_rebase_prims() is not equipped to handle basevertex.
---
 src/mesa/drivers/dri/i965/brw_draw.c |   21 ++++++---------------
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 323310a..1cfba29 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -552,21 +552,12 @@ void brw_draw_prims( struct gl_context *ctx,
       return;
    }

-   if (!vbo_all_varyings_in_vbos(arrays)) {
-      if (!index_bounds_valid)
-         vbo_get_minmax_indices(ctx, prim, ib, &min_index, &max_index, nr_prims);
-
-      /* Decide if we want to rebase. If so we end up recursing once
-       * only into this function.
-       */
-      if (min_index != 0 && !vbo_any_varyings_in_vbos(arrays)) {
-         vbo_rebase_prims(ctx, arrays,
-                          prim, nr_prims,
-                          ib, min_index, max_index,
-                          brw_draw_prims );
-         return;
-      }
-   }
+   /* If we're going to have to upload any of the user's vertex arrays, then
+    * get the minimum and maximum of their index buffer so we know what range
+    * to upload.
+    */
+   if (!vbo_all_varyings_in_vbos(arrays) && !index_bounds_valid)
+      vbo_get_minmax_indices(ctx, prim, ib, &min_index, &max_index, nr_prims);

    /* Do GL_SELECT and GL_FEEDBACK rendering using swrast, even though it
     * won't support all the extensions we support.
-- 
1.7.10.4
[Mesa-dev] [PATCH 3/3] i965: Fix uploading user vertex arrays with basevertex set.
If the index buffer is full of values like 0 1 2 3, but basevertex is 4, we
need to upload at least vertex data for elements 4 5 6 7. Whether we also
upload 0 1 2 3 is a question of whether there are VBOs present or not -- see
the code setting start_vertex_bias in brw_draw_upload.c.

Fixes piglit draw-elements*base-vertex user_varrays
---
 src/mesa/drivers/dri/i965/brw_context.h     |    1 +
 src/mesa/drivers/dri/i965/brw_draw.c        |    4
 src/mesa/drivers/dri/i965/brw_draw_upload.c |    4 ++--
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 19c6af7..195fd3d 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1088,6 +1088,7 @@ struct brw_context
    } prim_restart;
    uint32_t num_instances;
+   int basevertex;
 };
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 22d18f9..97a1077 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -478,6 +478,10 @@ static bool brw_try_draw_prims( struct gl_context *ctx,
          brw->num_instances = prim->num_instances;
          brw->state.dirty.brw |= BRW_NEW_VERTICES;
       }
+      if (brw->basevertex != prim->basevertex) {
+         brw->basevertex = prim->basevertex;
+         brw->state.dirty.brw |= BRW_NEW_VERTICES;
+      }
       if (intel->gen < 6)
          brw_set_prim(brw, &prim[i]);
       else
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index ad7fe7c..51531ce 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -357,8 +357,8 @@ static void brw_prepare_vertices(struct brw_context *brw)
    GLbitfield64 vs_inputs = brw->vs.prog_data->inputs_read;
    const unsigned char *ptr = NULL;
    GLuint interleaved = 0;
-   unsigned int min_index = brw->vb.min_index;
-   unsigned int max_index = brw->vb.max_index;
+   unsigned int min_index = brw->vb.min_index + brw->basevertex;
+   unsigned int max_index = brw->vb.max_index + brw->basevertex;
    int delta, i, j;
    struct brw_vertex_element *upload[VERT_ATTRIB_MAX];
-- 
1.7.10.4
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
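The range arithmetic described in the commit message can be sketched in isolation. This is only an illustration, not driver code: `struct vertex_range` and `upload_range()` are hypothetical names, and the sketch ignores the VBO case that start_vertex_bias handles.

```c
#include <assert.h>

/* Hypothetical stand-in for the vertex range a draw call touches. */
struct vertex_range {
   unsigned int first; /* first user-array vertex that must be uploaded */
   unsigned int count; /* number of vertices to upload */
};

/* basevertex is added to every index fetched from the index buffer, so
 * the touched range is [min_index + basevertex, max_index + basevertex]. */
static struct vertex_range
upload_range(unsigned int min_index, unsigned int max_index, int basevertex)
{
   struct vertex_range r;

   assert(min_index <= max_index);
   /* Assumption of this sketch: the biased range never goes negative. */
   assert((long long)min_index + basevertex >= 0);

   r.first = (unsigned int)((long long)min_index + basevertex);
   r.count = max_index - min_index + 1;
   return r;
}
```

With indices 0..3 and a basevertex of 4, `upload_range(0, 3, 4)` yields the range starting at vertex 4 with 4 vertices, matching the "elements 4 5 6 7" case in the description.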
Re: [Mesa-dev] R600 tiling halves the frame rate
On 10/31/2012 07:41 AM, Jerome Glisse wrote:
> For it to look right we need mesa to call into the kernel to tell the
> kernel what is the bo tiling format. We should do that for scanout
> buffer. This will fix your issue and you probably want 2d tiled not 1d
> for scanout.

Anyway, since I do need this, I will figure out the details and post a
patch here when I have something presentable.

regards,
Tzvetan
Re: [Mesa-dev] [PATCH 1/9] dispatch: Update check_table.cpp to reflect recent aliasing changes.
On 10/30/2012 10:42 AM, Paul Berry wrote:
> In commits bad96f6 and e7dd2e5 I added the following aliases:
>
> - ClampColor - ClampColorARB
> - VertexAttribDivisor - VertexAttribDivisorARB
>
> But I neglected to update check_table.cpp, causing make check to
> fail for non-shared-glapi builds.
>
> This patch removes the functions that are now aliased from
> check_table.cpp, so that make check works correctly again.
> ---
>  src/mapi/glapi/tests/check_table.cpp | 2 --
>  1 file changed, 2 deletions(-)

Reviewed-by: Chad Versace chad.vers...@linux.intel.com
Re: [Mesa-dev] [PATCH 2/9] dispatch: Include glheader.h in dispatch-related files.
On 10/30/2012 10:42 AM, Paul Berry wrote:
> This ensures that GLES1-only typedefs are available in these files.
> In a future patch, this will allow us to expand the dispatch table to
> include GLES1-only functions.
> ---
>  src/glx/tests/indirect_api.cpp              | 2 +-
>  src/mapi/glapi/gen/gl_gentable.py           | 2 +-
>  src/mapi/glapi/tests/check_table.cpp        | 2 +-
>  src/mapi/shared-glapi/tests/check_table.cpp | 2 +-
>  4 files changed, 4 insertions(+), 4 deletions(-)

This change is simple enough.

Reviewed-by: Chad Versace chad.vers...@linux.intel.com
Re: [Mesa-dev] [PATCH 3/9] dispatch: properly handle parameter name mismatches in glapitemp.h.
On 10/30/2012 10:42 AM, Paul Berry wrote:
> Previously, when code-generating aliased functions in glapitemp.h, we
> weren't consistent about which function alias we used to obtain the
> parameter names, with the risk that we would generate incorrect code
> like this:
>
>     KEYWORD1 void KEYWORD2 NAME(Foo)(GLint x)
>     {
>         (void) x;
>         DISPATCH(Foo, (x), (F, "glFoo(%d);\n", x));
>     }
>
>     KEYWORD1 void KEYWORD2 NAME(FooEXT)(GLint y)
>     {
>         (void) x;
>         DISPATCH(Foo, (x), (F, "glFooEXT(%d);\n", x));
>     }
>
> At the moment there are no aliased functions with mismatched parameter
> names, so this isn't a problem. But when we introduce GLES1 functions
> into the dispatch table, there will be
> (MapBufferRange/MapBufferRangeEXT). This patch paves the way for that
> by fixing the code generation script to handle the mismatch correctly.
> ---
>  src/mapi/glapi/gen/gl_XML.py     | 7 +--
>  src/mapi/glapi/gen/gl_apitemp.py | 2 +-
>  2 files changed, 6 insertions(+), 3 deletions(-)

Reviewed-by: Chad Versace chad.vers...@linux.intel.com
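For comparison, here is a rough sketch of what correctly generated code for the two aliases might look like. The `NAME` and `DISPATCH` macros below are drastically simplified, hypothetical stand-ins for the real glapitemp.h machinery, and `dispatch_Foo`, `last_dispatched_arg`, and `demo_aliases` exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical dispatch slot that records the argument it received. */
static int last_dispatched_arg;
static void dispatch_Foo(int x) { last_dispatched_arg = x; }

/* Simplified stand-ins for the glapitemp.h macros. */
#define NAME(func) gl##func
#define DISPATCH(func, args) dispatch_##func args

/* Canonical entry point: the parameter is named x. */
static void NAME(Foo)(int x)
{
   DISPATCH(Foo, (x));
}

/* The alias declares its parameter as y, so its body must refer to y;
 * referring to x here (as in the broken output) would not compile. */
static void NAME(FooEXT)(int y)
{
   DISPATCH(Foo, (y));
}

/* Both entry points reach the same dispatch slot. */
static void demo_aliases(void)
{
   glFoo(3);
   assert(last_dispatched_arg == 3);
   glFooEXT(4);
   assert(last_dispatched_arg == 4);
}
```

The point is only that each generated function body has to use the parameter names from the same alias that supplied its prototype.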
Re: [Mesa-dev] [PATCH 1/2] mesa/vbo: Fix scaling issue in 10-bit signed normalized packing.
On Wed, Oct 31, 2012 at 7:45 PM, Kenneth Graunke kenn...@whitecape.org wrote:
> On 10/15/2012 09:45 AM, Ian Romanick wrote:
>> On 10/14/2012 12:02 PM, Kenneth Graunke wrote:
>>> For the 10-bit components, the divisor was incorrect. A 10-bit signed
>>> integer can represent -2^9 through 2^9 - 1, which leads to the
>>> following ranges:
>>>
>>>     (float)value.x               -> [ -512,  511]
>>>     2.0F * (float)value.x        -> [-1024, 1022]
>>>     2.0F * (float)value.x + 1.0F -> [-1023, 1023]
>>>
>>> So dividing by 511 would incorrectly scale it to approximately:
>>> [-2.001956947, 2.001956947]. To correctly scale to [-1.0, 1.0], we
>>> need to divide by 1023.
>>
>> This is a very annoying part of the desktop GL specification: there are
>> two different ways to convert 10-bit and 2-bit normalized integers to
>> float. GLES3 fixes this, I believe, by having only one conversion. Can
>> you double-check these changes against the GLES3 rules?
>
> Sigh. You're right...this code implements equation 2.2
>
>     f = (2c + 1)/(2^b - 1).    (2.2)
>
> which is mandatory and correct for desktop OpenGL 4.1 and earlier.
> Starting with GL 4.2 and GLES 3, the spec changed to instead require
> equation 2.3:
>
>     f = max{c/(2^(b-1) - 1), -1.0}    (2.3)
>
> It also explicitly marks this as a change in behavior.
>
> So we'll need to conditionalize it based on the current context API and
> version, I guess. Ugh.

Interesting. FYI, the piglit tests (draw-vertices-2101010 and attribs)
currently expect that equation 2.2 is used.

Marek
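The two conversions quoted in this thread can be written out side by side. This is a minimal sketch rather than the Mesa implementation: the function names are made up for illustration, and `c` is assumed to already be a sign-extended b-bit integer.

```c
#include <assert.h>

/* Equation 2.2 (desktop GL 4.1 and earlier): f = (2c + 1) / (2^b - 1). */
static float
snorm_to_float_legacy(int c, unsigned int bits)
{
   assert(bits >= 2 && bits <= 16);
   return (2.0f * (float)c + 1.0f) / (float)((1u << bits) - 1u);
}

/* Equation 2.3 (GL 4.2+ and GLES 3): f = max(c / (2^(b-1) - 1), -1.0). */
static float
snorm_to_float_modern(int c, unsigned int bits)
{
   float f;

   assert(bits >= 2 && bits <= 16);
   f = (float)c / (float)((1u << (bits - 1u)) - 1u);
   return f < -1.0f ? -1.0f : f;
}
```

For b = 10, both map -512 to -1.0 and 511 to 1.0, but they differ in between: equation 2.2 cannot represent 0.0 exactly (c = 0 gives 1/1023), while equation 2.3 maps c = 0 to 0.0 and clamps the single out-of-range value -512.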
Re: [Mesa-dev] GL 3.1 on Radeon HD 4670?
The missing features for GL 3.1 are just UBOs and TBOs. Dave Airlie has
been working on TBOs. Nobody is working on UBOs as far as I know. Both
features need changes in the driver and in the common code (st/mesa).

Marek

On Wed, Oct 31, 2012 at 7:34 PM, Patrick Baggett baggett.patr...@gmail.com wrote:
> DOH. I'm sorry, I read that Mesa supported GL 3.1 and somehow I
> generalized that to all drivers. Thanks for that TODO list. I guess I
> need to start reading about the R700 architecture...
>
> Patrick
>
> On Wed, Oct 31, 2012 at 1:28 PM, Alex Deucher alexdeuc...@gmail.com wrote:
>> On Wed, Oct 31, 2012 at 1:11 PM, Patrick Baggett baggett.patr...@gmail.com wrote:
>>> Hi all,
>>>
>>> I've got a really weird duck of system: an Itanium2 system running
>>> Linux 3.7.0-rc3 with the newest libdrm and mesa git from yesterday. I
>>> configured it with --enable-texture-float and the radeon DRI driver.
>>> When I use glxinfo, I see that it is Mesa 9.1-devel but only OpenGL
>>> 3.0. Is that because my version glxinfo doesn't create the appropriate
>>> context? Is there an updated version of glxinfo that does? Or a flag
>>> that I should pass to only consider core contexts?
>>
>> The open source r600g driver only supports GL 3.0 at the moment. See
>> this document to see what's still missing:
>> http://cgit.freedesktop.org/mesa/mesa/tree/docs/GL3.txt
>>
>> Alex
[Mesa-dev] [Bug 55998] Pretty huge slowdown in mesa 9.0
https://bugs.freedesktop.org/show_bug.cgi?id=55998

--- Comment #27 from dw...@tormail.org ---
Okay, Arch Linux has now put out kdebase-workspace 4.9.2-6, which I guess
has this patch applied. My system is now fully up to date and there are no
more artifacts or slowdown. However, the tearing line is still there.

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 4/9] dispatch: Include GLES1-only functions in dispatch table.
Tapani, see the bottom of the message.

On 10/30/2012 10:42 AM, Paul Berry wrote:
> Previously dispatch table-related code was generated from gl_API.xml,
> so it did not include slots for GLES1-only functions (such as those
> taking fixed-point arguments). This patch generates dispatch
> table-related code from gl_and_es_API.xml, so that GLES1-only functions
> are included. This paves the way for future patches that will unify the
> GLES1 dispatch table with the dispatch tables for the other APIs.
>
> The following generated files are affected:
> - glapi_x86.S
> - glapi_x86-64.S
> - glapi_sparc.S
> - glprocs.h
> - glapitemp.h
> - glapitable.h
> - glapi_gentable.c
> - dispatch.h
> - remap_helper.h
>
> Since this change affects makefiles, a full rebuild is required.
> ---
>  src/mapi/glapi/SConscript      |  6 +++---
>  src/mapi/glapi/gen/Makefile.am | 18 +-
>  src/mapi/glapi/gen/SConscript  | 10 +-
>  3 files changed, 17 insertions(+), 17 deletions(-)

> diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
> index 40aaf51..24bdbaf 100644
> --- a/src/mapi/glapi/gen/Makefile.am
> +++ b/src/mapi/glapi/gen/Makefile.am
> @@ -187,27 +187,27 @@ $(MESA_GLAPI_DIR)/glapi_mapi_tmp.h: $(MESA_MAPI_DIR)/mapi_abi.py $(COMMON_ES)
>  		--printer glapi --mode lib $(srcdir)/gl_and_es_API.xml > $@

I see a dependency bug. The rules below depend on list COMMON, which does
not contain gl_and_es_API.xml. I think the best way to fix this is to
remove the list COMMON_ES and fold its contents into COMMON.

>  $(MESA_GLAPI_DIR)/glprocs.h: gl_procs.py $(COMMON)
> -	$(PYTHON_GEN) $< -f $(srcdir)/gl_API.xml > $@
> +	$(PYTHON_GEN) $< -f $(srcdir)/gl_and_es_API.xml > $@
>  $(MESA_GLAPI_DIR)/glapitemp.h: gl_apitemp.py $(COMMON)
> -	$(PYTHON_GEN) $< -f $(srcdir)/gl_API.xml > $@
> +	$(PYTHON_GEN) $< -f $(srcdir)/gl_and_es_API.xml > $@
>  $(MESA_GLAPI_DIR)/glapitable.h: gl_table.py $(COMMON)
> -	$(PYTHON_GEN) $< -f $(srcdir)/gl_API.xml > $@
> +	$(PYTHON_GEN) $< -f $(srcdir)/gl_and_es_API.xml > $@
>  $(MESA_GLAPI_DIR)/glapi_gentable.c: gl_gentable.py $(COMMON)
> -	$(PYTHON_GEN) $< -f $(srcdir)/gl_API.xml > $@
> +	$(PYTHON_GEN) $< -f $(srcdir)/gl_and_es_API.xml > $@

This patch doesn't touch the Android makefiles, and I think that's ok.
Still, I want to verify with Tapani that this won't break the Android
build before committing.
Re: [Mesa-dev] [PATCH 6/9] dispatch: Make a header to go along with querymatrix.c.
On 10/30/2012 10:42 AM, Paul Berry wrote:
> This patch creates a header querymatrix.h, to allow functions defined
> in querymatrix.c to be used from other .c files. It also switches from
> the nonstandard GL_APIENTRY to GLAPIENTRY.
> ---
>  src/mesa/main/querymatrix.c | 12 +---
>  src/mesa/main/querymatrix.h | 39 +++
>  2 files changed, 44 insertions(+), 7 deletions(-)
>  create mode 100644 src/mesa/main/querymatrix.h
>
> diff --git a/src/mesa/main/querymatrix.c b/src/mesa/main/querymatrix.c
> index 2843d55..27842ae 100644
> +extern void GLAPIENTRY _mesa_GetIntegerv(GLenum pname, GLint *params);
> +extern void GLAPIENTRY _mesa_GetFloatv(GLenum pname, GLfloat *params);

I think the local declarations of the Get functions should be removed.
Instead, just #include "main/get.h".
[Mesa-dev] [Bug 55998] Pretty huge slowdown in mesa 9.0
https://bugs.freedesktop.org/show_bug.cgi?id=55998

--- Comment #28 from ValdikSS i...@valdikss.org.ru ---
For me, this fixes the slowdown I described in the first message. There
are 2 other bugs: screen corruption with gles and incorrect screen
repainting with gl. I don't know whether they are Mesa or KDE bugs.
Re: [Mesa-dev] [PATCH 4/9] dispatch: Include GLES1-only functions in dispatch table.
On 31 October 2012 15:21, Chad Versace chad.vers...@linux.intel.com wrote:
> Tapani, see the bottom of the message.
>
> On 10/30/2012 10:42 AM, Paul Berry wrote:
>> Previously dispatch table-related code was generated from gl_API.xml,
>> so it did not include slots for GLES1-only functions (such as those
>> taking fixed-point arguments). This patch generates dispatch
>> table-related code from gl_and_es_API.xml, so that GLES1-only
>> functions are included. This paves the way for future patches that
>> will unify the GLES1 dispatch table with the dispatch tables for the
>> other APIs.
>>
>> The following generated files are affected:
>> - glapi_x86.S
>> - glapi_x86-64.S
>> - glapi_sparc.S
>> - glprocs.h
>> - glapitemp.h
>> - glapitable.h
>> - glapi_gentable.c
>> - dispatch.h
>> - remap_helper.h
>>
>> Since this change affects makefiles, a full rebuild is required.
>> ---
>>  src/mapi/glapi/SConscript      |  6 +++---
>>  src/mapi/glapi/gen/Makefile.am | 18 +-
>>  src/mapi/glapi/gen/SConscript  | 10 +-
>>  3 files changed, 17 insertions(+), 17 deletions(-)
>
>> diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
>> index 40aaf51..24bdbaf 100644
>> --- a/src/mapi/glapi/gen/Makefile.am
>> +++ b/src/mapi/glapi/gen/Makefile.am
>> @@ -187,27 +187,27 @@ $(MESA_GLAPI_DIR)/glapi_mapi_tmp.h: $(MESA_MAPI_DIR)/mapi_abi.py $(COMMON_ES)
>>  		--printer glapi --mode lib $(srcdir)/gl_and_es_API.xml > $@
>
> I see a dependency bug. The rules below depend on list COMMON, which
> does not contain gl_and_es_API.xml. I think the best way to fix this is
> to remove the list COMMON_ES and fold its contents into COMMON.

Oops, good catch. I'll make that fix and send out a v2 of this patch.

>>  $(MESA_GLAPI_DIR)/glprocs.h: gl_procs.py $(COMMON)
>> -	$(PYTHON_GEN) $< -f $(srcdir)/gl_API.xml > $@
>> +	$(PYTHON_GEN) $< -f $(srcdir)/gl_and_es_API.xml > $@
>>  $(MESA_GLAPI_DIR)/glapitemp.h: gl_apitemp.py $(COMMON)
>> -	$(PYTHON_GEN) $< -f $(srcdir)/gl_API.xml > $@
>> +	$(PYTHON_GEN) $< -f $(srcdir)/gl_and_es_API.xml > $@
>>  $(MESA_GLAPI_DIR)/glapitable.h: gl_table.py $(COMMON)
>> -	$(PYTHON_GEN) $< -f $(srcdir)/gl_API.xml > $@
>> +	$(PYTHON_GEN) $< -f $(srcdir)/gl_and_es_API.xml > $@
>>  $(MESA_GLAPI_DIR)/glapi_gentable.c: gl_gentable.py $(COMMON)
>> -	$(PYTHON_GEN) $< -f $(srcdir)/gl_API.xml > $@
>> +	$(PYTHON_GEN) $< -f $(srcdir)/gl_and_es_API.xml > $@
>
> This patch doesn't touch the Android makefiles, and I think that's ok.
> Still, I want to verify with Tapani that this won't break the Android
> build before committing.

Yeah, I don't really understand how the Android build process handles
these generated files--they don't seem to appear in any of the Android.mk
files.

Tapani, I'm hoping I can commit these patches tomorrow morning (pacific
time) so I can send out another patch series that depends on them. Do you
think you will have time to try the series out before then?
Re: [Mesa-dev] GL 3.1 on Radeon HD 4670?
On Thu, Nov 1, 2012 at 8:11 AM, Marek Olšák mar...@gmail.com wrote:
> The missing features for GL 3.1 are just UBOs and TBOs. Dave Airlie has
> been working on TBOs. Nobody is working on UBOs as far as I know. Both
> features need changes in the driver and in the common code (st/mesa).

I should probably push the state tracker TBO bits; they were fairly
trivial, and I don't think the UBO bits are insanely hard either. We also
need to check whether there is any reason we can't advertise GLSL 1.40.

Dave.
Re: [Mesa-dev] [PATCH 2/3] i965: Set dirty state for brw_draw_upload.c when num_instances changes.
On 10/31/2012 02:26 PM, Eric Anholt wrote:
> Otherwise, if we had a set of prims passed in with a num_instances
> varying between them, we wouldn't upload enough (or too much!) from
> user vertex arrays.
> ---
>  src/mesa/drivers/dri/i965/brw_draw.c |    5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
> index 1cfba29..22d18f9 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -474,7 +474,10 @@ static bool brw_try_draw_prims( struct gl_context *ctx,
>        intel_batchbuffer_require_space(intel, estimated_max_prim_size, false);
>        intel_batchbuffer_save_state(intel);
> -      brw->num_instances = prim->num_instances;
> +      if (brw->num_instances != prim->num_instances) {
> +         brw->num_instances = prim->num_instances;
> +         brw->state.dirty.brw |= BRW_NEW_VERTICES;
> +      }
>        if (intel->gen < 6)
>           brw_set_prim(brw, &prim[i]);
>        else

I agree that BRW_NEW_VERTICES needs to be flagged when num_instances
changes. However, it's already unconditionally flagged above:

    brw->vb.min_index = min_index;
    brw->vb.max_index = max_index;
    brw->state.dirty.brw |= BRW_NEW_VERTICES;

So I don't think you'll be able to observe any problems. That said, I
approve of this change anyway, because it's self-documenting and guards
against potential problems.

For the series:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Re: [Mesa-dev] [PATCH 1/3] i965: Remove the vbo_rebase_prims() path.
On 10/31/2012 02:26 PM, Eric Anholt wrote:
> The brw_draw_upload.c start_vertex_bias code has support for doing the
> rebase without rewriting the index buffer by applying a basevertex. It
> looks like vbo_rebase_prims() is not equipped to handle basevertex.
> ---
>  src/mesa/drivers/dri/i965/brw_draw.c | 21 ++---
>  1 file changed, 6 insertions(+), 15 deletions(-)

No performance data?
[Mesa-dev] [PATCH] r600g: fix abysmal performance in Reaction Quake
The problem was we set VRAM|GTT for relocations of STATIC resources. Setting just VRAM increases the framerate 4 times on my machine. I rewrote the switch statement and adjusted the domains for window framebuffers too. --- src/gallium/drivers/r600/r600_buffer.c | 42 --- src/gallium/drivers/r600/r600_texture.c |3 ++- 2 files changed, 24 insertions(+), 21 deletions(-) diff --git a/src/gallium/drivers/r600/r600_buffer.c b/src/gallium/drivers/r600/r600_buffer.c index f4566ee..116ab51 100644 --- a/src/gallium/drivers/r600/r600_buffer.c +++ b/src/gallium/drivers/r600/r600_buffer.c @@ -206,29 +206,31 @@ bool r600_init_resource(struct r600_screen *rscreen, { uint32_t initial_domain, domains; - /* Staging resources particpate in transfers and blits only -* and are used for uploads and downloads from regular -* resources. We generate them internally for some transfers. -*/ - if (usage == PIPE_USAGE_STAGING) { + switch(usage) { + case PIPE_USAGE_STAGING: + /* Staging resources participate in transfers, i.e. are used +* for uploads and downloads from regular resources. +* We generate them internally for some transfers. +*/ + initial_domain = RADEON_DOMAIN_GTT; domains = RADEON_DOMAIN_GTT; + break; + case PIPE_USAGE_DYNAMIC: + case PIPE_USAGE_STREAM: + /* Default to GTT, but allow the memory manager to move it to VRAM. */ initial_domain = RADEON_DOMAIN_GTT; - } else { domains = RADEON_DOMAIN_GTT | RADEON_DOMAIN_VRAM; - - switch(usage) { - case PIPE_USAGE_DYNAMIC: - case PIPE_USAGE_STREAM: - case PIPE_USAGE_STAGING: - initial_domain = RADEON_DOMAIN_GTT; - break; - case PIPE_USAGE_DEFAULT: - case PIPE_USAGE_STATIC: - case PIPE_USAGE_IMMUTABLE: - default: - initial_domain = RADEON_DOMAIN_VRAM; - break; - } + break; + case PIPE_USAGE_DEFAULT: + case PIPE_USAGE_STATIC: + case PIPE_USAGE_IMMUTABLE: + default: + /* Don't list GTT here, because the memory manager would put some +* resources to GTT no matter what the initial domain is. 
+* Not listing GTT in the domains improves performance a lot. */ + initial_domain = RADEON_DOMAIN_VRAM; + domains = RADEON_DOMAIN_VRAM; + break; } res-buf = rscreen-ws-buffer_create(rscreen-ws, size, alignment, bind, initial_domain); diff --git a/src/gallium/drivers/r600/r600_texture.c b/src/gallium/drivers/r600/r600_texture.c index 785eeff..2df390d 100644 --- a/src/gallium/drivers/r600/r600_texture.c +++ b/src/gallium/drivers/r600/r600_texture.c @@ -421,9 +421,10 @@ r600_texture_create_object(struct pipe_screen *screen, return NULL; } } else if (buf) { + /* This is usually the window framebuffer. We want it in VRAM, always. */ resource-buf = buf; resource-cs_buf = rscreen-ws-buffer_get_cs_handle(buf); - resource-domains = RADEON_DOMAIN_GTT | RADEON_DOMAIN_VRAM; + resource-domains = RADEON_DOMAIN_VRAM; } if (rtex-cmask_size) { -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
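The placement policy that the rewritten switch statement implements can be sketched on its own. The enum and constant names below are hypothetical stand-ins for the PIPE_USAGE_* and RADEON_DOMAIN_* values, with illustrative numeric values; only the mapping from usage to domains follows the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for PIPE_USAGE_* and RADEON_DOMAIN_*. */
enum resource_usage {
   USAGE_DEFAULT, USAGE_STATIC, USAGE_IMMUTABLE,
   USAGE_DYNAMIC, USAGE_STREAM, USAGE_STAGING
};
#define DOMAIN_GTT  0x1u
#define DOMAIN_VRAM 0x2u

struct placement {
   uint32_t initial_domain; /* where the buffer is first allocated */
   uint32_t domains;        /* where relocations allow it to live */
};

static struct placement
choose_placement(enum resource_usage usage)
{
   struct placement p;

   switch (usage) {
   case USAGE_STAGING:
      /* Upload/download helpers stay in GTT. */
      p.initial_domain = DOMAIN_GTT;
      p.domains = DOMAIN_GTT;
      break;
   case USAGE_DYNAMIC:
   case USAGE_STREAM:
      /* Start in GTT, but let the memory manager move it to VRAM. */
      p.initial_domain = DOMAIN_GTT;
      p.domains = DOMAIN_GTT | DOMAIN_VRAM;
      break;
   case USAGE_DEFAULT:
   case USAGE_STATIC:
   case USAGE_IMMUTABLE:
   default:
      /* VRAM only: do not offer GTT as an alternative. */
      p.initial_domain = DOMAIN_VRAM;
      p.domains = DOMAIN_VRAM;
      break;
   }

   /* The initial domain must be one of the allowed domains. */
   assert((p.initial_domain & p.domains) != 0);
   return p;
}
```

The behavioral change is confined to the last case: before the patch, static resources were relocated with both domains allowed, whereas the sketch (like the patch) allows VRAM only.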
[Mesa-dev] [PATCH v2] dispatch: Include GLES1-only functions in dispatch table.
Previously dispatch table-related code was generated from gl_API.xml, so it did not include slots for GLES1-only functions (such as those taking fixed-point arguments). This patch generates dispatch table-related code from gl_and_es_API.xml, so that GLES1-only functions are included. This paves the way for future patches that will unify the GLES1 dispatch table with the dispatch tables for the other APIs. The following generated files are affected: - glapi_x86.S - glapi_x86-64.S - glapi_sparc.S - glprocs.h - glapitemp.h - glapitable.h - glapi_gentable.c - dispatch.h - remap_helper.h Since this change affects makefiles, a full rebuild is required. Reviewed-by: Kenneth Graunke kenn...@whitecape.org v2: Adjust dependencies to ensure that generated files will be rebuilt whenever any ES-related XML source files are changed. --- src/mapi/glapi/SConscript | 6 +++--- src/mapi/glapi/gen/Makefile.am | 31 --- src/mapi/glapi/gen/SConscript | 10 +- 3 files changed, 24 insertions(+), 23 deletions(-) diff --git a/src/mapi/glapi/SConscript b/src/mapi/glapi/SConscript index c336c25..153374c 100644 --- a/src/mapi/glapi/SConscript +++ b/src/mapi/glapi/SConscript @@ -61,7 +61,7 @@ if env['gcc'] and env['platform'] not in ('cygwin', 'darwin', 'windows'): env.CodeGenerate( target = 'glapi_x86.S', script = GLAPI + 'gen/gl_x86_asm.py', -source = GLAPI + 'gen/gl_API.xml', +source = GLAPI + 'gen/gl_and_es_API.xml', command = python_cmd + ' $SCRIPT -f $SOURCE $TARGET' ) elif env['machine'] == 'x86_64': @@ -74,7 +74,7 @@ if env['gcc'] and env['platform'] not in ('cygwin', 'darwin', 'windows'): env.CodeGenerate( target = 'glapi_x86-64.S', script = GLAPI + 'gen/gl_x86-64_asm.py', -source = GLAPI + 'gen/gl_API.xml', +source = GLAPI + 'gen/gl_and_es_API.xml', command = python_cmd + ' $SCRIPT -f $SOURCE $TARGET' ) elif env['machine'] == 'sparc': @@ -87,7 +87,7 @@ if env['gcc'] and env['platform'] not in ('cygwin', 'darwin', 'windows'): env.CodeGenerate( target = 'glapi_sparc.S', script = GLAPI + 
'gen/gl_SPARC_asm.py', -source = GLAPI + 'gen/gl_API.xml', +source = GLAPI + 'gen/gl_and_es_API.xml', command = python_cmd + ' $SCRIPT -f $SOURCE $TARGET' ) else: diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am index 40aaf51..14bb2df 100644 --- a/src/mapi/glapi/gen/Makefile.am +++ b/src/mapi/glapi/gen/Makefile.am @@ -135,10 +135,11 @@ API_XML = \ GL3x.xml -COMMON = $(API_XML) gl_XML.py glX_XML.py license.py typeexpr.py - -COMMON_ES = \ - $(COMMON) \ +COMMON = $(API_XML) \ + gl_XML.py \ + glX_XML.py \ + license.py \ + typeexpr.py \ gl_and_es_API.xml \ es_EXT.xml \ ARB_ES2_compatibility.xml \ @@ -182,43 +183,43 @@ $(XORG_GLAPI_DIR)/%.h: $(MESA_GLAPI_DIR)/%.h ## -$(MESA_GLAPI_DIR)/glapi_mapi_tmp.h: $(MESA_MAPI_DIR)/mapi_abi.py $(COMMON_ES) +$(MESA_GLAPI_DIR)/glapi_mapi_tmp.h: $(MESA_MAPI_DIR)/mapi_abi.py $(COMMON) $(PYTHON_GEN) $ \ --printer glapi --mode lib $(srcdir)/gl_and_es_API.xml $@ $(MESA_GLAPI_DIR)/glprocs.h: gl_procs.py $(COMMON) - $(PYTHON_GEN) $ -f $(srcdir)/gl_API.xml $@ + $(PYTHON_GEN) $ -f $(srcdir)/gl_and_es_API.xml $@ $(MESA_GLAPI_DIR)/glapitemp.h: gl_apitemp.py $(COMMON) - $(PYTHON_GEN) $ -f $(srcdir)/gl_API.xml $@ + $(PYTHON_GEN) $ -f $(srcdir)/gl_and_es_API.xml $@ $(MESA_GLAPI_DIR)/glapitable.h: gl_table.py $(COMMON) - $(PYTHON_GEN) $ -f $(srcdir)/gl_API.xml $@ + $(PYTHON_GEN) $ -f $(srcdir)/gl_and_es_API.xml $@ $(MESA_GLAPI_DIR)/glapi_gentable.c: gl_gentable.py $(COMMON) - $(PYTHON_GEN) $ -f $(srcdir)/gl_API.xml $@ + $(PYTHON_GEN) $ -f $(srcdir)/gl_and_es_API.xml $@ ## $(MESA_GLAPI_DIR)/glapi_x86.S: gl_x86_asm.py $(COMMON) - $(PYTHON_GEN) $ -f $(srcdir)/gl_API.xml $@ + $(PYTHON_GEN) $ -f $(srcdir)/gl_and_es_API.xml $@ $(MESA_GLAPI_DIR)/glapi_x86-64.S: gl_x86-64_asm.py $(COMMON) - $(PYTHON_GEN) $ -f $(srcdir)/gl_API.xml $@ + $(PYTHON_GEN) $ -f $(srcdir)/gl_and_es_API.xml $@ $(MESA_GLAPI_DIR)/glapi_sparc.S: gl_SPARC_asm.py $(COMMON) - $(PYTHON_GEN) $ -f $(srcdir)/gl_API.xml $@ + $(PYTHON_GEN) $ -f 
$(srcdir)/gl_and_es_API.xml $@ ## -$(MESA_DIR)/main/enums.c: gl_enums.py $(COMMON_ES) +$(MESA_DIR)/main/enums.c: gl_enums.py $(COMMON) $(PYTHON_GEN) $ -f $(srcdir)/gl_and_es_API.xml $@ $(MESA_DIR)/main/dispatch.h: gl_table.py $(COMMON) -
Re: [Mesa-dev] [PATCH 6/9] dispatch: Make a header to go along with querymatrix.c.
On 31 October 2012 15:27, Chad Versace chad.vers...@linux.intel.com wrote:
> On 10/30/2012 10:42 AM, Paul Berry wrote:
>> This patch creates a header querymatrix.h, to allow functions defined
>> in querymatrix.c to be used from other .c files. It also switches from
>> the nonstandard GL_APIENTRY to GLAPIENTRY.
>> ---
>>  src/mesa/main/querymatrix.c | 12 +---
>>  src/mesa/main/querymatrix.h | 39 +++
>>  2 files changed, 44 insertions(+), 7 deletions(-)
>>  create mode 100644 src/mesa/main/querymatrix.h
>>
>> diff --git a/src/mesa/main/querymatrix.c b/src/mesa/main/querymatrix.c
>> index 2843d55..27842ae 100644
>> +extern void GLAPIENTRY _mesa_GetIntegerv(GLenum pname, GLint *params);
>> +extern void GLAPIENTRY _mesa_GetFloatv(GLenum pname, GLfloat *params);
>
> I think the local declarations of the Get functions should be removed.
> Instead, just #include "main/get.h".

Yeah, that seems reasonable. I'll make that change before pushing the
series.
Re: [Mesa-dev] [PATCH 2/3] i965: Set dirty state for brw_draw_upload.c when num_instances changes.
Kenneth Graunke kenn...@whitecape.org writes:
> On 10/31/2012 02:26 PM, Eric Anholt wrote:
>> Otherwise, if we had a set of prims passed in with a num_instances
>> varying between them, we wouldn't upload enough (or too much!) from
>> user vertex arrays.
>> ---
>>  src/mesa/drivers/dri/i965/brw_draw.c |    5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
>> index 1cfba29..22d18f9 100644
>> --- a/src/mesa/drivers/dri/i965/brw_draw.c
>> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
>> @@ -474,7 +474,10 @@ static bool brw_try_draw_prims( struct gl_context *ctx,
>>        intel_batchbuffer_require_space(intel, estimated_max_prim_size, false);
>>        intel_batchbuffer_save_state(intel);
>> -      brw->num_instances = prim->num_instances;
>> +      if (brw->num_instances != prim->num_instances) {
>> +         brw->num_instances = prim->num_instances;
>> +         brw->state.dirty.brw |= BRW_NEW_VERTICES;
>> +      }
>>        if (intel->gen < 6)
>>           brw_set_prim(brw, &prim[i]);
>>        else
>
> I agree that BRW_NEW_VERTICES needs to be flagged when num_instances
> changes. However, it's already unconditionally flagged above:
>
>     brw->vb.min_index = min_index;
>     brw->vb.max_index = max_index;
>     brw->state.dirty.brw |= BRW_NEW_VERTICES;
>
> So I don't think you'll be able to observe any problems. That said, I
> approve of this change anyway, because it's self-documenting and guards
> against potential problems.

That's outside the per-prim loop, though.
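The distinction between flagging once before the loop and flagging per prim can be illustrated with a toy model. This is a sketch under a stated assumption, namely that the dirty bits are consumed (cleared) after each primitive is drawn; all names here are hypothetical and nothing below is taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define NEW_VERTICES 0x1u /* hypothetical dirty bit */

struct prim {
   unsigned int num_instances;
   int basevertex;
};

struct draw_state {
   unsigned int num_instances;
   int basevertex;
   unsigned int dirty;
};

/* Per-prim update: flag vertices dirty only when a tracked value changes. */
static void
update_for_prim(struct draw_state *st, const struct prim *p)
{
   if (st->num_instances != p->num_instances) {
      st->num_instances = p->num_instances;
      st->dirty |= NEW_VERTICES;
   }
   if (st->basevertex != p->basevertex) {
      st->basevertex = p->basevertex;
      st->dirty |= NEW_VERTICES;
   }
}

/* Count how many prims in one draw call would trigger a vertex upload. */
static unsigned int
count_uploads(const struct prim *prims, size_t n)
{
   /* The flag is set once here, outside the per-prim loop. */
   struct draw_state st = { 0, 0, NEW_VERTICES };
   unsigned int uploads = 0;
   size_t i;

   assert(prims != NULL || n == 0);

   for (i = 0; i < n; i++) {
      update_for_prim(&st, &prims[i]);
      if (st.dirty & NEW_VERTICES)
         uploads++;
      st.dirty = 0; /* assumed: state is consumed by each draw */
   }
   return uploads;
}
```

In this model the one-time flag only covers the first prim; a later prim with a different num_instances or basevertex is re-uploaded only because of the per-prim check.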
Re: [Mesa-dev] [PATCH 1/3] i965: Remove the vbo_rebase_prims() path.
Kenneth Graunke kenn...@whitecape.org writes:
> On 10/31/2012 02:26 PM, Eric Anholt wrote:
>> The brw_draw_upload.c start_vertex_bias code has support for doing the
>> rebase without rewriting the index buffer by applying a basevertex. It
>> looks like vbo_rebase_prims() is not equipped to handle basevertex.
>> ---
>>  src/mesa/drivers/dri/i965/brw_draw.c | 21 ++---
>>  1 file changed, 6 insertions(+), 15 deletions(-)
>
> No performance data?

Nope, I was removing it because the code was just broken for basevertex.
Re: [Mesa-dev] [PATCH] r600g: fix abysmal performance in Reaction Quake
On Wed, Oct 31, 2012 at 8:05 PM, Marek Olšák mar...@gmail.com wrote: The problem was we set VRAM|GTT for relocations of STATIC resources. Setting just VRAM increases the framerate 4 times on my machine. I rewrote the switch statement and adjusted the domains for window framebuffers too. Reviewed-by: Alex Deucher alexander.deuc...@amd.com Stable branches? --- src/gallium/drivers/r600/r600_buffer.c | 42 --- src/gallium/drivers/r600/r600_texture.c |3 ++- 2 files changed, 24 insertions(+), 21 deletions(-) diff --git a/src/gallium/drivers/r600/r600_buffer.c b/src/gallium/drivers/r600/r600_buffer.c index f4566ee..116ab51 100644 --- a/src/gallium/drivers/r600/r600_buffer.c +++ b/src/gallium/drivers/r600/r600_buffer.c @@ -206,29 +206,31 @@ bool r600_init_resource(struct r600_screen *rscreen, { uint32_t initial_domain, domains; - /* Staging resources particpate in transfers and blits only -* and are used for uploads and downloads from regular -* resources. We generate them internally for some transfers. -*/ - if (usage == PIPE_USAGE_STAGING) { + switch(usage) { + case PIPE_USAGE_STAGING: + /* Staging resources participate in transfers, i.e. are used +* for uploads and downloads from regular resources. +* We generate them internally for some transfers. +*/ + initial_domain = RADEON_DOMAIN_GTT; domains = RADEON_DOMAIN_GTT; + break; + case PIPE_USAGE_DYNAMIC: + case PIPE_USAGE_STREAM: + /* Default to GTT, but allow the memory manager to move it to VRAM. 
*/ initial_domain = RADEON_DOMAIN_GTT; - } else { domains = RADEON_DOMAIN_GTT | RADEON_DOMAIN_VRAM; - - switch(usage) { - case PIPE_USAGE_DYNAMIC: - case PIPE_USAGE_STREAM: - case PIPE_USAGE_STAGING: - initial_domain = RADEON_DOMAIN_GTT; - break; - case PIPE_USAGE_DEFAULT: - case PIPE_USAGE_STATIC: - case PIPE_USAGE_IMMUTABLE: - default: - initial_domain = RADEON_DOMAIN_VRAM; - break; - } + break; + case PIPE_USAGE_DEFAULT: + case PIPE_USAGE_STATIC: + case PIPE_USAGE_IMMUTABLE: + default: + /* Don't list GTT here, because the memory manager would put some +* resources to GTT no matter what the initial domain is. +* Not listing GTT in the domains improves performance a lot. */ + initial_domain = RADEON_DOMAIN_VRAM; + domains = RADEON_DOMAIN_VRAM; + break; } res-buf = rscreen-ws-buffer_create(rscreen-ws, size, alignment, bind, initial_domain); diff --git a/src/gallium/drivers/r600/r600_texture.c b/src/gallium/drivers/r600/r600_texture.c index 785eeff..2df390d 100644 --- a/src/gallium/drivers/r600/r600_texture.c +++ b/src/gallium/drivers/r600/r600_texture.c @@ -421,9 +421,10 @@ r600_texture_create_object(struct pipe_screen *screen, return NULL; } } else if (buf) { + /* This is usually the window framebuffer. We want it in VRAM, always. */ resource-buf = buf; resource-cs_buf = rscreen-ws-buffer_get_cs_handle(buf); - resource-domains = RADEON_DOMAIN_GTT | RADEON_DOMAIN_VRAM; + resource-domains = RADEON_DOMAIN_VRAM; } if (rtex-cmask_size) { -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] mesa: Fix the core GL genned-name handling for glBindBufferBase()/Range().
This is part of fixing gl-3.1/genned-names.
---
 src/mesa/main/bufferobj.c |   20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index ac58c99..2f43eb0 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -660,7 +660,7 @@ _mesa_free_buffer_objects( struct gl_context *ctx )
    ctx->UniformBufferBindings = NULL;
 }
 
-static void
+static bool
 handle_bind_buffer_gen(struct gl_context *ctx,
                        GLenum target,
                        GLuint buffer,
@@ -668,6 +668,11 @@ handle_bind_buffer_gen(struct gl_context *ctx,
 {
    struct gl_buffer_object *buf = *buf_handle;
 
+   if (!buf && ctx->API == API_OPENGL_CORE) {
+      _mesa_error(ctx, GL_INVALID_OPERATION, "glBindBuffer(non-gen name)");
+      return false;
+   }
+
    if (!buf || buf == &DummyBufferObject) {
       /* If this is a new buffer object id, or one which was generated but
        * never used before, allocate a buffer object now.
@@ -681,6 +686,8 @@ handle_bind_buffer_gen(struct gl_context *ctx,
       _mesa_HashInsert(ctx->Shared->BufferObjects, buffer, buf);
       *buf_handle = buf;
    }
+
+   return true;
 }
 
 /**
@@ -717,11 +724,8 @@ bind_buffer_object(struct gl_context *ctx, GLenum target, GLuint buffer)
    else {
       /* non-default buffer object */
       newBufObj = _mesa_lookup_bufferobj(ctx, buffer);
-      if (newBufObj == NULL && ctx->API == API_OPENGL_CORE) {
-         _mesa_error(ctx, GL_INVALID_OPERATION, "glBindBuffer(non-gen name)");
+      if (!handle_bind_buffer_gen(ctx, target, buffer, &newBufObj))
          return;
-      }
-      handle_bind_buffer_gen(ctx, target, buffer, &newBufObj);
    }
 
    /* bind new buffer */
@@ -2147,7 +2151,8 @@ _mesa_BindBufferRange(GLenum target, GLuint index,
    } else {
       bufObj = _mesa_lookup_bufferobj(ctx, buffer);
    }
-   handle_bind_buffer_gen(ctx, target, buffer, &bufObj);
+   if (!handle_bind_buffer_gen(ctx, target, buffer, &bufObj))
+      return;
 
    if (!bufObj) {
       _mesa_error(ctx, GL_INVALID_OPERATION,
@@ -2193,7 +2198,8 @@ _mesa_BindBufferBase(GLenum target, GLuint index, GLuint buffer)
    } else {
       bufObj = _mesa_lookup_bufferobj(ctx, buffer);
    }
-   handle_bind_buffer_gen(ctx, target, buffer, &bufObj);
+   if (!handle_bind_buffer_gen(ctx, target, buffer, &bufObj))
+      return;
 
    if (!bufObj) {
       _mesa_error(ctx, GL_INVALID_OPERATION,
-- 
1.7.10.4
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
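For readers following the series, here is a minimal standalone sketch of the rule the patch enforces: in a core context, binding a name that was never generated is an error, while a compatibility context creates the object on first bind. The types and functions below (struct context, gen_name, bind_name) are hypothetical stand-ins, not Mesa's. Mesa tracks generated names through a hash table and a dummy object, which this sketch simplifies to a boolean flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Mesa's API enum and buffer object table. */
enum api { API_COMPAT, API_CORE };

#define MAX_NAMES 16

struct buffer { unsigned name; };

struct context {
   enum api api;
   bool generated[MAX_NAMES];          /* name was handed out by a Gen call */
   struct buffer *objects[MAX_NAMES];  /* backing object, created on first bind */
   struct buffer storage[MAX_NAMES];
   int error;                          /* 0 = none, 1 = invalid operation */
};

/* Mark a name as generated without creating its object yet. */
static void gen_name(struct context *ctx, unsigned name)
{
   assert(name < MAX_NAMES);
   ctx->generated[name] = true;
}

/* Returns false and records an error when a core context binds a name that
 * was never generated; otherwise creates the object lazily and returns it.
 */
static bool bind_name(struct context *ctx, unsigned name, struct buffer **out)
{
   assert(name < MAX_NAMES);

   if (!ctx->generated[name] && ctx->api == API_CORE) {
      ctx->error = 1;
      return false;
   }

   if (!ctx->objects[name]) {
      ctx->storage[name].name = name;
      ctx->objects[name] = &ctx->storage[name];
   }

   *out = ctx->objects[name];
   return true;
}
```

Returning a bool from the helper, as the patch does, lets every caller bail out with the same check instead of repeating the core-profile test at each bind entry point.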
[Mesa-dev] [PATCH 3/3] mesa: Use non-gen name more consistently as an error message in GL core.
I used this to help verify that my test was actually testing the paths I wanted to.
---
 src/mesa/main/arrayobj.c |    2 +-
 src/mesa/main/texobj.c   |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c
index 5959260..926c753 100644
--- a/src/mesa/main/arrayobj.c
+++ b/src/mesa/main/arrayobj.c
@@ -359,7 +359,7 @@ bind_vertex_array(struct gl_context *ctx, GLuint id, GLboolean genRequired)
       newObj = lookup_arrayobj(ctx, id);
       if (!newObj) {
          if (genRequired) {
-            _mesa_error(ctx, GL_INVALID_OPERATION, "glBindVertexArray(id)");
+            _mesa_error(ctx, GL_INVALID_OPERATION, "glBindVertexArray(non-gen name)");
             return;
          }
 
diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index 224d8a8..8525ff9 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -1220,7 +1220,7 @@ _mesa_BindTexture( GLenum target, GLuint texName )
       }
       else {
          if (ctx->API == API_OPENGL_CORE) {
-            _mesa_error(ctx, GL_INVALID_OPERATION, "glBindTexture");
+            _mesa_error(ctx, GL_INVALID_OPERATION, "glBindTexture(non-gen name)");
             return;
          }
 
-- 
1.7.10.4
[Mesa-dev] [PATCH 2/3] mesa: Fix core GL genned-name handling for glBeginQuery().
Fixes piglit gl-3.1/genned-names.
---
 src/mesa/main/queryobj.c |   15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/queryobj.c b/src/mesa/main/queryobj.c
index d216913..2a39176 100644
--- a/src/mesa/main/queryobj.c
+++ b/src/mesa/main/queryobj.c
@@ -321,13 +321,18 @@ _mesa_BeginQueryIndexed(GLenum target, GLuint index, GLuint id)
 
    q = _mesa_lookup_query_object(ctx, id);
    if (!q) {
-      /* create new object */
-      q = ctx->Driver.NewQueryObject(ctx, id);
-      if (!q) {
-         _mesa_error(ctx, GL_OUT_OF_MEMORY, "glBeginQuery{Indexed}");
+      if (ctx->API == API_OPENGL_CORE) {
+         _mesa_error(ctx, GL_INVALID_OPERATION, "glBeginQuery(non-gen name)");
          return;
+      } else {
+         /* create new object */
+         q = ctx->Driver.NewQueryObject(ctx, id);
+         if (!q) {
+            _mesa_error(ctx, GL_OUT_OF_MEMORY, "glBeginQuery{Indexed}");
+            return;
+         }
+         _mesa_HashInsert(ctx->Query.QueryObjects, id, q);
       }
-      _mesa_HashInsert(ctx->Query.QueryObjects, id, q);
    }
    else {
       /* pre-existing object */
-- 
1.7.10.4
Re: [Mesa-dev] [PATCH] r600g: fix abysmal performance in Reaction Quake
On Wed, Oct 31, 2012 at 8:05 PM, Marek Olšák <mar...@gmail.com> wrote:
> The problem was we set VRAM|GTT for relocations of STATIC resources.
> Setting just VRAM increases the framerate 4 times on my machine.
>
> I rewrote the switch statement and adjusted the domains for window
> framebuffers too.

Reviewed-by: Jerome Glisse <jgli...@redhat.com>

> ---
>  src/gallium/drivers/r600/r600_buffer.c  |   42 ++++++++++++++++++++++--------------------
>  src/gallium/drivers/r600/r600_texture.c |    3 ++-
>  2 files changed, 24 insertions(+), 21 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_buffer.c b/src/gallium/drivers/r600/r600_buffer.c
> index f4566ee..116ab51 100644
> --- a/src/gallium/drivers/r600/r600_buffer.c
> +++ b/src/gallium/drivers/r600/r600_buffer.c
> @@ -206,29 +206,31 @@ bool r600_init_resource(struct r600_screen *rscreen,
>  {
>  	uint32_t initial_domain, domains;
>  
> -	/* Staging resources particpate in transfers and blits only
> -	 * and are used for uploads and downloads from regular
> -	 * resources.  We generate them internally for some transfers.
> -	 */
> -	if (usage == PIPE_USAGE_STAGING) {
> +	switch(usage) {
> +	case PIPE_USAGE_STAGING:
> +		/* Staging resources participate in transfers, i.e. are used
> +		 * for uploads and downloads from regular resources.
> +		 * We generate them internally for some transfers.
> +		 */
> +		initial_domain = RADEON_DOMAIN_GTT;
>  		domains = RADEON_DOMAIN_GTT;
> +		break;
> +	case PIPE_USAGE_DYNAMIC:
> +	case PIPE_USAGE_STREAM:
> +		/* Default to GTT, but allow the memory manager to move it to VRAM. */
>  		initial_domain = RADEON_DOMAIN_GTT;
> -	} else {
>  		domains = RADEON_DOMAIN_GTT | RADEON_DOMAIN_VRAM;
> -
> -		switch(usage) {
> -		case PIPE_USAGE_DYNAMIC:
> -		case PIPE_USAGE_STREAM:
> -		case PIPE_USAGE_STAGING:
> -			initial_domain = RADEON_DOMAIN_GTT;
> -			break;
> -		case PIPE_USAGE_DEFAULT:
> -		case PIPE_USAGE_STATIC:
> -		case PIPE_USAGE_IMMUTABLE:
> -		default:
> -			initial_domain = RADEON_DOMAIN_VRAM;
> -			break;
> -		}
> +		break;
> +	case PIPE_USAGE_DEFAULT:
> +	case PIPE_USAGE_STATIC:
> +	case PIPE_USAGE_IMMUTABLE:
> +	default:
> +		/* Don't list GTT here, because the memory manager would put some
> +		 * resources to GTT no matter what the initial domain is.
> +		 * Not listing GTT in the domains improves performance a lot. */
> +		initial_domain = RADEON_DOMAIN_VRAM;
> +		domains = RADEON_DOMAIN_VRAM;
> +		break;
>  	}
>  
>  	res->buf = rscreen->ws->buffer_create(rscreen->ws, size, alignment, bind, initial_domain);
> diff --git a/src/gallium/drivers/r600/r600_texture.c b/src/gallium/drivers/r600/r600_texture.c
> index 785eeff..2df390d 100644
> --- a/src/gallium/drivers/r600/r600_texture.c
> +++ b/src/gallium/drivers/r600/r600_texture.c
> @@ -421,9 +421,10 @@ r600_texture_create_object(struct pipe_screen *screen,
>  			return NULL;
>  		}
>  	} else if (buf) {
> +		/* This is usually the window framebuffer. We want it in VRAM, always. */
>  		resource->buf = buf;
>  		resource->cs_buf = rscreen->ws->buffer_get_cs_handle(buf);
> -		resource->domains = RADEON_DOMAIN_GTT | RADEON_DOMAIN_VRAM;
> +		resource->domains = RADEON_DOMAIN_VRAM;
>  	}
>  
>  	if (rtex->cmask_size) {
> --
> 1.7.9.5
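To make the new placement policy easier to see at a glance, the following is a rough standalone sketch of the rewritten switch. The enum, macros and choose_placement() are hypothetical names invented for illustration; only the mapping from usage hint to initial domain and allowed domains is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Gallium usage hints and radeon domains. */
enum usage {
	USAGE_DEFAULT,
	USAGE_STATIC,
	USAGE_IMMUTABLE,
	USAGE_DYNAMIC,
	USAGE_STREAM,
	USAGE_STAGING
};

#define DOMAIN_GTT  0x1u
#define DOMAIN_VRAM 0x2u

struct placement {
	uint32_t initial_domain; /* where the buffer is first allocated */
	uint32_t domains;        /* where relocations allow it to live */
};

static struct placement choose_placement(enum usage usage)
{
	struct placement p;

	switch (usage) {
	case USAGE_STAGING:
		/* Upload/download helpers: GTT only. */
		p.initial_domain = DOMAIN_GTT;
		p.domains = DOMAIN_GTT;
		break;
	case USAGE_DYNAMIC:
	case USAGE_STREAM:
		/* Start in GTT, but allow migration to VRAM. */
		p.initial_domain = DOMAIN_GTT;
		p.domains = DOMAIN_GTT | DOMAIN_VRAM;
		break;
	case USAGE_DEFAULT:
	case USAGE_STATIC:
	case USAGE_IMMUTABLE:
	default:
		/* VRAM only: GTT is deliberately left out of the allowed set. */
		p.initial_domain = DOMAIN_VRAM;
		p.domains = DOMAIN_VRAM;
		break;
	}

	/* The initial domain must be one of the allowed domains. */
	assert(p.initial_domain & p.domains);
	return p;
}
```

The behavioural change is confined to the last case: static resources used to advertise GTT | VRAM and now advertise VRAM alone.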
Re: [Mesa-dev] [PATCH 2/3] i965: Set dirty state for brw_draw_upload.c when num_instances changes.
On 10/31/2012 06:13 PM, Eric Anholt wrote:
> Kenneth Graunke <kenn...@whitecape.org> writes:
>
>> On 10/31/2012 02:26 PM, Eric Anholt wrote:
>>> Otherwise, if we had a set of prims passed in with a num_instances varying
>>> between them, we wouldn't upload enough (or too much!) from user vertex
>>> arrays.
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_draw.c |    5 ++++-
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
>>> index 1cfba29..22d18f9 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_draw.c
>>> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
>>> @@ -474,7 +474,10 @@ static bool brw_try_draw_prims( struct gl_context *ctx,
>>>        intel_batchbuffer_require_space(intel, estimated_max_prim_size, false);
>>>        intel_batchbuffer_save_state(intel);
>>>  
>>> -      brw->num_instances = prim->num_instances;
>>> +      if (brw->num_instances != prim->num_instances) {
>>> +         brw->num_instances = prim->num_instances;
>>> +         brw->state.dirty.brw |= BRW_NEW_VERTICES;
>>> +      }
>>>        if (intel->gen < 6)
>>>           brw_set_prim(brw, &prim[i]);
>>>        else
>>
>> I agree that BRW_NEW_VERTICES needs to be changed when num_instances
>> changes. However, it's already unconditionally flagged above:
>>
>>    brw->vb.min_index = min_index;
>>    brw->vb.max_index = max_index;
>>    brw->state.dirty.brw |= BRW_NEW_VERTICES;
>>
>> So I don't think you'll be able to observe any problems.
>>
>> That said, I approve of this change anyway, because it's
>> self-documenting and guards against potential problems.
>
> That's outside the per-prim loop, though.

Oh dear. Yes, brw_upload_state() is inside the loop, so it clears the
BRW_NEW_VERTICES flag. So your change is necessary. My mistake.
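The point settled in this exchange can be illustrated with a small standalone model: a dirty flag raised once before a loop is consumed by the first state upload inside the loop, so later iterations never see it unless it is raised again. Everything below (struct draw_state, upload_state, draw_prims, NEW_VERTICES) is a hypothetical simplification and not i965 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NEW_VERTICES 0x1u

struct prim { unsigned num_instances; };

struct draw_state {
   unsigned dirty;          /* pending state flags */
   unsigned num_instances;  /* instance count used by the last upload */
   unsigned uploads;        /* how many times vertex data was uploaded */
};

/* Models a state upload: acts on the dirty flags, then clears them. */
static void upload_state(struct draw_state *st)
{
   if (st->dirty & NEW_VERTICES)
      st->uploads++;
   st->dirty = 0;
}

/* Draws n prims and returns how many vertex uploads happened. When
 * reflag_on_change is false, the flag is only raised once before the loop.
 */
static unsigned draw_prims(struct draw_state *st, const struct prim *prims,
                           size_t n, bool reflag_on_change)
{
   assert(prims != NULL || n == 0);

   st->uploads = 0;
   st->dirty |= NEW_VERTICES;   /* flagged once, outside the per-prim loop */

   for (size_t i = 0; i < n; i++) {
      if (st->num_instances != prims[i].num_instances) {
         st->num_instances = prims[i].num_instances;
         if (reflag_on_change)
            st->dirty |= NEW_VERTICES;
      }
      upload_state(st);         /* clears the flag on every iteration */
   }

   return st->uploads;
}
```

With instance counts 1, 4, 4, 2 the unflagged variant uploads only for the first prim, while the variant that re-raises the flag uploads on each change of instance count.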
Re: [Mesa-dev] [PATCH] r600g: fix abysmal performance in Reaction Quake
On Thu, Nov 1, 2012 at 2:13 AM, Alex Deucher <alexdeuc...@gmail.com> wrote:
> On Wed, Oct 31, 2012 at 8:05 PM, Marek Olšák <mar...@gmail.com> wrote:
>> The problem was we set VRAM|GTT for relocations of STATIC resources.
>> Setting just VRAM increases the framerate 4 times on my machine.
>>
>> I rewrote the switch statement and adjusted the domains for window
>> framebuffers too.
>
> Reviewed-by: Alex Deucher <alexander.deuc...@amd.com>
>
> Stable branches?

Yes, good idea.

Marek
[Mesa-dev] [PATCH] i965/gen4: Fix assertion failures in depthstencil piglit tests.
Don't forget to set depth_mt even if !hiz_mt.
---
 src/mesa/drivers/dri/i965/brw_misc_state.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index ce23fa0..d4b4c75 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -416,11 +416,12 @@ static void emit_depthbuffer(struct brw_context *brw)
    unsigned int len;
    bool separate_stencil = false;
 
-   if (depth_irb &&
-       depth_irb->mt &&
-       depth_irb->mt->hiz_mt) {
+   if (depth_irb){
       depth_mt = depth_irb->mt;
-      hiz_region = depth_irb->mt->hiz_mt->region;
+      if (depth_mt &&
+          depth_mt->hiz_mt) {
+         hiz_region = depth_irb->mt->hiz_mt->region;
+      }
    }
 
    /* 3DSTATE_DEPTH_BUFFER, 3DSTATE_STENCIL_BUFFER are both
-- 
1.7.10.4
Re: [Mesa-dev] [PATCH] i965/gen4: Fix assertion failures in depthstencil piglit tests.
On 10/31/2012 07:25 PM, Eric Anholt wrote:
> Don't forget to set depth_mt even if !hiz_mt.
> ---
>  src/mesa/drivers/dri/i965/brw_misc_state.c |    9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
> index ce23fa0..d4b4c75 100644
> --- a/src/mesa/drivers/dri/i965/brw_misc_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
> @@ -416,11 +416,12 @@ static void emit_depthbuffer(struct brw_context *brw)
>     unsigned int len;
>     bool separate_stencil = false;
>  
> -   if (depth_irb &&
> -       depth_irb->mt &&
> -       depth_irb->mt->hiz_mt) {
> +   if (depth_irb){
>        depth_mt = depth_irb->mt;
> -      hiz_region = depth_irb->mt->hiz_mt->region;
> +      if (depth_mt &&
> +          depth_mt->hiz_mt) {
> +         hiz_region = depth_irb->mt->hiz_mt->region;
> +      }
>     }
>  
>     /* 3DSTATE_DEPTH_BUFFER, 3DSTATE_STENCIL_BUFFER are both

Oh geez. Yeah, that's obviously necessary.

Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

At some point, I think it would be wise to split gen4/5 out into their
own function, as it doesn't have either HiZ or separate stencil, and
could be a lot simpler.
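As a side-by-side illustration of the bug and the fix, here is a small standalone sketch. The struct and function names are hypothetical, not the i965 driver's own. The old shape assigns the depth miptree only when a HiZ miptree exists, so hardware without HiZ never gets depth_mt set. The nested shape assigns depth_mt whenever a depth renderbuffer is present.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced versions of the driver structures. */
struct region { int id; };
struct miptree { struct region *region; struct miptree *hiz_mt; };
struct renderbuffer { struct miptree *mt; };

struct depth_setup {
   struct miptree *depth_mt;
   struct region *hiz_region;
};

/* Old shape: everything hangs off one combined condition, so depth_mt is
 * left NULL whenever there is no HiZ miptree.
 */
static struct depth_setup setup_all_or_nothing(const struct renderbuffer *depth_irb)
{
   struct depth_setup out = { NULL, NULL };

   if (depth_irb && depth_irb->mt && depth_irb->mt->hiz_mt) {
      out.depth_mt = depth_irb->mt;
      out.hiz_region = depth_irb->mt->hiz_mt->region;
   }
   return out;
}

/* New shape: depth_mt is set for any depth renderbuffer; only the HiZ
 * region depends on HiZ being present.
 */
static struct depth_setup setup_nested(const struct renderbuffer *depth_irb)
{
   struct depth_setup out = { NULL, NULL };

   if (depth_irb) {
      out.depth_mt = depth_irb->mt;
      if (out.depth_mt && out.depth_mt->hiz_mt)
         out.hiz_region = out.depth_mt->hiz_mt->region;
   }

   /* A HiZ region never exists without a depth miptree. */
   assert(out.hiz_region == NULL || out.depth_mt != NULL);
   return out;
}
```

For a depth buffer without HiZ, setup_all_or_nothing() returns a NULL depth_mt while setup_nested() returns the miptree, which is the difference the commit message describes.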