[Mesa-dev] [PATCH 0/1] swr: Remove unnecessary memset call
This patch is identical to the one I sent in April, but that time I forgot to mention that I didn't have commit rights, so I am sending the rebased patch for the review. Vlad Golovkin (1): swr: Remove unnecessary memset call src/gallium/drivers/swr/swr_screen.cpp | 1 - 1 file changed, 1 deletion(-) -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/1] swr: Remove unnecessary memset call
Zeroing memory after calloc is not necessary. This also allows to avoid possible crash when allocation fails, because memset is called before checking screen for NULL. --- src/gallium/drivers/swr/swr_screen.cpp | 1 - 1 file changed, 1 deletion(-) diff --git a/src/gallium/drivers/swr/swr_screen.cpp b/src/gallium/drivers/swr/swr_screen.cpp index fa232b6838..67085444f8 100644 --- a/src/gallium/drivers/swr/swr_screen.cpp +++ b/src/gallium/drivers/swr/swr_screen.cpp @@ -1148,7 +1148,6 @@ struct pipe_screen * swr_create_screen_internal(struct sw_winsys *winsys) { struct swr_screen *screen = CALLOC_STRUCT(swr_screen); - memset(screen, 0, sizeof(struct swr_screen)); if (!screen) return NULL; -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/1] swr: Remove unnecessary memset call
Zeroing memory after calloc is not necessary. This also allows to avoid possible crash when allocation fails, because memset is called before checking screen for NULL. --- src/gallium/drivers/swr/swr_screen.cpp | 1 - 1 file changed, 1 deletion(-) diff --git a/src/gallium/drivers/swr/swr_screen.cpp b/src/gallium/drivers/swr/swr_screen.cpp index fa232b6838..67085444f8 100644 --- a/src/gallium/drivers/swr/swr_screen.cpp +++ b/src/gallium/drivers/swr/swr_screen.cpp @@ -1148,7 +1148,6 @@ struct pipe_screen * swr_create_screen_internal(struct sw_winsys *winsys) { struct swr_screen *screen = CALLOC_STRUCT(swr_screen); - memset(screen, 0, sizeof(struct swr_screen)); if (!screen) return NULL; -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/1] swr: Remove unnecessary memset call
This patch is identical to the previous one, but that time I forgot to mention that I didn't have commit rights, so I am sending the rebased patch for the review. Vlad Golovkin (1): swr: Remove unnecessary memset call src/gallium/drivers/swr/swr_screen.cpp | 1 - 1 file changed, 1 deletion(-) -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl/glcpp: Handle hex constants with 0X prefix
2018-04-23 3:53 GMT+03:00 Timothy Arceri <tarc...@itsqueeze.com>: > > > On 20/04/18 06:08, Vlad Golovkin wrote: >> >> GLSL 4.6 spec describes hex constant as: >> >> hexadecimal-constant: >> 0x hexadecimal-digit >> 0X hexadecimal-digit >> hexadecimal-constant hexadecimal-digit >> >> Right now if you have a shader with the following structure: >> >> #if 0X1 // or any hex number with the 0X prefix >> // some code >> #endif >> >> the code between #if and #endif gets removed because the checking is >> performed >> only for "0x" prefix which results in strtoll being called with the base 8 >> and >> after encountering the 'X' char the strtoll returns 0. Letting strtoll >> detect >> the base makes this limitation go away and also makes code easier to read. > > > The problem is strtoll supports much more than what GLSL allows. > >> >> Also added 1 test for uppercase hex prefix. >> --- >> src/compiler/glsl/glcpp/glcpp-parse.y| 9 >> ++--- >> src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c | 5 >> + >> .../glsl/glcpp/tests/149-hex-const-uppercase-prefix.c.expected | 5 >> + >> 3 files changed, 12 insertions(+), 7 deletions(-) >> create mode 100644 >> src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c >> create mode 100644 >> src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c.expected >> >> diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y >> b/src/compiler/glsl/glcpp/glcpp-parse.y >> index ccb3aa18d3..d83f99f1c7 100644 >> --- a/src/compiler/glsl/glcpp/glcpp-parse.y >> +++ b/src/compiler/glsl/glcpp/glcpp-parse.y >> @@ -462,13 +462,8 @@ control_line_error: >> integer_constant: >> INTEGER_STRING { >> - if (strlen ($1) >= 3 && strncmp ($1, "0x", 2) == 0) { > > > > As per my coment above strtoll supports much more than what GLSL allows. > Please just add strncmp($1, "0X", 2) == 0 to the if above. Flex and Bison is not my strongest suit, so correct me if I am wrong, but it seems that INTEGER_STRING can't contain leading whitespace and/or leading plus/minus that strtoll supports. As for the multitude of bases that strtoll supports, it would ignore them and just choose between 8, 10 and 16 when base = 0. I think I should have made it more clear in the code comment. > > That patch would have my r-b. If you send that fixed up patch I can push it > for you. Thanks for looking into this. > > > >> - $$ = strtoll ($1 + 2, NULL, 16); >> - } else if ($1[0] == '0') { >> - $$ = strtoll ($1, NULL, 8); >> - } else { >> - $$ = strtoll ($1, NULL, 10); >> - } >> + /* let strtoll detect the base */ >> + $$ = strtoll ($1, NULL, 0); >> } >> | INTEGER { >> $$ = $1; >> diff --git >> a/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c >> b/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c >> new file mode 100644 >> index 00..1be9b28eb7 >> --- /dev/null >> +++ b/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c >> @@ -0,0 +1,5 @@ >> +#if 0x1234abcd == 0X1234abcd >> +success >> +#else >> +failure >> +#endif >> diff --git >> a/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c.expected >> b/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c.expected >> new file mode 100644 >> index 00..4cf250f6bb >> --- /dev/null >> +++ >> b/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c.expected >> @@ -0,0 +1,5 @@ >> + >> +success >> + >> + >> + >> > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl/glcpp: Handle hex constants with 0X prefix
GLSL 4.6 spec describes hex constant as: hexadecimal-constant: 0x hexadecimal-digit 0X hexadecimal-digit hexadecimal-constant hexadecimal-digit Right now if you have a shader with the following structure: #if 0X1 // or any hex number with the 0X prefix // some code #endif the code between #if and #endif gets removed because the checking is performed only for "0x" prefix which results in strtoll being called with the base 8 and after encountering the 'X' char the strtoll returns 0. Letting strtoll detect the base makes this limitation go away and also makes code easier to read. Also added 1 test for uppercase hex prefix. --- src/compiler/glsl/glcpp/glcpp-parse.y| 9 ++--- src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c | 5 + .../glsl/glcpp/tests/149-hex-const-uppercase-prefix.c.expected | 5 + 3 files changed, 12 insertions(+), 7 deletions(-) create mode 100644 src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c create mode 100644 src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c.expected diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y b/src/compiler/glsl/glcpp/glcpp-parse.y index ccb3aa18d3..d83f99f1c7 100644 --- a/src/compiler/glsl/glcpp/glcpp-parse.y +++ b/src/compiler/glsl/glcpp/glcpp-parse.y @@ -462,13 +462,8 @@ control_line_error: integer_constant: INTEGER_STRING { - if (strlen ($1) >= 3 && strncmp ($1, "0x", 2) == 0) { - $$ = strtoll ($1 + 2, NULL, 16); - } else if ($1[0] == '0') { - $$ = strtoll ($1, NULL, 8); - } else { - $$ = strtoll ($1, NULL, 10); - } + /* let strtoll detect the base */ + $$ = strtoll ($1, NULL, 0); } | INTEGER { $$ = $1; diff --git a/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c b/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c new file mode 100644 index 00..1be9b28eb7 --- /dev/null +++ b/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c @@ -0,0 +1,5 @@ +#if 0x1234abcd == 0X1234abcd +success +#else +failure +#endif diff --git a/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c.expected b/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c.expected new file mode 100644 index 00..4cf250f6bb --- /dev/null +++ b/src/compiler/glsl/glcpp/tests/149-hex-const-uppercase-prefix.c.expected @@ -0,0 +1,5 @@ + +success + + + -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Fwd: [PATCH] mesa/math: Allocate memory for GLmatrix elements and its inverse contiguously
-- Forwarded message -- From: Vlad Golovkin <vlad.golovkin.m...@gmail.com> Date: 2018-04-19 21:06 GMT+03:00 Subject: Re: [Mesa-dev] [PATCH] mesa/math: Allocate memory for GLmatrix elements and its inverse contiguously To: Thomas Helland <thomashellan...@gmail.com> 2018-04-17 8:55 GMT+03:00 Thomas Helland <thomashellan...@gmail.com>: > Hi, and thanks for the patch =) > > Have you done any performance testing on this to verify it > gives us a speedup of any kind? I'm asking because it seems like > this might be something that a decent compiler should be able to do. > Performance related patches, at least in core mesa, usually have > some justification with benchmark numbers in the commit message. Hi, I examined the resulting assembly for these 3 functions and it turns out that compiler wasn't merging these two blocks of memory into one (which compiler does that?). gcc tried to unroll memcpys to a series of movs which may seem to partially defeat the purpose of this patch, but after copying the block corresponding to m->m it had to switch destination and source registers to the next block resulting in 2 wasted movs. As a result we can save malloc and free call (in _math_matrix_ctr and _math_matrix_dtr) and 2 movs (when compiler tries to avoid memcpy - best case) or 1 memcpy call (in the worst case). It may seem that 2nd malloc can place m->inv in memory right after m->m but: 1) compiler can't rely on that behaviour 2) allocator will insert some private data before each block leading to more cache misses. I made some testing with Torcs and Yo Frankie blender game and according to perf in Yo Frankie _math_matrix_copy overhead reduced by 0.03% - 0.04% while Torcs didn't see any improvement. Sorry for the duplicate emails. > Some style comments below > > 2018-04-17 1:03 GMT+02:00 Vlad Golovkin <vlad.golovkin.m...@gmail.com>: >> When GLmatrix elements and its inverse are stored contiguously in memory it >> is possible to >> allocate, free and copy these fields with 1 function call instead of 2. >> --- >> src/mesa/math/m_matrix.c | 15 +-- >> 1 file changed, 9 insertions(+), 6 deletions(-) >> >> diff --git a/src/mesa/math/m_matrix.c b/src/mesa/math/m_matrix.c >> index 57a49533de..4ab78a1fb3 100644 >> --- a/src/mesa/math/m_matrix.c >> +++ b/src/mesa/math/m_matrix.c >> @@ -1438,8 +1438,7 @@ _math_matrix_is_dirty( const GLmatrix *m ) >> void >> _math_matrix_copy( GLmatrix *to, const GLmatrix *from ) >> { >> - memcpy(to->m, from->m, 16 * sizeof(GLfloat)); >> - memcpy(to->inv, from->inv, 16 * sizeof(GLfloat)); >> + memcpy(to->m, from->m, 16 * 2 * sizeof(GLfloat)); >> to->flags = from->flags; >> to->type = from->type; >> } >> @@ -1470,12 +1469,17 @@ _math_matrix_loadf( GLmatrix *mat, const GLfloat *m ) >> void >> _math_matrix_ctr( GLmatrix *m ) >> { >> - m->m = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 ); >> + m->m = _mesa_align_malloc( 16 * 2 * sizeof(GLfloat), 16 ); >> if (m->m) >> + { > > Our style guides says to keep the curly bracket after an if on the same line. > >> + m->inv = m->m + 16; >>memcpy( m->m, Identity, sizeof(Identity) ); >> - m->inv = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 ); >> - if (m->inv) >>memcpy( m->inv, Identity, sizeof(Identity) ); >> + } >> + else >> + { > > } else { > > Although I see that this file defaults to; > > { > else { > > for some reason. Feel free to follow existing style, or adjust to my comments. > Also, if we want to do this change it deserves a comment in the source. >> + m->inv = NULL; >> + } >> m->type = MATRIX_IDENTITY; >> m->flags = 0; >> } >> @@ -1493,7 +1497,6 @@ _math_matrix_dtr( GLmatrix *m ) >> _mesa_align_free( m->m ); >> m->m = NULL; >> >> - _mesa_align_free( m->inv ); >> m->inv = NULL; >> } >> >> -- >> 2.14.1 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa/math: Allocate memory for GLmatrix elements and its inverse contiguously
When GLmatrix elements and its inverse are stored contiguously in memory it is possible to allocate, free and copy these fields with 1 function call instead of 2. --- src/mesa/math/m_matrix.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/src/mesa/math/m_matrix.c b/src/mesa/math/m_matrix.c index 57a49533de..4ab78a1fb3 100644 --- a/src/mesa/math/m_matrix.c +++ b/src/mesa/math/m_matrix.c @@ -1438,8 +1438,7 @@ _math_matrix_is_dirty( const GLmatrix *m ) void _math_matrix_copy( GLmatrix *to, const GLmatrix *from ) { - memcpy(to->m, from->m, 16 * sizeof(GLfloat)); - memcpy(to->inv, from->inv, 16 * sizeof(GLfloat)); + memcpy(to->m, from->m, 16 * 2 * sizeof(GLfloat)); to->flags = from->flags; to->type = from->type; } @@ -1470,12 +1469,17 @@ _math_matrix_loadf( GLmatrix *mat, const GLfloat *m ) void _math_matrix_ctr( GLmatrix *m ) { - m->m = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 ); + m->m = _mesa_align_malloc( 16 * 2 * sizeof(GLfloat), 16 ); if (m->m) + { + m->inv = m->m + 16; memcpy( m->m, Identity, sizeof(Identity) ); - m->inv = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 ); - if (m->inv) memcpy( m->inv, Identity, sizeof(Identity) ); + } + else + { + m->inv = NULL; + } m->type = MATRIX_IDENTITY; m->flags = 0; } @@ -1493,7 +1497,6 @@ _math_matrix_dtr( GLmatrix *m ) _mesa_align_free( m->m ); m->m = NULL; - _mesa_align_free( m->inv ); m->inv = NULL; } -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nv50: Extract needed value bits without shifting them before calling bitcount
This can save one instruction since bitcount doesn't care about specific bits' positions. --- src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c index 6fd2982e3c..98eb0309ad 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c @@ -982,7 +982,7 @@ nv50_screen_create(struct nouveau_device *dev) nouveau_getparam(dev, NOUVEAU_GETPARAM_GRAPH_UNITS, ); screen->TPs = util_bitcount(value & 0x); - screen->MPsInTP = util_bitcount((value >> 24) & 0xf); + screen->MPsInTP = util_bitcount(value & 0x0f00); screen->mp_count = screen->TPs * screen->MPsInTP; -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] swr: Remove unnecessary memset call
Zeroing memory after calloc is not necessary. This also allows to avoid possible crash when allocation fails, because memset is called before checking screen for NULL. --- src/gallium/drivers/swr/swr_screen.cpp | 1 - 1 file changed, 1 deletion(-) diff --git a/src/gallium/drivers/swr/swr_screen.cpp b/src/gallium/drivers/swr/swr_screen.cpp index 880a177c39..4e43ac55fb 100644 --- a/src/gallium/drivers/swr/swr_screen.cpp +++ b/src/gallium/drivers/swr/swr_screen.cpp @@ -1130,7 +1130,6 @@ struct pipe_screen * swr_create_screen_internal(struct sw_winsys *winsys) { struct swr_screen *screen = CALLOC_STRUCT(swr_screen); - memset(screen, 0, sizeof(struct swr_screen)); if (!screen) return NULL; -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] util: remove redundant check for the __clang__ macro
2018-02-06 17:19 GMT+02:00 Brian Paul <bri...@vmware.com>: > On 02/06/2018 06:48 AM, Vlad Golovkin wrote: >> >> Clang defines __GNUC__ macro, so one doesn't need to check __clang__ >> macro in this particular case. >> >> v2: added comment as per Brian Paul's suggestion >> --- >> src/util/macros.h | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/src/util/macros.h b/src/util/macros.h >> index 432d513930..e3c785af50 100644 >> --- a/src/util/macros.h >> +++ b/src/util/macros.h >> @@ -137,8 +137,9 @@ do { \ >> #endif >> /* Forced function inlining */ >> +/* Note: Clang also sets __GNUC__ (see other cases below) */ >> #ifndef ALWAYS_INLINE >> -# if defined(__GNUC__) || defined(__clang__) >> +# if defined(__GNUC__) >> #define ALWAYS_INLINE inline __attribute__((always_inline)) >> # elif defined(_MSC_VER) >> #define ALWAYS_INLINE __forceinline >> > > Thanks! > > Reviewed-by: Brian Paul <bri...@vmware.com> > > Do you need me to push this for you? I don't have push access, so that would be great. Thanks. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] util: remove redundant check for the __clang__ macro
Clang defines __GNUC__ macro, so one doesn't need to check __clang__ macro in this particular case. v2: added comment as per Brian Paul's suggestion --- src/util/macros.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/util/macros.h b/src/util/macros.h index 432d513930..e3c785af50 100644 --- a/src/util/macros.h +++ b/src/util/macros.h @@ -137,8 +137,9 @@ do { \ #endif /* Forced function inlining */ +/* Note: Clang also sets __GNUC__ (see other cases below) */ #ifndef ALWAYS_INLINE -# if defined(__GNUC__) || defined(__clang__) +# if defined(__GNUC__) #define ALWAYS_INLINE inline __attribute__((always_inline)) # elif defined(_MSC_VER) #define ALWAYS_INLINE __forceinline -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nir: remove the abs call in is_neg_power_of_two
val->i32[swizzle[i]] is guaranteed to have non-positive value before the __is_power_of_two call, so unary minus is equivalent to abs in this case. --- src/compiler/nir/nir_search_helpers.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/nir/nir_search_helpers.h b/src/compiler/nir/nir_search_helpers.h index 2e3bd137d6..66e1546ae6 100644 --- a/src/compiler/nir/nir_search_helpers.h +++ b/src/compiler/nir/nir_search_helpers.h @@ -80,7 +80,7 @@ is_neg_power_of_two(nir_alu_instr *instr, unsigned src, unsigned num_components, case nir_type_int: if (val->i32[swizzle[i]] > 0) return false; - if (!__is_power_of_two(abs(val->i32[swizzle[i]]))) + if (!__is_power_of_two(-val->i32[swizzle[i]])) return false; break; default: -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: remove redundant check for the __clang__ macro
In this file there are similar cases with macros PUBLIC, USED and ATTRIBUTE_NOINLINE, before defining which as __attribute__(...), code only checks for __GNUC__. Should I add comments there as well? 2018-02-05 22:51 GMT+02:00 Brian Paul <bri...@vmware.com>: > On 02/05/2018 01:44 PM, Vlad Golovkin wrote: >> >> Clang defines __GNUC__ macro, so one doesn't need to check __clang__ >> macro in this particular case. > > > Perhaps mention that in a comment below so there's no confusion. > > -Brian > > >> --- >> src/util/macros.h | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/src/util/macros.h b/src/util/macros.h >> index 432d513930..d36ca095d5 100644 >> --- a/src/util/macros.h >> +++ b/src/util/macros.h >> @@ -138,7 +138,7 @@ do { \ >> /* Forced function inlining */ >> #ifndef ALWAYS_INLINE >> -# if defined(__GNUC__) || defined(__clang__) >> +# if defined(__GNUC__) >> #define ALWAYS_INLINE inline __attribute__((always_inline)) >> # elif defined(_MSC_VER) >> #define ALWAYS_INLINE __forceinline >> > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] util: remove redundant check for the __clang__ macro
Clang defines __GNUC__ macro, so one doesn't need to check __clang__ macro in this particular case. --- src/util/macros.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/util/macros.h b/src/util/macros.h index 432d513930..d36ca095d5 100644 --- a/src/util/macros.h +++ b/src/util/macros.h @@ -138,7 +138,7 @@ do { \ /* Forced function inlining */ #ifndef ALWAYS_INLINE -# if defined(__GNUC__) || defined(__clang__) +# if defined(__GNUC__) #define ALWAYS_INLINE inline __attribute__((always_inline)) # elif defined(_MSC_VER) #define ALWAYS_INLINE __forceinline -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] r600: remove unnecessary NULL check in r600_shader_select
r600_shader_select is always called through the macro SELECT_SHADER_OR_FAIL, which never passes NULL pointers as parameter 'dirty'. --- src/gallium/drivers/r600/r600_state_common.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c index 8ace7793f0..51c4c6dc30 100644 --- a/src/gallium/drivers/r600/r600_state_common.c +++ b/src/gallium/drivers/r600/r600_state_common.c @@ -803,8 +803,7 @@ static int r600_shader_select(struct pipe_context *ctx, sel->num_shaders++; } - if (dirty) - *dirty = true; + *dirty = true; shader->next_variant = sel->current; sel->current = shader; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glcpp: simplify the check for hexadecimal number
2017-06-02 17:59 GMT+03:00 Emil Velikov <emil.l.veli...@gmail.com>: > On 2 June 2017 at 14:56, Vlad Golovkin <vlad.golovkin.m...@gmail.com> wrote: >> -- Forwarded message ------ >> From: Vlad Golovkin <vlad.golovkin.m...@gmail.com> >> Date: 2017-06-02 16:55 GMT+03:00 >> Subject: Re: [Mesa-dev] [PATCH] glcpp: simplify the check for hexadecimal >> number >> To: Emil Velikov <emil.l.veli...@gmail.com> >> >> >> 2017-06-02 16:07 GMT+03:00 Emil Velikov <emil.l.veli...@gmail.com>: >>> Hi Vlad, >>> >>> Welcome to Mesa. >>> >>> Have you don't any benchmarking on this patch via something like shader-db >>> [1]? >>> Mentioning the results in the commit message is encouraged. >> >> Hi Emil, >> >> If I understand correctly shader-db supports only i965 and radeonsi, >> or am I missing something? >> I would be glad to be proven wrong, because I only have old r600 gpu. >> > In-tree you have helper scripts for i965, nouveau, radeonsi and > freedreno, but one can run shader-db on any driver. > Hi Emil, Sorry for not responding for so long, I spend a good amount of time trying to collect enough shaders to get reasonable measurements. I added some more shaders to my local copy of shader-db, mainly from: some Firefox WebGL demos, Pcsx2 emulator, Unvanquished, Minetest, Widelands, Krita and Godot Engine. During testing I accidentally discovered the bug in this little piece of code that I was looking at for quite some time - strcmp doesn't check for "0X" hex prefix, only for "0x". I thought that this was because of some flex "magic", which was providing hex numbers already lowercased or something like that, but this was not the case. Right now if you pass shader with the following structure: #if 0X1 // or any hex number with the 0X prefix // some code #endif the code between #if and #endif gets removed because the check for hex nubmer fails which results in strtoll being called with the base 8 and after encountering the 'X' char the strtoll returns 0. I think that this bug was undiscovered for so long because hardly anyone writes hex numbers with the 0X prefix in the shaders, especially in the preprocessor directives, for example there are no such cases in shader-db. Letting strtoll detect the base makes this limitation go away and also makes code easier on the eyes (and even marginally faster). perf results: master branch 11,978919700 s old patch 11,963108442 s strtoll with base 0 11,941502948 s I can send this new strtoll patch for review. >>> >>> On 2 June 2017 at 13:37, Vlad Golovkin <vlad.golovkin.m...@gmail.com> wrote: >>>> This will make it possible to remove unnecessary strlen and strncmp >>>> calls. >>>> --- >>>> src/compiler/glsl/glcpp/glcpp-parse.y | 10 ++ >>>> 1 file changed, 6 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y >>>> b/src/compiler/glsl/glcpp/glcpp-parse.y >>>> index fe211a0f0b..8cdd4c9d5b 100644 >>>> --- a/src/compiler/glsl/glcpp/glcpp-parse.y >>>> +++ b/src/compiler/glsl/glcpp/glcpp-parse.y >>>> @@ -450,10 +450,12 @@ control_line_error: >>>> >>>> integer_constant: >>>> INTEGER_STRING { >>>> - if (strlen ($1) >= 3 && strncmp ($1, "0x", 2) == 0) { >>>> - $$ = strtoll ($1 + 2, NULL, 16); >>>> - } else if ($1[0] == '0') { >>>> - $$ = strtoll ($1, NULL, 8); >>>> + if ($1[0] == '0') { >>>> + if ($1[1] == 'x' && $1[2] != '\0') { >>>> + $$ = strtoll ($1 + 2, NULL, 16); >>>> + } else { >>>> + $$ = strtoll ($1, NULL, 8); >>>> + } >>>> } else { >>>> $$ = strtoll ($1, NULL, 10); >>>> } >>> Food for thought: >>> >>> Any idea why we'd want to handle any of these details ourselves? >>> strtoll(..., /*base*/ 0) is smart enough, plus it should be better >>> optimised. >> This is a good approach, it will allow to write just one simple >> function call instead of all these checks. >> For some reason i thought that calling strtoll(str, NULL, 0) will make >> it autodetect the base - from 2 to 36, when in reality it basically >> tries to match bases 8, 10 and 16. >>> >>> Another possibility is to swap the INTEGER_STRING token for >>> {DEC,OCT,HEX}... ones each one calling strtoll() with correct base. >> Advantage of this approach, however, is that in this case one can just >> use custom string to num functions that avoid the overhead of strtoll, >> but then again it is better to benchmark before drawing any >> conclusions. > Indeed. Apart from shader-db, shader intensive games may also help - > Deus Ex perhaps? > > -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Fwd: [PATCH] glcpp: simplify the check for hexadecimal number
-- Forwarded message -- From: Vlad Golovkin <vlad.golovkin.m...@gmail.com> Date: 2017-06-02 16:55 GMT+03:00 Subject: Re: [Mesa-dev] [PATCH] glcpp: simplify the check for hexadecimal number To: Emil Velikov <emil.l.veli...@gmail.com> 2017-06-02 16:07 GMT+03:00 Emil Velikov <emil.l.veli...@gmail.com>: > Hi Vlad, > > Welcome to Mesa. > > Have you don't any benchmarking on this patch via something like shader-db > [1]? > Mentioning the results in the commit message is encouraged. Hi Emil, If I understand correctly shader-db supports only i965 and radeonsi, or am I missing something? I would be glad to be proven wrong, because I only have old r600 gpu. > > On 2 June 2017 at 13:37, Vlad Golovkin <vlad.golovkin.m...@gmail.com> wrote: >> This will make it possible to remove unnecessary strlen and strncmp >> calls. >> --- >> src/compiler/glsl/glcpp/glcpp-parse.y | 10 ++ >> 1 file changed, 6 insertions(+), 4 deletions(-) >> >> diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y >> b/src/compiler/glsl/glcpp/glcpp-parse.y >> index fe211a0f0b..8cdd4c9d5b 100644 >> --- a/src/compiler/glsl/glcpp/glcpp-parse.y >> +++ b/src/compiler/glsl/glcpp/glcpp-parse.y >> @@ -450,10 +450,12 @@ control_line_error: >> >> integer_constant: >> INTEGER_STRING { >> - if (strlen ($1) >= 3 && strncmp ($1, "0x", 2) == 0) { >> - $$ = strtoll ($1 + 2, NULL, 16); >> - } else if ($1[0] == '0') { >> - $$ = strtoll ($1, NULL, 8); >> + if ($1[0] == '0') { >> + if ($1[1] == 'x' && $1[2] != '\0') { >> + $$ = strtoll ($1 + 2, NULL, 16); >> + } else { >> + $$ = strtoll ($1, NULL, 8); >> + } >> } else { >> $$ = strtoll ($1, NULL, 10); >> } > Food for thought: > > Any idea why we'd want to handle any of these details ourselves? > strtoll(..., /*base*/ 0) is smart enough, plus it should be better optimised. This is a good approach, it will allow to write just one simple function call instead of all these checks. For some reason i thought that calling strtoll(str, NULL, 0) will make it autodetect the base - from 2 to 36, when in reality it basically tries to match bases 8, 10 and 16. > > Another possibility is to swap the INTEGER_STRING token for > {DEC,OCT,HEX}... ones each one calling strtoll() with correct base. Advantage of this approach, however, is that in this case one can just use custom string to num functions that avoid the overhead of strtoll, but then again it is better to benchmark before drawing any conclusions. > > -Emil > > [1] https://cgit.freedesktop.org/mesa/shader-db/ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glcpp: simplify the check for hexadecimal number
This will make it possible to remove unnecessary strlen and strncmp calls. --- src/compiler/glsl/glcpp/glcpp-parse.y | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y b/src/compiler/glsl/glcpp/glcpp-parse.y index fe211a0f0b..8cdd4c9d5b 100644 --- a/src/compiler/glsl/glcpp/glcpp-parse.y +++ b/src/compiler/glsl/glcpp/glcpp-parse.y @@ -450,10 +450,12 @@ control_line_error: integer_constant: INTEGER_STRING { - if (strlen ($1) >= 3 && strncmp ($1, "0x", 2) == 0) { - $$ = strtoll ($1 + 2, NULL, 16); - } else if ($1[0] == '0') { - $$ = strtoll ($1, NULL, 8); + if ($1[0] == '0') { + if ($1[1] == 'x' && $1[2] != '\0') { + $$ = strtoll ($1 + 2, NULL, 16); + } else { + $$ = strtoll ($1, NULL, 8); + } } else { $$ = strtoll ($1, NULL, 10); } -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] glx: use memcmp instead of strncmp
2017-05-31 2:42 GMT+03:00 Ian Romanick <i...@freedesktop.org>: > Same comment here as on patch 3. > The size of glxextensions.o is decreased by 8 bytes in debug build (I am not sure why), so in that regard there is next to no benefit. But the main reason behind this change was not the size but the fact that string lengths were already being compared beforehand so why not avoid redundant checks in strncmp? > On 05/30/2017 03:45 PM, Vlad Golovkin wrote: >> --- >> src/glx/glxextensions.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/src/glx/glxextensions.c b/src/glx/glxextensions.c >> index 22b078ce48..c0640d0b73 100644 >> --- a/src/glx/glxextensions.c >> +++ b/src/glx/glxextensions.c >> @@ -366,7 +366,7 @@ set_glx_extension(const struct extension_info *ext, >> >> for (i = 0; ext[i].name != NULL; i++) { >>if ((name_len == ext[i].name_len) >> - && (strncmp(ext[i].name, name, name_len) == 0)) { >> + && (memcmp(ext[i].name, name, name_len) == 0)) { >> if (state) { >> SET_BIT(supported, ext[i].bit); >> } >> > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] glsl: trivial change of the file extension matching
2017-05-31 2:39 GMT+03:00 Ian Romanick <i...@freedesktop.org>: > On 05/30/2017 03:45 PM, Vlad Golovkin wrote: >> --- >> src/compiler/glsl/standalone.cpp | 17 ++--- >> 1 file changed, 10 insertions(+), 7 deletions(-) >> >> diff --git a/src/compiler/glsl/standalone.cpp >> b/src/compiler/glsl/standalone.cpp >> index 52554bb92a..b9cee23642 100644 >> --- a/src/compiler/glsl/standalone.cpp >> +++ b/src/compiler/glsl/standalone.cpp >> @@ -459,19 +459,22 @@ standalone_compile_shader(const struct >> standalone_options *_options, >>if (len < 6) >> goto fail; >> >> - const char *const ext = & files[i][len - 5]; >> + const char *ext = & files[i][len - 5]; >>/* TODO add support to read a .shader_test */ >> - if (strncmp(".vert", ext, 5) == 0 || strncmp(".glsl", ext, 5) == 0) >> + if (*ext != '.') >> + goto fail; >> + ++ext; >> + if (memcmp("vert", ext, 4) == 0 || memcmp("glsl", ext, 4) == 0) > > This is not the same. If ext points at a NUL character that is, say, at > the end of page, you don't get to look at the next 3 bytes, and memcmp > may well do that. There is no guarantee that memcmp will not access > beyond the first mismatched byte. Maybe I am missing something but I think that check above ensures that files[i] is at least 6 chars long, so there are always 4 non-NUL bytes to compare. > >>shader->Type = GL_VERTEX_SHADER; >> - else if (strncmp(".tesc", ext, 5) == 0) >> + else if (memcmp("tesc", ext, 4) == 0) >>shader->Type = GL_TESS_CONTROL_SHADER; >> - else if (strncmp(".tese", ext, 5) == 0) >> + else if (memcmp("tese", ext, 4) == 0) >>shader->Type = GL_TESS_EVALUATION_SHADER; >> - else if (strncmp(".geom", ext, 5) == 0) >> + else if (memcmp("geom", ext, 4) == 0) >>shader->Type = GL_GEOMETRY_SHADER; >> - else if (strncmp(".frag", ext, 5) == 0) >> + else if (memcmp("frag", ext, 4) == 0) >>shader->Type = GL_FRAGMENT_SHADER; >> - else if (strncmp(".comp", ext, 5) == 0) >> + else if (memcmp("comp", ext, 4) == 0) >> shader->Type = GL_COMPUTE_SHADER; >>else >> goto fail; >> > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/4] use memcmp instead of strncmp when string lengths are known
This series contains a couple of trivial changes, mostly places where the code looks like this: if (len1 == len2 && (strncmp(str1, str2, len1)) ... It is better to use memcmp here because there is no need to check for the terminating null byte Vlad Golovkin (4): glsl: simplify +INF check glsl: trivial change of the file extension matching util: use memcmp instead of strncmp glx: use memcmp instead of strncmp src/compiler/glsl/s_expression.cpp | 2 +- src/compiler/glsl/standalone.cpp | 17 ++--- src/glx/glxextensions.c| 2 +- src/util/debug.c | 2 +- 4 files changed, 13 insertions(+), 10 deletions(-) -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/4] glsl: trivial change of the file extension matching
--- src/compiler/glsl/standalone.cpp | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/src/compiler/glsl/standalone.cpp b/src/compiler/glsl/standalone.cpp index 52554bb92a..b9cee23642 100644 --- a/src/compiler/glsl/standalone.cpp +++ b/src/compiler/glsl/standalone.cpp @@ -459,19 +459,22 @@ standalone_compile_shader(const struct standalone_options *_options, if (len < 6) goto fail; - const char *const ext = & files[i][len - 5]; + const char *ext = & files[i][len - 5]; /* TODO add support to read a .shader_test */ - if (strncmp(".vert", ext, 5) == 0 || strncmp(".glsl", ext, 5) == 0) + if (*ext != '.') + goto fail; + ++ext; + if (memcmp("vert", ext, 4) == 0 || memcmp("glsl", ext, 4) == 0) shader->Type = GL_VERTEX_SHADER; - else if (strncmp(".tesc", ext, 5) == 0) + else if (memcmp("tesc", ext, 4) == 0) shader->Type = GL_TESS_CONTROL_SHADER; - else if (strncmp(".tese", ext, 5) == 0) + else if (memcmp("tese", ext, 4) == 0) shader->Type = GL_TESS_EVALUATION_SHADER; - else if (strncmp(".geom", ext, 5) == 0) + else if (memcmp("geom", ext, 4) == 0) shader->Type = GL_GEOMETRY_SHADER; - else if (strncmp(".frag", ext, 5) == 0) + else if (memcmp("frag", ext, 4) == 0) shader->Type = GL_FRAGMENT_SHADER; - else if (strncmp(".comp", ext, 5) == 0) + else if (memcmp("comp", ext, 4) == 0) shader->Type = GL_COMPUTE_SHADER; else goto fail; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/4] glsl: simplify +INF check
--- src/compiler/glsl/s_expression.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/glsl/s_expression.cpp b/src/compiler/glsl/s_expression.cpp index f82e155a6b..0e05c4bba7 100644 --- a/src/compiler/glsl/s_expression.cpp +++ b/src/compiler/glsl/s_expression.cpp @@ -69,7 +69,7 @@ read_atom(void *ctx, const char *, char *_buffer) // Check for the special symbol '+INF', which means +Infinity. Note: C99 // requires strtof to parse '+INF' as +Infinity, but we still support some // non-C99-compliant compilers (e.g. MSVC). - if (n == 4 && strncmp(src, "+INF", 4) == 0) { + if (n == 4 && memcmp(src, "+INF", 4) == 0) { expr = new(ctx) s_float(INFINITY); } else { // Check if the atom is a number. -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/4] glx: use memcmp instead of strncmp
--- src/glx/glxextensions.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/glx/glxextensions.c b/src/glx/glxextensions.c index 22b078ce48..c0640d0b73 100644 --- a/src/glx/glxextensions.c +++ b/src/glx/glxextensions.c @@ -366,7 +366,7 @@ set_glx_extension(const struct extension_info *ext, for (i = 0; ext[i].name != NULL; i++) { if ((name_len == ext[i].name_len) - && (strncmp(ext[i].name, name, name_len) == 0)) { + && (memcmp(ext[i].name, name, name_len) == 0)) { if (state) { SET_BIT(supported, ext[i].bit); } -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/4] util: use memcmp instead of strncmp
--- src/util/debug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/util/debug.c b/src/util/debug.c index 98b1853325..78041d414f 100644 --- a/src/util/debug.c +++ b/src/util/debug.c @@ -42,7 +42,7 @@ parse_debug_string(const char *debug, for (; n = strcspn(s, ", "), *s; s += MAX2(1, n)) { if (strlen(control->string) == n && - !strncmp(control->string, s, n)) + !memcmp(control->string, s, n)) flag |= control->flag; } } -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] util: make set's deleted_key_value declaration consistent with hash table one
This also silences following clang warnings: no previous extern declaration for non-static variable 'deleted_key' [-Werror,-Wmissing-variable-declarations] const void *deleted_key = _key_value; ^ no previous extern declaration for non-static variable 'deleted_key_value' [-Werror,-Wmissing-variable-declarations] uint32_t deleted_key_value; ^ --- src/util/set.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/util/set.c b/src/util/set.c index 99abefd063..3925066395 100644 --- a/src/util/set.c +++ b/src/util/set.c @@ -45,8 +45,8 @@ * free to avoid exponential performance degradation as the hash table fills */ -uint32_t deleted_key_value; -const void *deleted_key = _key_value; +static const uint32_t deleted_key_value; +static const void *deleted_key = _key_value; static const struct { uint32_t max_entries, size, rehash; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev