Re: [Mesa-dev] [PATCH] mesa/math: Allocate memory for GLmatrix elements and its inverse contiguously

2018-04-19 Thread Thomas Helland
2018-04-19 20:08 GMT+02:00 Vlad Golovkin <vlad.golovkin.m...@gmail.com>:
> -- Forwarded message --
> From: Vlad Golovkin <vlad.golovkin.m...@gmail.com>
> Date: 2018-04-19 21:06 GMT+03:00
> Subject: Re: [Mesa-dev] [PATCH] mesa/math: Allocate memory for
> GLmatrix elements and its inverse contiguously
> To: Thomas Helland <thomashellan...@gmail.com>
>
>
> 2018-04-17 8:55 GMT+03:00 Thomas Helland <thomashellan...@gmail.com>:
>> Hi, and thanks for the patch =)
>>
>> Have you done any performance testing on this to verify it
>> gives us a speedup of any kind? I'm asking because it seems like
>> this might be something that a decent compiler should be able to do.
>> Performance related patches, at least in core mesa, usually have
>> some justification with benchmark numbers in the commit message.
>
> Hi,
> I examined the resulting assembly for these 3 functions and it turns
> out that compiler wasn't merging these two blocks of memory into one
> (which compiler does that?).
> gcc tried to unroll memcpys to a series of movs which may seem to
> partially defeat the purpose of this patch, but after copying the
> block corresponding to m->m it had to switch destination and source
> registers to the next block resulting in 2 wasted movs.
> As a result we can save malloc and free call (in _math_matrix_ctr and
> _math_matrix_dtr) and 2 movs (when compiler tries to avoid memcpy -
> best case) or 1 memcpy call (in the worst case). It may seem that 2nd
> malloc can place m->inv in memory right after m->m but: 1) compiler
> can't rely on that behaviour 2) allocator will insert some private
> data before each block leading to more cache misses.
> I made some testing with Torcs and Yo Frankie blender game and
> according to perf in Yo Frankie _math_matrix_copy overhead reduced by
> 0.03% - 0.04% while Torcs didn't see any improvement.
>

Good analysis! While the gains are not huge, its probably worthwhile.
With some of the comments adressed this has my RB.
I'll pull it down this weekend, and add the comments if you don't
beat me to it, and then I'll push with my RB once we are past the
18.1 branching. Thanks for the patch =)

> Sorry for the duplicate emails.
>
>> Some style comments below
>>
>> 2018-04-17 1:03 GMT+02:00 Vlad Golovkin <vlad.golovkin.m...@gmail.com>:
>>> When GLmatrix elements and its inverse are stored contiguously in memory it 
>>> is possible to
>>> allocate, free and copy these fields with 1 function call instead of 2.
>>> ---
>>>  src/mesa/math/m_matrix.c | 15 +--
>>>  1 file changed, 9 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/src/mesa/math/m_matrix.c b/src/mesa/math/m_matrix.c
>>> index 57a49533de..4ab78a1fb3 100644
>>> --- a/src/mesa/math/m_matrix.c
>>> +++ b/src/mesa/math/m_matrix.c
>>> @@ -1438,8 +1438,7 @@ _math_matrix_is_dirty( const GLmatrix *m )
>>>  void
>>>  _math_matrix_copy( GLmatrix *to, const GLmatrix *from )
>>>  {
>>> -   memcpy(to->m, from->m, 16 * sizeof(GLfloat));
>>> -   memcpy(to->inv, from->inv, 16 * sizeof(GLfloat));
>>> +   memcpy(to->m, from->m, 16 * 2 * sizeof(GLfloat));
>>> to->flags = from->flags;
>>> to->type = from->type;
>>>  }
>>> @@ -1470,12 +1469,17 @@ _math_matrix_loadf( GLmatrix *mat, const GLfloat *m 
>>> )
>>>  void
>>>  _math_matrix_ctr( GLmatrix *m )
>>>  {
>>> -   m->m = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 );
>>> +   m->m = _mesa_align_malloc( 16 * 2 * sizeof(GLfloat), 16 );
>>> if (m->m)
>>> +   {
>>
>> Our style guides says to keep the curly bracket after an if on the same line.
>>
>>> +  m->inv = m->m + 16;
>>>memcpy( m->m, Identity, sizeof(Identity) );
>>> -   m->inv = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 );
>>> -   if (m->inv)
>>>memcpy( m->inv, Identity, sizeof(Identity) );
>>> +   }
>>> +   else
>>> +   {
>>
>> } else {
>>
>> Although I see that this file defaults to;
>>
>> {
>> else {
>>
>> for some reason. Feel free to follow existing style, or adjust to my 
>> comments.
>> Also, if we want to do this change it deserves a comment in the source.
>>> +  m->inv = NULL;
>>> +   }
>>> m->type = MATRIX_IDENTITY;
>>> m->flags = 0;
>>>  }
>>> @@ -1493,7 +1497,6 @@ _math_matrix_dtr( GLmatrix *m )
>>> _mesa_align_free( m->m );
>>> m->m = NULL;
>>>
>>> -   _mesa_align_free( m->inv );
>>> m->inv = NULL;
>>>  }
>>>
>>> --
>>> 2.14.1
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/math: Allocate memory for GLmatrix elements and its inverse contiguously

2018-04-17 Thread Ian Romanick
On 04/16/2018 10:55 PM, Thomas Helland wrote:
> Hi, and thanks for the patch =)
> 
> Have you done any performance testing on this to verify it
> gives us a speedup of any kind? I'm asking because it seems like
> this might be something that a decent compiler should be able to do.
> Performance related patches, at least in core mesa, usually have
> some justification with benchmark numbers in the commit message.
> Some style comments below
> 
> 2018-04-17 1:03 GMT+02:00 Vlad Golovkin :
>> When GLmatrix elements and its inverse are stored contiguously in memory it 
>> is possible to
>> allocate, free and copy these fields with 1 function call instead of 2.
>> ---
>>  src/mesa/math/m_matrix.c | 15 +--
>>  1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/mesa/math/m_matrix.c b/src/mesa/math/m_matrix.c
>> index 57a49533de..4ab78a1fb3 100644
>> --- a/src/mesa/math/m_matrix.c
>> +++ b/src/mesa/math/m_matrix.c
>> @@ -1438,8 +1438,7 @@ _math_matrix_is_dirty( const GLmatrix *m )
>>  void
>>  _math_matrix_copy( GLmatrix *to, const GLmatrix *from )
>>  {
>> -   memcpy(to->m, from->m, 16 * sizeof(GLfloat));
>> -   memcpy(to->inv, from->inv, 16 * sizeof(GLfloat));
>> +   memcpy(to->m, from->m, 16 * 2 * sizeof(GLfloat));
>> to->flags = from->flags;
>> to->type = from->type;
>>  }
>> @@ -1470,12 +1469,17 @@ _math_matrix_loadf( GLmatrix *mat, const GLfloat *m )
>>  void
>>  _math_matrix_ctr( GLmatrix *m )
>>  {
>> -   m->m = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 );
>> +   m->m = _mesa_align_malloc( 16 * 2 * sizeof(GLfloat), 16 );
>> if (m->m)
>> +   {
> 
> Our style guides says to keep the curly bracket after an if on the same line.
> 
>> +  m->inv = m->m + 16;
>>memcpy( m->m, Identity, sizeof(Identity) );
>> -   m->inv = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 );
>> -   if (m->inv)
>>memcpy( m->inv, Identity, sizeof(Identity) );
>> +   }
>> +   else
>> +   {
> 
> } else {
> 
> Although I see that this file defaults to;
> 
> {
> else {
> 
> for some reason. Feel free to follow existing style, or adjust to my comments.

I think that's because this file is really, really old, and it pre-dates
the current style.

> Also, if we want to do this change it deserves a comment in the source.
>> +  m->inv = NULL;
>> +   }
>> m->type = MATRIX_IDENTITY;
>> m->flags = 0;
>>  }
>> @@ -1493,7 +1497,6 @@ _math_matrix_dtr( GLmatrix *m )
>> _mesa_align_free( m->m );
>> m->m = NULL;
>>
>> -   _mesa_align_free( m->inv );
>> m->inv = NULL;
>>  }
>>
>> --
>> 2.14.1
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/math: Allocate memory for GLmatrix elements and its inverse contiguously

2018-04-16 Thread Thomas Helland
Hi, and thanks for the patch =)

Have you done any performance testing on this to verify it
gives us a speedup of any kind? I'm asking because it seems like
this might be something that a decent compiler should be able to do.
Performance related patches, at least in core mesa, usually have
some justification with benchmark numbers in the commit message.
Some style comments below

2018-04-17 1:03 GMT+02:00 Vlad Golovkin :
> When GLmatrix elements and its inverse are stored contiguously in memory it 
> is possible to
> allocate, free and copy these fields with 1 function call instead of 2.
> ---
>  src/mesa/math/m_matrix.c | 15 +--
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/math/m_matrix.c b/src/mesa/math/m_matrix.c
> index 57a49533de..4ab78a1fb3 100644
> --- a/src/mesa/math/m_matrix.c
> +++ b/src/mesa/math/m_matrix.c
> @@ -1438,8 +1438,7 @@ _math_matrix_is_dirty( const GLmatrix *m )
>  void
>  _math_matrix_copy( GLmatrix *to, const GLmatrix *from )
>  {
> -   memcpy(to->m, from->m, 16 * sizeof(GLfloat));
> -   memcpy(to->inv, from->inv, 16 * sizeof(GLfloat));
> +   memcpy(to->m, from->m, 16 * 2 * sizeof(GLfloat));
> to->flags = from->flags;
> to->type = from->type;
>  }
> @@ -1470,12 +1469,17 @@ _math_matrix_loadf( GLmatrix *mat, const GLfloat *m )
>  void
>  _math_matrix_ctr( GLmatrix *m )
>  {
> -   m->m = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 );
> +   m->m = _mesa_align_malloc( 16 * 2 * sizeof(GLfloat), 16 );
> if (m->m)
> +   {

Our style guides says to keep the curly bracket after an if on the same line.

> +  m->inv = m->m + 16;
>memcpy( m->m, Identity, sizeof(Identity) );
> -   m->inv = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 );
> -   if (m->inv)
>memcpy( m->inv, Identity, sizeof(Identity) );
> +   }
> +   else
> +   {

} else {

Although I see that this file defaults to;

{
else {

for some reason. Feel free to follow existing style, or adjust to my comments.
Also, if we want to do this change it deserves a comment in the source.
> +  m->inv = NULL;
> +   }
> m->type = MATRIX_IDENTITY;
> m->flags = 0;
>  }
> @@ -1493,7 +1497,6 @@ _math_matrix_dtr( GLmatrix *m )
> _mesa_align_free( m->m );
> m->m = NULL;
>
> -   _mesa_align_free( m->inv );
> m->inv = NULL;
>  }
>
> --
> 2.14.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev