Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-31 Thread Marek Olšák
On Fri, Oct 28, 2016 at 9:59 AM, Tapani Pälli  wrote:
> On 10/27/2016 09:16 PM, Marek Olšák wrote:
>>
>> On Fri, Oct 21, 2016 at 5:29 PM, Tapani Pälli 
>> wrote:
>>>
>>> On 10/21/2016 04:57 PM, Eero Tamminen wrote:

>>>> Hi,
>>>>
>>>> On 21.10.2016 14:07, Tapani Pälli wrote:
>>>>>
>>>>> I did run some valgrind comparisons with gfxbench4, your branch against
>>>>> Mesa master. I did not spot anything obvious related to ralloc.
>>>>>
>>>>> On i965 there's a huge load of invalid writes and reads in general
>>>>> but this happens on master as well so maybe these are not 'valid hits'.
>>>>
>>>>
>>>> I don't see those with master from 3 weeks ago.  Did you remember to
>>>> compile libdrm with Valgrind headers present, so that buffer allocs are
>>>> annotated for Valgrind?

>>> Yeah, I will need to double-check that. I'll try again on Monday and we
>>> can
>>> investigate the results together, I believe some were for Mesa core as
>>> well
>>> related to compressed texture handling.
>>
>> I guess the Sunday party was so huge that nobody was productive on Monday?
>> :)
>>
>> Seriously, it's Thursday, and there is no known issue with my latest
>> branch which contains all fixes from you guys. So there is no reason
>> to wait anymore.
>
>
> Sorry, I haven't had time to sit down with Eero on this ... but I did some
> runs again and the errors I saw are in Mesa master on HSW and SKL when
> running gfxbench4 'car chase' (which also has render artifacts, bug 96743).
> JP has seen some of the same and I believe the valgrind traces in another bug,
> 98455, look very similar to what I saw, so it's not about these changes but
> something we have going on anyway.
>
> As for this series, I do hope someone else on i965 also tests
> things; I just happen to have a special interest as it can help with making
> optimization passes, which is very much appreciated!

No problem. I just pushed the series. If somebody needs more testing,
the person can do it on master.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-28 Thread Tapani Pälli

On 10/27/2016 09:16 PM, Marek Olšák wrote:

On Fri, Oct 21, 2016 at 5:29 PM, Tapani Pälli  wrote:

On 10/21/2016 04:57 PM, Eero Tamminen wrote:

Hi,

On 21.10.2016 14:07, Tapani Pälli wrote:

I did run some valgrind comparisons with gfxbench4, your branch against
Mesa master. I did not spot anything obvious related to ralloc.

On i965 there's a huge load of invalid writes and reads in general
but this happens on master as well so maybe these are not 'valid hits'.


I don't see those with master from 3 weeks ago.  Did you remember to
compile libdrm with Valgrind headers present, so that buffer allocs are
annotated for Valgrind?


Yeah, I will need to double-check that. I'll try again on Monday and we can
investigate the results together, I believe some were for Mesa core as well
related to compressed texture handling.

I guess the Sunday party was so huge that nobody was productive on Monday? :)

Seriously, it's Thursday, and there is no known issue with my latest
branch which contains all fixes from you guys. So there is no reason
to wait anymore.


Sorry, I haven't had time to sit down with Eero on this ... but I did
some runs again and the errors I saw are in Mesa master on HSW and SKL
when running gfxbench4 'car chase' (which also has render artifacts, bug
96743). JP has seen some of the same and I believe the valgrind traces in
another bug, 98455, look very similar to what I saw, so it's not about
these changes but something we have going on anyway.


As for this series, I do hope someone else on i965 also tests
things; I just happen to have a special interest as it can help with
making optimization passes, which is very much appreciated!


Thanks;

// Tapani



Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-27 Thread Marek Olšák
On Fri, Oct 21, 2016 at 5:29 PM, Tapani Pälli  wrote:
> On 10/21/2016 04:57 PM, Eero Tamminen wrote:
>>
>> Hi,
>>
>> On 21.10.2016 14:07, Tapani Pälli wrote:
>>>
>>> I did run some valgrind comparisons with gfxbench4, your branch against
>>> Mesa master. I did not spot anything obvious related to ralloc.
>>>
>>> On i965 there's a huge load of invalid writes and reads in general
>>> but this happens on master as well so maybe these are not 'valid hits'.
>>
>>
>> I don't see those with master from 3 weeks ago.  Did you remember to
>> compile libdrm with Valgrind headers present, so that buffer allocs are
>> annotated for Valgrind?
>>
>
> Yeah, I will need to double-check that. I'll try again on Monday and we can
> investigate the results together, I believe some were for Mesa core as well
> related to compressed texture handling.

I guess the Sunday party was so huge that nobody was productive on Monday? :)

Seriously, it's Thursday, and there is no known issue with my latest
branch which contains all fixes from you guys. So there is no reason
to wait anymore.

Marek


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-21 Thread Tapani Pälli

On 10/21/2016 04:57 PM, Eero Tamminen wrote:

Hi,

On 21.10.2016 14:07, Tapani Pälli wrote:

I did run some valgrind comparisons with gfxbench4, your branch against
Mesa master. I did not spot anything obvious related to ralloc.

On i965 there's a huge load of invalid writes and reads in general
but this happens on master as well so maybe these are not 'valid hits'.


I don't see those with master from 3 weeks ago.  Did you remember to 
compile libdrm with Valgrind headers present, so that buffer allocs 
are annotated for Valgrind?




Yeah, I will need to double-check that. I'll try again on Monday and we 
can investigate the results together, I believe some were for Mesa core 
as well related to compressed texture handling.





There's also some leaks both with master and your branch but your branch
has less, so it seems to fix some things.



- Eero







Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-21 Thread Eero Tamminen

Hi,

On 21.10.2016 14:07, Tapani Pälli wrote:

I did run some valgrind comparisons with gfxbench4, your branch against
Mesa master. I did not spot anything obvious related to ralloc.

On i965 there's a huge load of invalid writes and reads in general
but this happens on master as well so maybe these are not 'valid hits'.


I don't see those with master from 3 weeks ago.  Did you remember to 
compile libdrm with Valgrind headers present, so that buffer allocs are 
annotated for Valgrind?




There's also some leaks both with master and your branch but your branch
has less, so it seems to fix some things.



- Eero




Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-21 Thread Marek Olšák
On Oct 21, 2016 1:07 PM, "Tapani Pälli"  wrote:
>
>
>
> On 10/21/2016 12:51 PM, Marek Olšák wrote:
>>
>> On Fri, Oct 21, 2016 at 6:58 AM, Tapani Pälli wrote:
>>>
>>>
>>>
>>> On 10/20/2016 09:24 PM, Marek Olšák wrote:


>>>> On Thu, Oct 20, 2016 at 6:31 PM, Tapani Pälli wrote:
>>>>>
>>>>> On 10/20/2016 06:55 PM, Marek Olšák wrote:
>>>>>>
>>>>>> On Mon, Oct 17, 2016 at 9:03 PM, Marek Olšák wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> The latest branch:
>>>>>>> https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework2
>>>>>>>
>>>>>>> It contains:
>>>>>>> - all review comments resolved
>>>>>>> - commits from Tapani's jenkins branch (fixes for glsl, nir, i965)
>>>>>>>
>>>>>>> My updated patches are also on the list if you wanna review them.
>>>>>>
>>>>>> Since there are no other comments, it looks like it's ready to land.
>>>>>> i965 and Gallium were tested by Tapani and me, respectively. The
>>>>>> branch contains all fixes first, followed by the allocation changes.
>>>>>
>>>>> I haven't had time for further testing but it would make sense to try also
>>>>> with some benchmarks and steam games on i965. JP recalled there was
>>>>> something that got triggered by Manhattan earlier with his patches. I'll try
>>>>> to get this testing done tomorrow.
>>>>
>>>> I suggest you put shaders from Manhattan into your shader-db, so that
>>>> you don't have to run the game under valgrind. Also in future, it's
>>>> better to use shader-db instead of running games for stuff like this.

>>>
>>> Makes sense as ralloc is mostly used in compiler but I think some of those
>>> rallocated structures get also accessed when app calls GL API. I'll run at
>>> least gfxbench manually just as a paranoid check to see if there's anything.
>>>
>>> About shader-db .. should we consider making a scripts folder there and
>>> start sharing some? It's used for instruction count analysis, memory
>>> analysis .. would be nice to have some sort of 'standard' skeleton scripts
>>> for this which can be then modified for purposes.
>>
>>
>> Not sure what you mean. Our script for shader-db reporting is si-report.py.
>>
>
> I mean scripts for comparing runs. For example for compile time
> performance one might use Eric Anholt's compare-perf but that requires some
> additional scripts that print correct output from run. Or maybe for leaks
> it would be interesting to have a comparison summary script of memory leaks
> between runs? Current scripts seem to only report instruction count
> difference between 2 runs?

Well, it only reports instruction count difference for Intel. Our stats are
completely different.  I use the "time" command to measure compiler
performance. It seems to do the job well.

>
> I did run some valgrind comparisons with gfxbench4, your branch against
> Mesa master. I did not spot anything obvious related to ralloc.
>
> On i965 there's a huge load of invalid writes and reads in general but
> this happens on master as well so maybe these are not 'valid hits'. There's
> also some leaks both with master and your branch but your branch has less,
> so it seems to fix some things.

It can't fix anything. Both allocators are hierarchical, so most code
doesn't care about freeing memory.

I usually don't see any invalid reads and writes with radeonsi.

Marek

>
> // Tapani


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-21 Thread Tapani Pälli



On 10/21/2016 12:51 PM, Marek Olšák wrote:

On Fri, Oct 21, 2016 at 6:58 AM, Tapani Pälli  wrote:



On 10/20/2016 09:24 PM, Marek Olšák wrote:


On Thu, Oct 20, 2016 at 6:31 PM, Tapani Pälli 
wrote:


On 10/20/2016 06:55 PM, Marek Olšák wrote:



On Mon, Oct 17, 2016 at 9:03 PM, Marek Olšák  wrote:



Hi,

The latest branch:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework2

It contains:
- all review comments resolved
- commits from Tapani's jenkins branch (fixes for glsl, nir, i965)

My updated patches are also on the list if you wanna review them.



Since there are no other comments, it looks like it's ready to land.
i965 and Gallium were tested by Tapani and me, respectively. The
branch contains all fixes first, followed by the allocation changes.




I haven't had time for further testing but it would make sense to try also
with some benchmarks and steam games on i965. JP recalled there was
something that got triggered by Manhattan earlier with his patches. I'll try
to get this testing done tomorrow.



I suggest you put shaders from Manhattan into your shader-db, so that
you don't have to run the game under valgrind. Also in future, it's
better to use shader-db instead of running games for stuff like this.



Makes sense as ralloc is mostly used in compiler but I think some of those
rallocated structures get also accessed when app calls GL API. I'll run at
least gfxbench manually just as a paranoid check to see if there's anything.

About shader-db .. should we consider making a scripts folder there and
start sharing some? It's used for instruction count analysis, memory
analysis .. would be nice to have some sort of 'standard' skeleton scripts
for this which can be then modified for purposes.


Not sure what you mean. Our script for shader-db reporting is si-report.py.



I mean scripts for comparing runs. For example for compile time 
performance one might use Eric Anholt's compare-perf but that requires 
some additional scripts that print correct output from run. Or maybe for 
leaks it would be interesting to have a comparison summary script of 
memory leaks between runs? Current scripts seem to only report 
instruction count difference between 2 runs?


I did run some valgrind comparisons with gfxbench4, your branch against 
Mesa master. I did not spot anything obvious related to ralloc.


On i965 there's a huge load of invalid writes and reads in general
but this happens on master as well so maybe these are not 'valid hits'. 
There's also some leaks both with master and your branch but your branch 
has less, so it seems to fix some things.


// Tapani


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-21 Thread Marek Olšák
On Fri, Oct 21, 2016 at 6:58 AM, Tapani Pälli  wrote:
>
>
> On 10/20/2016 09:24 PM, Marek Olšák wrote:
>>
>> On Thu, Oct 20, 2016 at 6:31 PM, Tapani Pälli 
>> wrote:
>>>
>>> On 10/20/2016 06:55 PM, Marek Olšák wrote:


>>>> On Mon, Oct 17, 2016 at 9:03 PM, Marek Olšák wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> The latest branch:
>>>>> https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework2
>>>>>
>>>>> It contains:
>>>>> - all review comments resolved
>>>>> - commits from Tapani's jenkins branch (fixes for glsl, nir, i965)
>>>>>
>>>>> My updated patches are also on the list if you wanna review them.
>>>>
>>>>
>>>> Since there are no other comments, it looks like it's ready to land.
>>>> i965 and Gallium were tested by Tapani and me, respectively. The
>>>> branch contains all fixes first, followed by the allocation changes.
>>>
>>>
>>>
>>> I haven't had time for further testing but it would make sense to try
>>> also
>>> with some benchmarks and steam games on i965. JP recalled there was
>>> something that got triggered by Manhattan earlier with his patches. I'll
>>> try
>>> to get this testing done tomorrow.
>>
>>
>> I suggest you put shaders from Manhattan into your shader-db, so that
>> you don't have to run the game under valgrind. Also in future, it's
>> better to use shader-db instead of running games for stuff like this.
>>
>
> Makes sense as ralloc is mostly used in compiler but I think some of those
> rallocated structures get also accessed when app calls GL API. I'll run at
> least gfxbench manually just as a paranoid check to see if there's anything.
>
> About shader-db .. should we consider making a scripts folder there and
> start sharing some? It's used for instruction count analysis, memory
> analysis .. would be nice to have some sort of 'standard' skeleton scripts
> for this which can be then modified for purposes.

Not sure what you mean. Our script for shader-db reporting is si-report.py.

Marek


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-20 Thread Tapani Pälli



On 10/20/2016 09:24 PM, Marek Olšák wrote:

On Thu, Oct 20, 2016 at 6:31 PM, Tapani Pälli  wrote:

On 10/20/2016 06:55 PM, Marek Olšák wrote:


On Mon, Oct 17, 2016 at 9:03 PM, Marek Olšák  wrote:


Hi,

The latest branch:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework2

It contains:
- all review comments resolved
- commits from Tapani's jenkins branch (fixes for glsl, nir, i965)

My updated patches are also on the list if you wanna review them.


Since there are no other comments, it looks like it's ready to land.
i965 and Gallium were tested by Tapani and me, respectively. The
branch contains all fixes first, followed by the allocation changes.



I haven't had time for further testing but it would make sense to try also
with some benchmarks and steam games on i965. JP recalled there was
something that got triggered by Manhattan earlier with his patches. I'll try
to get this testing done tomorrow.


I suggest you put shaders from Manhattan into your shader-db, so that
you don't have to run the game under valgrind. Also in future, it's
better to use shader-db instead of running games for stuff like this.



Makes sense as ralloc is mostly used in compiler but I think some of 
those rallocated structures get also accessed when app calls GL API. 
I'll run at least gfxbench manually just as a paranoid check to see if 
there's anything.


About shader-db .. should we consider making a scripts folder there and 
start sharing some? It's used for instruction count analysis, memory 
analysis .. would be nice to have some sort of 'standard' skeleton 
scripts for this which can be then modified for purposes.


// Tapani


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-20 Thread Marek Olšák
On Thu, Oct 20, 2016 at 6:31 PM, Tapani Pälli  wrote:
> On 10/20/2016 06:55 PM, Marek Olšák wrote:
>>
>> On Mon, Oct 17, 2016 at 9:03 PM, Marek Olšák  wrote:
>>>
>>> Hi,
>>>
>>> The latest branch:
>>> https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework2
>>>
>>> It contains:
>>> - all review comments resolved
>>> - commits from Tapani's jenkins branch (fixes for glsl, nir, i965)
>>>
>>> My updated patches are also on the list if you wanna review them.
>>
>> Since there are no other comments, it looks like it's ready to land.
>> i965 and Gallium were tested by Tapani and me, respectively. The
>> branch contains all fixes first, followed by the allocation changes.
>
>
> I haven't had time for further testing but it would make sense to try also
> with some benchmarks and steam games on i965. JP recalled there was
> something that got triggered by Manhattan earlier with his patches. I'll try
> to get this testing done tomorrow.

I suggest you put shaders from Manhattan into your shader-db, so that
you don't have to run the game under valgrind. Also in future, it's
better to use shader-db instead of running games for stuff like this.

Marek


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-20 Thread Tapani Pälli

On 10/20/2016 06:55 PM, Marek Olšák wrote:

On Mon, Oct 17, 2016 at 9:03 PM, Marek Olšák  wrote:

Hi,

The latest branch:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework2

It contains:
- all review comments resolved
- commits from Tapani's jenkins branch (fixes for glsl, nir, i965)

My updated patches are also on the list if you wanna review them.

Since there are no other comments, it looks like it's ready to land.
i965 and Gallium were tested by Tapani and me, respectively. The
branch contains all fixes first, followed by the allocation changes.


I haven't had time for further testing but it would make sense to try 
also with some benchmarks and steam games on i965. JP recalled there was 
something that got triggered by Manhattan earlier with his patches. I'll 
try to get this testing done tomorrow.



Please let me know if you need more time for reviewing, or notify
people who may want to review this but missed it. Otherwise, I'll push
this on Saturday (Central Europe time, so US folks should answer
before Saturday).

Marek



// Tapani



Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-20 Thread Marek Olšák
On Mon, Oct 17, 2016 at 9:03 PM, Marek Olšák  wrote:
> Hi,
>
> The latest branch:
> https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework2
>
> It contains:
> - all review comments resolved
> - commits from Tapani's jenkins branch (fixes for glsl, nir, i965)
>
> My updated patches are also on the list if you wanna review them.

Since there are no other comments, it looks like it's ready to land.
i965 and Gallium were tested by Tapani and me, respectively. The
branch contains all fixes first, followed by the allocation changes.

Please let me know if you need more time for reviewing, or notify
people who may want to review this but missed it. Otherwise, I'll push
this on Saturday (Central Europe time, so US folks should answer
before Saturday).

Marek


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-17 Thread Marek Olšák
Hi,

The latest branch:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework2

It contains:
- all review comments resolved
- commits from Tapani's jenkins branch (fixes for glsl, nir, i965)

My updated patches are also on the list if you wanna review them.

Marek

On Thu, Oct 13, 2016 at 4:43 PM, Tapani Pälli  wrote:
> On 10/13/2016 04:20 PM, Juha-Pekka Heikkila wrote:
>>
>> I forgot to reply here on the list, I've just been talking about this with
>> Tapani face to face.
>>
>> My series rebased and fixed on top of mesa master branch from yesterday is
>> here
>> https://github.com/juhapekka/juha_mesaexperimentals/tree/jenkins
>>
>> Tapani was already taking rebased patches from above branch.
>>
>> I originally stopped working on this set because I felt there was too much
>> uncertainty about whether all the places that needed fixing could be found
>> easily. Anyway, if you skip my patch for changes in glsl please check you
>> have all the places I had patched handled somehow. All those patched places
>> I dug up with Valgrind so they're the 'real deal' where you will get segfaults.
>>
>
> I have now all CI regressions (there were 26 in total) passing with this
> set:
>
> https://cgit.freedesktop.org/~tpalli/mesa/log/?h=jenkins
>
> but I'm still planning to do some validation with apps too, as you mentioned
> today, as an example, that Manhattan used to trigger some issues.
>
>


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-13 Thread Tapani Pälli

On 10/13/2016 04:20 PM, Juha-Pekka Heikkila wrote:
I forgot to reply here on the list, I've just been talking about this 
with Tapani face to face.


My series rebased and fixed on top of mesa master branch from 
yesterday is here

https://github.com/juhapekka/juha_mesaexperimentals/tree/jenkins

Tapani was already taking rebased patches from above branch.

I originally stopped working on this set because I felt there was too
much uncertainty about whether all the places that needed fixing could be
found easily. Anyway, if you skip my patch for changes in glsl please check
you have all the places I had patched handled somehow. All those
patched places I dug up with Valgrind so they're the 'real deal' where
you will get segfaults.




I have now all CI regressions (there were 26 in total) passing with this 
set:


https://cgit.freedesktop.org/~tpalli/mesa/log/?h=jenkins

but I'm still planning to do some validation with apps too, as you
mentioned today, as an example, that Manhattan used to trigger some issues.



/Juha-Pekka

On 10.10.2016 14:52, Marek Olšák wrote:

I prefer some of my GLSL fixes in 1-4 over JP's changes, because they
seem cleaner to me.

Marek


On Oct 10, 2016 1:38 PM, "Tapani Pälli" wrote:



On 10/10/2016 02:27 PM, Marek Olšák wrote:

On Mon, Oct 10, 2016 at 1:25 PM, Tapani Pälli wrote:



On 10/10/2016 01:38 PM, Marek Olšák wrote:


On Mon, Oct 10, 2016 at 12:33 PM, Marek Olšák wrote:


On Mon, Oct 10, 2016 at 7:58 AM, Tapani Pälli wrote:




On 10/08/2016 06:58 PM, Jason Ekstrand wrote:



FYI, we use ralloc for a lot more than just the glsl compiler so the
first few changes make me a bit nervous. There was someone working on
making our driver more undefined-memory-friendly but I don't know what
happened to those patches.




There's bunch of patches like that in this 
series:

https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html


it looks like it just never landed as would have
required more testing
on
misc drivers?



We can land at least some of the patches from that
series. We still
have to replace all non-GLSL uses of
DECLARE_RALLOC.. with
DECLARE_RZALLOC.



BTW, people can still give Rbs on all patches except 5.
This rzalloc
thing isn't an issue and can be dealt with in a separate
series (it
can be done after this series lands).



I agree these issues do not block review of the series. We
just need to make
sure it is absolutely safe before landing.

As concrete example I got following segfault when I applied
this series
which is directly related to rzalloc issues. This was with
'shader_freeze'
program, description in bug #94477 has link and build
instructions for this
if you want to try. When I applied JP's patches 4,5,6 (nir,
i965_vec4,
i965_fs changes) this segfault disappears.


I meant that this series is safe to land without patch 5. Did
you test
it without patch 5?


Ah sorry I managed to miss that. Now I did test and when reverting
patch 5 this test passes fine. Makes sense to do patch 5 as a
separate step when JP's changes land.

// Tapani











Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-13 Thread Juha-Pekka Heikkila
I forgot to reply here on the list, I've just been talking about this 
with Tapani face to face.


My series rebased and fixed on top of mesa master branch from yesterday 
is here

https://github.com/juhapekka/juha_mesaexperimentals/tree/jenkins

Tapani was already taking rebased patches from above branch.

I originally stopped working on this set because I felt there was too
much uncertainty about whether all the places that needed fixing could be
found easily. Anyway, if you skip my patch for changes in glsl please check
you have all the places I had patched handled somehow. All those patched
places I dug up with Valgrind so they're the 'real deal' where you will get
segfaults.


/Juha-Pekka

On 10.10.2016 14:52, Marek Olšák wrote:

I prefer some of my GLSL fixes in 1-4 over JP's changes, because they
seem cleaner to me.

Marek


On Oct 10, 2016 1:38 PM, "Tapani Pälli" wrote:



On 10/10/2016 02:27 PM, Marek Olšák wrote:

On Mon, Oct 10, 2016 at 1:25 PM, Tapani Pälli wrote:



On 10/10/2016 01:38 PM, Marek Olšák wrote:


On Mon, Oct 10, 2016 at 12:33 PM, Marek Olšák wrote:


On Mon, Oct 10, 2016 at 7:58 AM, Tapani Pälli wrote:




On 10/08/2016 06:58 PM, Jason Ekstrand wrote:



FYI, we use ralloc for a lot more than just the glsl compiler so the
first few changes make me a bit nervous. There was someone working on
making our driver more undefined-memory-friendly but I don't know what
happened to those patches.




There's bunch of patches like that in this series:

https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html



it looks like it just never landed as would have
required more testing
on
misc drivers?



We can land at least some of the patches from that
series. We still
have to replace all non-GLSL uses of
DECLARE_RALLOC.. with
DECLARE_RZALLOC.



BTW, people can still give Rbs on all patches except 5.
This rzalloc
thing isn't an issue and can be dealt with in a separate
series (it
can be done after this series lands).



I agree these issues do not block review of the series. We
just need to make
sure it is absolutely safe before landing.

As concrete example I got following segfault when I applied
this series
which is directly related to rzalloc issues. This was with
'shader_freeze'
program, description in bug #94477 has link and build
instructions for this
if you want to try. When I applied JP's patches 4,5,6 (nir,
i965_vec4,
i965_fs changes) this segfault disappears.


I meant that this series is safe to land without patch 5. Did
you test
it without patch 5?


Ah sorry I managed to miss that. Now I did test and when reverting
patch 5 this test passes fine. Makes sense to do patch 5 as a
separate step when JP's changes land.

// Tapani








Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-12 Thread Nicolai Hähnle

Nice work! I sent some comments on patches 6 & 7. Patches 1-4 and 7-15 are

Reviewed-by: Nicolai Hähnle 

assuming you checked them with a Piglit run in addition to the shader-db.

Something that LLVM does for its intermediate representations is using 
Recycler objects. Instructions are allocated from a linear allocator, 
but when they are removed they are neither returned to the heap nor 
simply forgotten. Instead, the memory block is added to a linked list 
managed by the Recycler object, so that the next instruction allocation 
can be served from there.


I suspect that this could also help here because it's still very fast 
but keeps the cache footprint smaller.
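
To make the Recycler idea concrete, here is a minimal sketch in C. It is not LLVM's Recycler or anything from Mesa: struct instr, union slot and the recycler_* functions are hypothetical names, and malloc stands in for the linear allocator that would normally supply fresh slots. A released instruction is pushed onto a free list, and the next allocation pops it instead of asking for new memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size IR instruction. */
struct instr {
   int opcode;
   int operands[3];
};

/* A slot holds either a live instruction or, once released,
 * the link to the next free slot. */
union slot {
   struct instr instr;
   union slot *next_free;
};

struct recycler {
   union slot *free_list;   /* released slots waiting for reuse */
   unsigned fresh_allocs;   /* how often new memory was needed */
};

struct instr *recycler_alloc(struct recycler *r)
{
   union slot *s = r->free_list;

   if (s) {
      /* Serve the request from a previously released slot. */
      r->free_list = s->next_free;
   } else {
      /* Assumption: malloc stands in for the linear allocator. */
      s = malloc(sizeof(*s));
      if (!s)
         return NULL;
      r->fresh_allocs++;
   }
   return &s->instr;
}

void recycler_release(struct recycler *r, struct instr *i)
{
   /* The instruction is the first union member, so the pointer
    * converts back to the enclosing slot. */
   union slot *s = (union slot *)i;

   s->next_free = r->free_list;
   r->free_list = s;
}

/* Returns 1 if a released instruction is reused by the next alloc. */
int recycler_demo(void)
{
   struct recycler r = { NULL, 0 };
   struct instr *a = recycler_alloc(&r);
   if (!a)
      return 0;

   recycler_release(&r, a);
   struct instr *b = recycler_alloc(&r);

   int reused = (a == b) && r.fresh_allocs == 1;
   free(b);
   return reused;
}
```

Because reused slots were touched recently, they are more likely to still be in cache than fresh memory at the end of a growing buffer, which is the footprint argument above.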


Cheers,
Nicolai

On 08.10.2016 12:58, Marek Olšák wrote:

Hi,

This patch series reduces the number of malloc calls in the GLSL
compiler by 63%. That leads to better compile times and less heap
thrashing.

It's done by switching memory allocations in the GLSL compiler to my
new linear allocator that allocates out of a fixed-sized buffer with
a monotonically increasing offset. If more buffers are needed, it
chains them.
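
The scheme described above can be sketched roughly as follows. This is not the code from the series: the linear_* names, the 2048-byte buffer size and the 8-byte rounding are assumptions made for illustration only. Each allocation bumps an offset inside the current buffer, and a new buffer is chained when the current one is full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define LINEAR_BUF_SIZE 2048   /* assumed fixed buffer size */
#define LINEAR_ALIGN    8      /* assumed rounding of each allocation */

struct linear_buf {
   struct linear_buf *next;    /* previously filled buffer in the chain */
   size_t offset;              /* monotonically increasing bump offset */
   size_t capacity;
   unsigned char data[];       /* payload; 8-byte alignment assumes the
                                * header size is a multiple of 8, as on
                                * typical 64-bit targets */
};

struct linear_ctx {
   struct linear_buf *head;    /* buffer currently being filled */
   unsigned num_mallocs;       /* real malloc calls, for illustration */
};

static struct linear_buf *linear_buf_new(struct linear_ctx *ctx, size_t min_size)
{
   size_t cap = min_size > LINEAR_BUF_SIZE ? min_size : LINEAR_BUF_SIZE;
   struct linear_buf *buf = malloc(sizeof(*buf) + cap);

   if (!buf)
      return NULL;
   buf->next = ctx->head;
   buf->offset = 0;
   buf->capacity = cap;
   ctx->head = buf;
   ctx->num_mallocs++;
   return buf;
}

void *linear_alloc(struct linear_ctx *ctx, size_t size)
{
   struct linear_buf *buf = ctx->head;

   /* Round the request up to a multiple of LINEAR_ALIGN. */
   size = (size + LINEAR_ALIGN - 1) & ~(size_t)(LINEAR_ALIGN - 1);

   /* Chain a new buffer when the current one cannot hold the request. */
   if (!buf || buf->offset + size > buf->capacity) {
      buf = linear_buf_new(ctx, size);
      if (!buf)
         return NULL;
   }

   void *ptr = buf->data + buf->offset;
   buf->offset += size;
   return ptr;
}

/* Individual allocations are never freed; the whole chain goes at once. */
void linear_free_all(struct linear_ctx *ctx)
{
   struct linear_buf *buf = ctx->head;

   while (buf) {
      struct linear_buf *next = buf->next;
      free(buf);
      buf = next;
   }
   ctx->head = NULL;
}

/* 100 small allocations of 16 bytes fit in one buffer: one malloc call. */
unsigned linear_demo(void)
{
   struct linear_ctx ctx = { NULL, 0 };

   for (int i = 0; i < 100; i++) {
      unsigned char *p = linear_alloc(&ctx, 16);
      assert(p);
      p[0] = (unsigned char)i;
   }

   unsigned n = ctx.num_mallocs;
   linear_free_all(&ctx);
   return n;
}
```

In this sketch, 100 allocations cost a single malloc call, which is where the reduction in malloc traffic comes from; the trade-off is that nothing can be freed individually.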

The new allocator is used in all places where short-lived allocations
are used with a high number of malloc calls. The series also contains
other improvements not related to the new allocator that also improve
compile times. The results are below.

I tested my shader-db with shaders only being compiled to TGSI.
(noop gallium driver)


master + libc's malloc:

 real    0m54.182s
 user    3m33.640s
 sys     0m0.620s
 maxmem  275 MB


master + jemalloc preloaded:

 real    0m45.044s
 user    2m56.356s
 sys     0m1.652s
 maxmem  284 MB


the series + libc's malloc:

 real    0m46.221s
 user    3m2.080s
 sys     0m0.544s
 maxmem  270 MB


the series + jemalloc preloaded:

 real    0m40.729s
 user    2m39.564s
 sys     0m1.232s
 maxmem  284 MB


The series without jemalloc almost caught up with jemalloc + master.
However, jemalloc also benefits.

Current Mesa needs 54.182s and it drops to 40.729s with my series and
jemalloc. The total change in compile time is -25% if we incorporate
both. Without jemalloc, the difference is only -14.7%.
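
The percentages follow directly from the "real" times quoted above. A small helper (percent_change is a hypothetical name used only here) shows the arithmetic:

```c
#include <assert.h>

/* Relative change from 'before' to 'after', in percent.
 * Negative values mean the run got faster. */
double percent_change(double before, double after)
{
   return (after - before) / before * 100.0;
}

/* master + libc malloc (54.182s) vs. the series + jemalloc (40.729s) */
double change_with_jemalloc(void)
{
   return percent_change(54.182, 40.729);
}

/* master + libc malloc (54.182s) vs. the series + libc malloc (46.221s) */
double change_without_jemalloc(void)
{
   return percent_change(54.182, 46.221);
}
```

The first call gives about -24.8% (rounded to -25% in the text) and the second about -14.7%.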

With radeonsi, the improvement is slightly more than half of that
(if you add the LLVM time). However, radeonsi also has asynchronous
shader compilation hiding LLVM overhead in some cases, so it depends.

Drivers with faster compiler backends will benefit more than radeonsi,
but will probably not reach -25% or -14.7% (except softpipe, which uses
TGSI as-is).

The memory usage looks reasonable in all tested cases.

Note: One of the first patches moves memset from ralloc to rzalloc.
I tested and fixed the GLSL source -> TGSI path, but other codepaths
may break, and you need to use valgrind to find all uninitialized
variables that relied on ralloc doing memset (if there are any).
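
To illustrate what can break, here is a sketch with stand-in allocators. sketch_alloc and sketch_zalloc are hypothetical wrappers around malloc, not Mesa's ralloc/rzalloc, and struct node is made up. Once the non-zeroing allocator stops doing memset, every field has to be set explicitly, or the zeroing variant has to be used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for an allocator that no longer clears memory:
 * the contents of the returned block are indeterminate. */
static void *sketch_alloc(size_t size)
{
    return malloc(size);
}

/* Stand-in for the zeroing variant: the memset lives here. */
static void *sketch_zalloc(size_t size)
{
    void *p = malloc(size);
    if (p)
        memset(p, 0, size);
    return p;
}

struct node {
    struct node *next;
    int use_count;
};

/* Safe with the non-zeroing allocator because every field is initialized.
 * Code that skipped these assignments and relied on an implicit memset
 * would now read garbage, which is what valgrind would report. */
static struct node *node_create(void)
{
    struct node *n = sketch_alloc(sizeof(*n));
    if (!n)
        return NULL;
    n->next = NULL;
    n->use_count = 0;
    return n;
}
```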

You can also find it here:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework

Please review.

 src/compiler/glsl/ast.h                             |   4 +-
 src/compiler/glsl/ast_to_hir.cpp                    |   4 +-
 src/compiler/glsl/ast_type.cpp                      |  13 ++-
 src/compiler/glsl/glcpp/glcpp-lex.l                 |   2 +-
 src/compiler/glsl/glcpp/glcpp-parse.y               | 203 +-
 src/compiler/glsl/glcpp/glcpp.h                     |   1 +
 src/compiler/glsl/glsl_lexer.ll                     |  16 +--
 src/compiler/glsl/glsl_parser.yy                    | 202 +++---
 src/compiler/glsl/glsl_parser_extras.cpp            |   6 +-
 src/compiler/glsl/glsl_parser_extras.h              |   4 +-
 src/compiler/glsl/glsl_symbol_table.cpp             |  19 ++--
 src/compiler/glsl/glsl_symbol_table.h               |   1 +
 src/compiler/glsl/ir.cpp                            |   4 +
 src/compiler/glsl/ir.h                              |  13 ++-
 src/compiler/glsl/link_uniform_blocks.cpp           |   2 +-
 src/compiler/glsl/list.h                            |   2 +-
 src/compiler/glsl/lower_packed_varyings.cpp         |   8 +-
 src/compiler/glsl/opt_constant_propagation.cpp      |  14 ++-
 src/compiler/glsl/opt_copy_propagation.cpp          |   7 +-
 src/compiler/glsl/opt_copy_propagation_elements.cpp |  19 ++--
 src/compiler/glsl/opt_dead_code_local.cpp           |  12 ++-
 src/compiler/glsl_types.cpp                         |  38 +--
 src/compiler/glsl_types.h                           |   6 +-
 src/compiler/nir/nir.c                              |   8 +-
 src/compiler/spirv/vtn_variables.c                  |   3 +-
 src/gallium/drivers/freedreno/ir3/ir3.c             |   2 +-
 src/gallium/drivers/vc4/vc4_cl.c                    |   2 +-
 src/gallium/drivers/vc4/vc4_program.c               |   2 +-
 src/gallium/drivers/vc4/vc4_simulator.c             |   5 +-
 src/mesa/drivers/dri/i965/brw_state_batch.c         |   5 +-
 src/util/ralloc.c                                   | 392

Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-11 Thread Marek Olšák
On Tue, Oct 11, 2016 at 6:54 AM, Tapani Pälli  wrote:
>
>
> On 10/10/2016 02:52 PM, Marek Olšák wrote:
>>
>> I prefer some of my GLSL fixes in 1-4 over JP's changes, because they
>> seem cleaner to me.
>
>
> Agreed, I was considering following patches from JP:
>
> https://patchwork.freedesktop.org/patch/93266/
> https://patchwork.freedesktop.org/patch/93262/
> https://patchwork.freedesktop.org/patch/93267/
>
> these could be pushed separately and do not cause any functional change.

Yeah, absolutely.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-10 Thread Tapani Pälli



On 10/10/2016 02:52 PM, Marek Olšák wrote:

I prefer some of my GLSL fixes in 1-4 over JP's changes, because they
seem cleaner to me.


Agreed, I was considering the following patches from JP:

https://patchwork.freedesktop.org/patch/93266/
https://patchwork.freedesktop.org/patch/93262/
https://patchwork.freedesktop.org/patch/93267/

these could be pushed separately and do not cause any functional change.



Marek


On Oct 10, 2016 1:38 PM, "Tapani Pälli" wrote:



On 10/10/2016 02:27 PM, Marek Olšák wrote:

On Mon, Oct 10, 2016 at 1:25 PM, Tapani Pälli wrote:



On 10/10/2016 01:38 PM, Marek Olšák wrote:


On Mon, Oct 10, 2016 at 12:33 PM, Marek Olšák wrote:


On Mon, Oct 10, 2016 at 7:58 AM, Tapani Pälli wrote:




On 10/08/2016 06:58 PM, Jason Ekstrand wrote:



FYI, we use ralloc for a lot more than just the glsl compiler so the
first few changes make me a bit nervous. There was someone working on
making our driver more undefined-memory-friendly but I don't know what
happened to those patches.


There's bunch of patches like that in this series:
https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html

it looks like it just never landed as would have required more testing
on misc drivers?


We can land at least some of the patches from that series. We still
have to replace all non-GLSL uses of DECLARE_RALLOC.. with
DECLARE_RZALLOC.


BTW, people can still give Rbs on all patches except 5. This rzalloc
thing isn't an issue and can be dealt with in a separate series (it
can be done after this series lands).


I agree these issues do not block review of the series. We just need to
make sure it is absolutely safe before landing.

As concrete example I got following segfault when I applied this series
which is directly related to rzalloc issues. This was with 'shader_freeze'
program, description in bug #94477 has link and build instructions for
this if you want to try. When I applied JP's patches 4,5,6 (nir,
i965_vec4, i965_fs changes) this segfault disappears.


I meant that this series is safe to land without patch 5. Did you test
it without patch 5?


Ah sorry I managed to miss that. Now I did test and when reverting
patch 5 this test passes fine. Makes sense to do patch 5 as a
separate step when JP's changes land.

// Tapani




Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-10 Thread Marek Olšák
I prefer some of my GLSL fixes in 1-4 over JP's changes, because they seem
cleaner to me.

Marek

On Oct 10, 2016 1:38 PM, "Tapani Pälli"  wrote:

>
>
> On 10/10/2016 02:27 PM, Marek Olšák wrote:
>
>> On Mon, Oct 10, 2016 at 1:25 PM, Tapani Pälli 
>> wrote:
>>
>>>
>>>
>>> On 10/10/2016 01:38 PM, Marek Olšák wrote:
>>>

>>>> On Mon, Oct 10, 2016 at 12:33 PM, Marek Olšák wrote:
>>>>
>>>>>
>>>>> On Mon, Oct 10, 2016 at 7:58 AM, Tapani Pälli wrote:
>>>>>
>>>>>>
>>>>>> On 10/08/2016 06:58 PM, Jason Ekstrand wrote:
>>>>>>
>>>>>>>
>>>>>>> FYI, we use ralloc for a lot more than just the glsl compiler so the
>>>>>>> first few changes make me a bit nervous.  There was someone working on
>>>>>>> making our driver more undefined-memory-friendly but I don't know what
>>>>>>> happened to those patches.
>>>>>>>
>>>>>>
>>>>>> There's bunch of patches like that in this series:
>>>>>> https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html
>>>>>>
>>>>>> it looks like it just never landed as would have required more testing
>>>>>> on misc drivers?
>>>>>>
>>>>>
>>>>> We can land at least some of the patches from that series. We still
>>>>> have to replace all non-GLSL uses of DECLARE_RALLOC.. with
>>>>> DECLARE_RZALLOC.
>>>>>
>>>>
>>>> BTW, people can still give Rbs on all patches except 5. This rzalloc
>>>> thing isn't an issue and can be dealt with in a separate series (it
>>>> can be done after this series lands).

>>>
>>>
>>> I agree these issues do not block review of the series. We just need to
>>> make
>>> sure it is absolutely safe before landing.
>>>
>>> As concrete example I got following segfault when I applied this series
>>> which is directly related to rzalloc issues. This was with
>>> 'shader_freeze'
>>> program, description in bug #94477 has link and build instructions for
>>> this
>>> if you want to try. When I applied JP's patches 4,5,6 (nir, i965_vec4,
>>> i965_fs changes) this segfault disappears.
>>>
>>
>> I meant that this series is safe to land without patch 5. Did you test
>> it without patch 5?
>>
>>
> Ah sorry I managed to miss that. Now I did test and when reverting patch 5
> this test passes fine. Makes sense to do patch 5 as a separate step when
> JP's changes land.
>
> // Tapani
>


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-10 Thread Tapani Pälli



On 10/10/2016 02:27 PM, Marek Olšák wrote:

> On Mon, Oct 10, 2016 at 1:25 PM, Tapani Pälli wrote:
>>
>> On 10/10/2016 01:38 PM, Marek Olšák wrote:
>>>
>>> On Mon, Oct 10, 2016 at 12:33 PM, Marek Olšák wrote:
>>>>
>>>> On Mon, Oct 10, 2016 at 7:58 AM, Tapani Pälli wrote:
>>>>>
>>>>> On 10/08/2016 06:58 PM, Jason Ekstrand wrote:
>>>>>>
>>>>>> FYI, we use ralloc for a lot more than just the glsl compiler so the
>>>>>> first few changes make me a bit nervous.  There was someone working on
>>>>>> making our driver more undefined-memory-friendly but I don't know
>>>>>> what happened to those patches.
>>>>>
>>>>> There's bunch of patches like that in this series:
>>>>> https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html
>>>>>
>>>>> it looks like it just never landed as would have required more testing
>>>>> on misc drivers?
>>>>
>>>> We can land at least some of the patches from that series. We still
>>>> have to replace all non-GLSL uses of DECLARE_RALLOC.. with
>>>> DECLARE_RZALLOC.
>>>
>>> BTW, people can still give Rbs on all patches except 5. This rzalloc
>>> thing isn't an issue and can be dealt with in a separate series (it
>>> can be done after this series lands).
>>
>> I agree these issues do not block review of the series. We just need to make
>> sure it is absolutely safe before landing.
>>
>> As concrete example I got following segfault when I applied this series
>> which is directly related to rzalloc issues. This was with 'shader_freeze'
>> program, description in bug #94477 has link and build instructions for this
>> if you want to try. When I applied JP's patches 4,5,6 (nir, i965_vec4,
>> i965_fs changes) this segfault disappears.
>
> I meant that this series is safe to land without patch 5. Did you test
> it without patch 5?



Ah, sorry, I managed to miss that. I have now tested it, and with patch
5 reverted this test passes fine. It makes sense to do patch 5 as a
separate step once JP's changes land.


// Tapani


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-10 Thread Marek Olšák
On Mon, Oct 10, 2016 at 1:25 PM, Tapani Pälli  wrote:
>
>
> On 10/10/2016 01:38 PM, Marek Olšák wrote:
>>
>> On Mon, Oct 10, 2016 at 12:33 PM, Marek Olšák  wrote:
>>>
>>> On Mon, Oct 10, 2016 at 7:58 AM, Tapani Pälli 
>>> wrote:



>>>> On 10/08/2016 06:58 PM, Jason Ekstrand wrote:
>>>>>
>>>>>
>>>>> FYI, we use ralloc for a lot more than just the glsl compiler so the
>>>>> first few changes make me a bit nervous.  There was someone working on
>>>>> making our driver more undefined-memory-friendly but I don't know
>>>>> what
>>>>> happened to those patches.
>>>>
>>>>
>>>>
>>>> There's bunch of patches like that in this series:
>>>> https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html
>>>>
>>>> it looks like it just never landed as would have required more testing
>>>> on
>>>> misc drivers?
>>>
>>>
>>> We can land at least some of the patches from that series. We still
>>> have to replace all non-GLSL uses of DECLARE_RALLOC.. with
>>> DECLARE_RZALLOC.
>>
>>
>> BTW, people can still give Rbs on all patches except 5. This rzalloc
>> thing isn't an issue and can be dealt with in a separate series (it
>> can be done after this series lands).
>
>
> I agree these issues do not block review of the series. We just need to make
> sure it is absolutely safe before landing.
>
> As concrete example I got following segfault when I applied this series
> which is directly related to rzalloc issues. This was with 'shader_freeze'
> program, description in bug #94477 has link and build instructions for this
> if you want to try. When I applied JP's patches 4,5,6 (nir, i965_vec4,
> i965_fs changes) this segfault disappears.

I meant that this series is safe to land without patch 5. Did you test
it without patch 5?

Marek


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-10 Thread Tapani Pälli



On 10/10/2016 01:38 PM, Marek Olšák wrote:

> On Mon, Oct 10, 2016 at 12:33 PM, Marek Olšák wrote:
>>
>> On Mon, Oct 10, 2016 at 7:58 AM, Tapani Pälli wrote:
>>>
>>> On 10/08/2016 06:58 PM, Jason Ekstrand wrote:
>>>>
>>>> FYI, we use ralloc for a lot more than just the glsl compiler so the
>>>> first few changes make me a bit nervous.  There was someone working on
>>>> making our driver more undefined-memory-friendly but I don't know what
>>>> happened to those patches.
>>>
>>> There's bunch of patches like that in this series:
>>> https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html
>>>
>>> it looks like it just never landed as would have required more testing on
>>> misc drivers?
>>
>> We can land at least some of the patches from that series. We still
>> have to replace all non-GLSL uses of DECLARE_RALLOC.. with
>> DECLARE_RZALLOC.
>
> BTW, people can still give Rbs on all patches except 5. This rzalloc
> thing isn't an issue and can be dealt with in a separate series (it
> can be done after this series lands).


I agree these issues do not block review of the series. We just need to 
make sure it is absolutely safe before landing.


As a concrete example, I got the following segfault when I applied this
series, which is directly related to the rzalloc issues. This was with the
'shader_freeze' program; the description in bug #94477 has a link and build
instructions for it if you want to try. When I applied JP's patches
4, 5 and 6 (nir, i965_vec4, i965_fs changes), the segfault disappeared.


Please JP (CC) rebase and resend those patches!

--- 8< ---

#0  0x767366f5 in raise () from /lib64/libc.so.6
#1  0x767382fa in abort () from /lib64/libc.so.6
#2  0x76777670 in __libc_message () from /lib64/libc.so.6
#3  0x76783849 in realloc () from /lib64/libc.so.6
#4  0x70c83741 in resize (ptr=0x71208490 ir_assignment+16>, size=) at ralloc.c:161
#5  0x70c839e3 in reralloc_size (ctx=ctx@entry=0x11d8298, ptr=, size=) at ralloc.c:192
#6  0x70c83a72 in reralloc_array_size (ctx=ctx@entry=0x11d8298, ptr=, size=size@entry=4, count=) at ralloc.c:219
#7  0x70ccdcf9 in init_liveness_block (state=0x7fffafa0, block=0x11d8298) at nir/nir_liveness.c:76
#8  nir_live_ssa_defs_impl (impl=impl@entry=0x11d81a8) at nir/nir_liveness.c:180
#9  0x70d02e18 in nir_metadata_require (impl=impl@entry=0x11d81a8, required=required@entry=(nir_metadata_dominance | nir_metadata_live_ssa_defs)) at nir/nir_metadata.c:43
#10 0x70ccc191 in nir_convert_from_ssa_impl (impl=0x11d81a8, phi_webs_only=phi_webs_only@entry=true) at nir/nir_from_ssa.c:778
#11 0x70ccc610 in nir_convert_from_ssa (shader=shader@entry=0x11d7f08, phi_webs_only=phi_webs_only@entry=true) at nir/nir_from_ssa.c:810
#12 0x70e35d8d in brw_postprocess_nir (nir=nir@entry=0x11d7f08, devinfo=0x6566ec, is_scalar=is_scalar@entry=true) at brw_nir.c:538




// Tapani


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-10 Thread Marek Olšák
On Mon, Oct 10, 2016 at 12:33 PM, Marek Olšák  wrote:
> On Mon, Oct 10, 2016 at 7:58 AM, Tapani Pälli  wrote:
>>
>>
>> On 10/08/2016 06:58 PM, Jason Ekstrand wrote:
>>>
>>> FYI, we use ralloc for a lot more than just the glsl compiler so the
>>> first few changes make me a bit nervous.  There was someone working on
>>> making our driver more undefined-memory-friendly but I don't know what
>>> happened to those patches.
>>
>>
>> There's bunch of patches like that in this series:
>> https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html
>>
>> it looks like it just never landed as would have required more testing on
>> misc drivers?
>
> We can land at least some of the patches from that series. We still
> have to replace all non-GLSL uses of DECLARE_RALLOC.. with
> DECLARE_RZALLOC.

BTW, people can still give Rbs on all patches except 5. This rzalloc
thing isn't an issue and can be dealt with in a separate series (it
can be done after this series lands).

Marek


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-10 Thread Marek Olšák
On Mon, Oct 10, 2016 at 7:58 AM, Tapani Pälli  wrote:
>
>
> On 10/08/2016 06:58 PM, Jason Ekstrand wrote:
>>
>> FYI, we use ralloc for a lot more than just the glsl compiler so the
>> first few changes make me a bit nervous.  There was someone working on
>> making our driver more undefined-memory-friendly but I don't know what
>> happened to those patches.
>
>
> There's bunch of patches like that in this series:
> https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html
>
> it looks like it just never landed as would have required more testing on
> misc drivers?

We can land at least some of the patches from that series. We still
have to replace all non-GLSL uses of DECLARE_RALLOC.. with
DECLARE_RZALLOC.

Marek


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-09 Thread Tapani Pälli



On 10/08/2016 06:58 PM, Jason Ekstrand wrote:

FYI, we use ralloc for a lot more than just the glsl compiler so the
first few changes make me a bit nervous.  There was someone working on
making our driver more undefined-memory-friendly but I don't know what
happened to those patches.


There's a bunch of patches like that in this series:
https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html

It looks like it just never landed, as it would have required more testing
on misc drivers?




On Oct 8, 2016 3:58 AM, "Marek Olšák" wrote:

Hi,

This patch series reduces the number of malloc calls in the GLSL
compiler by 63%. That leads to better compile times and less heap
thrashing.

It's done by switching memory allocations in the GLSL compiler to my
new linear allocator that allocates out of a fixed-sized buffer with
a monotonically increasing offset. If more buffers are needed, it
chains them.

The new allocator is used in all places where short-lived allocations
are used with a high number of malloc calls. The series also contains
other improvements not related to the new allocator that also improve
compile times. The results are below.

I tested my shader-db with shaders only being compiled to TGSI.
(noop gallium driver)


master + libc's malloc:

 real   0m54.182s
 user   3m33.640s
 sys    0m0.620s
 maxmem 275 MB


master + jemalloc preloaded:

 real   0m45.044s
 user   2m56.356s
 sys    0m1.652s
 maxmem 284 MB


the series + libc's malloc:

 real   0m46.221s
 user   3m2.080s
 sys    0m0.544s
 maxmem 270 MB


the series + jemalloc preloaded:

 real   0m40.729s
 user   2m39.564s
 sys    0m1.232s
 maxmem 284 MB


The series without jemalloc almost caught up with jemalloc + master.
However, jemalloc also benefits.

Current Mesa needs 54.182s and it drops to 40.729s with my series and
jemalloc. The total change in compile time is -25% if we incorporate
both. Without jemalloc, the difference is only -14.7%.

With radeonsi, the improvement is slightly more than 1/2 of that
(if you add the LLVM time). However, radeonsi also has asynchronous
shader compilation hiding LLVM overhead in some cases, so it depends.

Drivers with faster compiler backends will benefit more than radeonsi,
but will probably not reach -25% or -14.7% (except softpipe, which uses
TGSI as-is).

The memory usage looks reasonable in all tested cases.

Note: One of the first patches moves memset from ralloc to rzalloc.
I tested and fixed the GLSL source -> TGSI path, but other codepaths
may break, and you need to use valgrind to find all uninitialized
variables that relied on ralloc doing memset (if there are any).

You can also find it here:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework


Please review.

 src/compiler/glsl/ast.h                             |   4 +-
 src/compiler/glsl/ast_to_hir.cpp                    |   4 +-
 src/compiler/glsl/ast_type.cpp                      |  13 ++-
 src/compiler/glsl/glcpp/glcpp-lex.l                 |   2 +-
 src/compiler/glsl/glcpp/glcpp-parse.y               | 203 +-
 src/compiler/glsl/glcpp/glcpp.h                     |   1 +
 src/compiler/glsl/glsl_lexer.ll                     |  16 +--
 src/compiler/glsl/glsl_parser.yy                    | 202 +++---
 src/compiler/glsl/glsl_parser_extras.cpp            |   6 +-
 src/compiler/glsl/glsl_parser_extras.h              |   4 +-
 src/compiler/glsl/glsl_symbol_table.cpp             |  19 ++--
 src/compiler/glsl/glsl_symbol_table.h               |   1 +
 src/compiler/glsl/ir.cpp                            |   4 +
 src/compiler/glsl/ir.h                              |  13 ++-
 src/compiler/glsl/link_uniform_blocks.cpp           |   2 +-
 src/compiler/glsl/list.h                            |   2 +-
 src/compiler/glsl/lower_packed_varyings.cpp         |   8 +-
 src/compiler/glsl/opt_constant_propagation.cpp      |  14 ++-
 src/compiler/glsl/opt_copy_propagation.cpp          |   7 +-
 src/compiler/glsl/opt_copy_propagation_elements.cpp |  19 ++--
 src/compiler/glsl/opt_dead_code_local.cpp           |  12 ++-
 src/compiler/glsl_types.cpp                         |  38 +--
 src/compiler/glsl_types.h                           |   6 +-
 src/compiler/nir/nir.c                              |   8 +-
 src/compiler/spirv/vtn_variables.c                  |   3 +-
 src/gallium/drivers/freedreno/ir3/ir3.c             |   2 +-
 src/gallium/drivers/vc4/vc4_cl.c                    |   2 +-
 

Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-08 Thread Edmondo Tommasina
Hi Marek

Series is
Tested-by: Edmondo Tommasina 

I've merged your series of patches on top of mesa git master and
tested on a Radeon RX 470. No regressions found.

OpenGL renderer string: Gallium 0.4 on AMD POLARIS10 (DRM 3.3.0 /
4.8.0-rc6, LLVM 3.9.0)
OpenGL core profile version string: 4.3 (Core Profile) Mesa
12.1.0-devel (git-e076df4)

Games tested:
* OpenMW
* The Witcher 2
* The Talos Principle
* Wasteland 2

Thanks and regards
edmondo


On Sat, Oct 8, 2016 at 12:58 PM, Marek Olšák  wrote:
> Hi,
>
> This patch series reduces the number of malloc calls in the GLSL
> compiler by 63%. That leads to better compile times and less heap
> thrashing.
>
> It's done by switching memory allocations in the GLSL compiler to my
> new linear allocator that allocates out of a fixed-sized buffer with
> a monotonically increasing offset. If more buffers are needed, it
> chains them.
>
> The new allocator is used in all places where short-lived allocations
> are used with a high number of malloc calls. The series also contains
> other improvements not related to the new allocator that also improve
> compile times. The results are below.
>
> I tested my shader-db with shaders only being compiled to TGSI.
> (noop gallium driver)
>
>
> master + libc's malloc:
>
>  real   0m54.182s
>  user   3m33.640s
>  sys    0m0.620s
>  maxmem 275 MB
>
>
> master + jemalloc preloaded:
>
>  real   0m45.044s
>  user   2m56.356s
>  sys    0m1.652s
>  maxmem 284 MB
>
>
> the series + libc's malloc:
>
>  real   0m46.221s
>  user   3m2.080s
>  sys    0m0.544s
>  maxmem 270 MB
>
>
> the series + jemalloc preloaded:
>
>  real   0m40.729s
>  user   2m39.564s
>  sys    0m1.232s
>  maxmem 284 MB
>
>
> The series without jemalloc almost caught up with jemalloc + master.
> However, jemalloc also benefits.
>
> Current Mesa needs 54.182s and it drops to 40.729s with my series and
> jemalloc. The total change in compile time is -25% if we incorporate
> both. Without jemalloc, the difference is only -14.7%.
>
> With radeonsi, the improvement is slightly more than 1/2 of that
> (if you add the LLVM time). However, radeonsi also has asynchronous
> shader compilation hiding LLVM overhead in some cases, so it depends.
>
> Drivers with faster compiler backends will benefit more than radeonsi,
> but will probably not reach -25% or -14.7% (except softpipe, which uses
> TGSI as-is).
>
> The memory usage looks reasonable in all tested cases.
>
> Note: One of the first patches moves memset from ralloc to rzalloc.
> I tested and fixed the GLSL source -> TGSI path, but other codepaths
> may break, and you need to use valgrind to find all uninitialized
> variables that relied on ralloc doing memset (if there are any).
>
> You can also find it here:
> https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework
>
> Please review.
>
>  src/compiler/glsl/ast.h                             |   4 +-
>  src/compiler/glsl/ast_to_hir.cpp                    |   4 +-
>  src/compiler/glsl/ast_type.cpp                      |  13 ++-
>  src/compiler/glsl/glcpp/glcpp-lex.l                 |   2 +-
>  src/compiler/glsl/glcpp/glcpp-parse.y               | 203 +-
>  src/compiler/glsl/glcpp/glcpp.h                     |   1 +
>  src/compiler/glsl/glsl_lexer.ll                     |  16 +--
>  src/compiler/glsl/glsl_parser.yy                    | 202 +++---
>  src/compiler/glsl/glsl_parser_extras.cpp            |   6 +-
>  src/compiler/glsl/glsl_parser_extras.h              |   4 +-
>  src/compiler/glsl/glsl_symbol_table.cpp             |  19 ++--
>  src/compiler/glsl/glsl_symbol_table.h               |   1 +
>  src/compiler/glsl/ir.cpp                            |   4 +
>  src/compiler/glsl/ir.h                              |  13 ++-
>  src/compiler/glsl/link_uniform_blocks.cpp           |   2 +-
>  src/compiler/glsl/list.h                            |   2 +-
>  src/compiler/glsl/lower_packed_varyings.cpp         |   8 +-
>  src/compiler/glsl/opt_constant_propagation.cpp      |  14 ++-
>  src/compiler/glsl/opt_copy_propagation.cpp          |   7 +-
>  src/compiler/glsl/opt_copy_propagation_elements.cpp |  19 ++--
>  src/compiler/glsl/opt_dead_code_local.cpp           |  12 ++-
>  src/compiler/glsl_types.cpp                         |  38 +--
>  src/compiler/glsl_types.h                           |   6 +-
>  src/compiler/nir/nir.c                              |   8 +-
>  src/compiler/spirv/vtn_variables.c                  |   3 +-
>  src/gallium/drivers/freedreno/ir3/ir3.c             |   2 +-
>  src/gallium/drivers/vc4/vc4_cl.c                    |   2 +-
>  src/gallium/drivers/vc4/vc4_program.c               |   2 +-
>  src/gallium/drivers/vc4/vc4_simulator.c             |   5 +-
>  src/mesa/drivers/dri/i965/brw_state_batch.c         |   5 +-
>  src/util/ralloc.c                                   | 392
> 

Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-08 Thread Jason Ekstrand
On Sat, Oct 8, 2016 at 9:30 AM, Marek Olšák  wrote:

> On Sat, Oct 8, 2016 at 5:58 PM, Jason Ekstrand 
> wrote:
> > FYI, we use ralloc for a lot more than just the glsl compiler so the
> > first few changes make me a bit nervous.  There was someone working on
> > making our driver more undefined-memory-friendly but I don't know what
> > happened to those patches.
>
> shader-db + valgrind can be used to track down uninitialized variables.
>
> After that's done, test suites (piglit etc.) can be used to fix any
> remaining uninitialized variables (run test suites first and then run
> regressed tests with valgrind).
>
> That's how I did it for GLSL.
>

Yeah, that's more-or-less how I would do it.  I just didn't want you
pushing anything without us fixing things. :)


> Alternatively, the following can be done: All non-GLSL code can be
> switched from ralloc to rzalloc. All non-GLSL C++ classes can be
> switched from DECLARE_RALLOC_CXX_OPERATORS to
> DECLARE_RZALLOC_CXX_OPERATORS.
>
> Marek
>


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-08 Thread Marek Olšák
On Sat, Oct 8, 2016 at 5:58 PM, Jason Ekstrand  wrote:
> FYI, we use ralloc for a lot more than just the glsl compiler so the first
> few changes make me a bit nervous.  There was someone working on making our
> driver more undefined-memory-friendly but I don't know what happened to
> those patches.

shader-db + valgrind can be used to track down uninitialized variables.

After that's done, test suites (piglit etc.) can be used to fix any
remaining uninitialized variables (run test suites first and then run
regressed tests with valgrind).

That's how I did it for GLSL.

Alternatively, the following can be done: All non-GLSL code can be
switched from ralloc to rzalloc. All non-GLSL C++ classes can be
switched from DECLARE_RALLOC_CXX_OPERATORS to
DECLARE_RZALLOC_CXX_OPERATORS.

Marek


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-08 Thread Jason Ekstrand
FYI, we use ralloc for a lot more than just the glsl compiler so the first
few changes make me a bit nervous.  There was someone working on making our
driver more undefined-memory-friendly but I don't know what happened to
those patches.

On Oct 8, 2016 3:58 AM, "Marek Olšák"  wrote:

Hi,

This patch series reduces the number of malloc calls in the GLSL
compiler by 63%. That leads to better compile times and less heap
thrashing.

It's done by switching memory allocations in the GLSL compiler to my
new linear allocator that allocates out of a fixed-sized buffer with
a monotonically increasing offset. If more buffers are needed, it
chains them.

The new allocator is used in all places where short-lived allocations
are used with a high number of malloc calls. The series also contains
other improvements not related to the new allocator that also improve
compile times. The results are below.

I tested my shader-db with shaders only being compiled to TGSI.
(noop gallium driver)


master + libc's malloc:

 real   0m54.182s
 user   3m33.640s
 sys    0m0.620s
 maxmem 275 MB


master + jemalloc preloaded:

 real   0m45.044s
 user   2m56.356s
 sys    0m1.652s
 maxmem 284 MB


the series + libc's malloc:

 real   0m46.221s
 user   3m2.080s
 sys    0m0.544s
 maxmem 270 MB


the series + jemalloc preloaded:

 real   0m40.729s
 user   2m39.564s
 sys    0m1.232s
 maxmem 284 MB


The series without jemalloc almost caught up with jemalloc + master.
However, jemalloc also benefits.

Current Mesa needs 54.182s and it drops to 40.729s with my series and
jemalloc. The total change in compile time is -25% if we incorporate
both. Without jemalloc, the difference is only -14.7%.

With radeonsi, the improvement is slightly more than 1/2 of that
(if you add the LLVM time). However, radeonsi also has asynchronous
shader compilation hiding LLVM overhead in some cases, so it depends.

Drivers with faster compiler backends will benefit more than radeonsi,
but will probably not reach -25% or -14.7% (except softpipe, which uses
TGSI as-is).

The memory usage looks reasonable in all tested cases.

Note: One of the first patches moves memset from ralloc to rzalloc.
I tested and fixed the GLSL source -> TGSI path, but other codepaths
may break, and you need to use valgrind to find all uninitialized
variables that relied on ralloc doing memset (if there are any).

You can also find it here:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework

Please review.

 src/compiler/glsl/ast.h                             |   4 +-
 src/compiler/glsl/ast_to_hir.cpp                    |   4 +-
 src/compiler/glsl/ast_type.cpp                      |  13 ++-
 src/compiler/glsl/glcpp/glcpp-lex.l                 |   2 +-
 src/compiler/glsl/glcpp/glcpp-parse.y               | 203 +-
 src/compiler/glsl/glcpp/glcpp.h                     |   1 +
 src/compiler/glsl/glsl_lexer.ll                     |  16 +--
 src/compiler/glsl/glsl_parser.yy                    | 202 +++---
 src/compiler/glsl/glsl_parser_extras.cpp            |   6 +-
 src/compiler/glsl/glsl_parser_extras.h              |   4 +-
 src/compiler/glsl/glsl_symbol_table.cpp             |  19 ++--
 src/compiler/glsl/glsl_symbol_table.h               |   1 +
 src/compiler/glsl/ir.cpp                            |   4 +
 src/compiler/glsl/ir.h                              |  13 ++-
 src/compiler/glsl/link_uniform_blocks.cpp           |   2 +-
 src/compiler/glsl/list.h                            |   2 +-
 src/compiler/glsl/lower_packed_varyings.cpp         |   8 +-
 src/compiler/glsl/opt_constant_propagation.cpp      |  14 ++-
 src/compiler/glsl/opt_copy_propagation.cpp          |   7 +-
 src/compiler/glsl/opt_copy_propagation_elements.cpp |  19 ++--
 src/compiler/glsl/opt_dead_code_local.cpp           |  12 ++-
 src/compiler/glsl_types.cpp                         |  38 +--
 src/compiler/glsl_types.h                           |   6 +-
 src/compiler/nir/nir.c                              |   8 +-
 src/compiler/spirv/vtn_variables.c                  |   3 +-
 src/gallium/drivers/freedreno/ir3/ir3.c             |   2 +-
 src/gallium/drivers/vc4/vc4_cl.c                    |   2 +-
 src/gallium/drivers/vc4/vc4_program.c               |   2 +-
 src/gallium/drivers/vc4/vc4_simulator.c             |   5 +-
 src/mesa/drivers/dri/i965/brw_state_batch.c         |   5 +-
 src/util/ralloc.c                                   | 392 ++---
 src/util/ralloc.h                                   |  93 --
 32 files changed, 782 insertions(+), 330 deletions(-)

Marek
Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-08 Thread Edward O'Callaghan


On 10/09/2016 12:14 AM, Marek Olšák wrote:
> On Sat, Oct 8, 2016 at 2:48 PM, Edward O'Callaghan
>  wrote:
>> Patches 1-5 are,
>> Reviewed-by: Edward O'Callaghan 
>>
>> I think it would be reassuring if you could run a before/after complete
>> piglit run also though if you have not already?
> 
> The series was tested with piglit and GL CTS. The testing only covered
Good to hear, yes. I just wanted to check.

> GLSL source -> TGSI.
Sure. It's just that the stress of the whole run is also reassuring
with these kinds of changes, if you know what I mean.

I'll look in the morning at the other half but 1-5 LGTM.

Kind Regards,
Edward.

> 
> Marek
> 





Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-08 Thread Marek Olšák
On Sat, Oct 8, 2016 at 2:48 PM, Edward O'Callaghan
 wrote:
> Patches 1-5 are,
> Reviewed-by: Edward O'Callaghan 
>
> I think it would be reassuring if you could run a before/after complete
> piglit run also though if you have not already?

The series was tested with piglit and GL CTS. The testing only covered
GLSL source -> TGSI.

Marek


Re: [Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-08 Thread Edward O'Callaghan
Patches 1-5 are,
Reviewed-by: Edward O'Callaghan 

I think it would be reassuring if you could also do a complete
before/after piglit run, if you have not already.

On 10/08/2016 09:58 PM, Marek Olšák wrote:
> Hi,
> 
> This patch series reduces the number of malloc calls in the GLSL
> compiler by 63%. That leads to better compile times and less heap
> thrashing.
> 
> It's done by switching memory allocations in the GLSL compiler to my
> new linear allocator, which allocates out of a fixed-size buffer with
> a monotonically increasing offset. If more buffers are needed, it
> chains them.
> 
> The new allocator is used in all places where short-lived allocations
> are used with a high number of malloc calls. The series also contains
> other improvements not related to the new allocator that also improve
> compile times. The results are below.
> 
> I tested my shader-db with shaders only being compiled to TGSI.
> (noop gallium driver)
> 
> 
> master + libc's malloc:
> 
>  real 0m54.182s
>  user 3m33.640s
>  sys  0m0.620s
>  maxmem 275 MB
> 
> 
> master + jemalloc preloaded:
> 
>  real 0m45.044s
>  user 2m56.356s
>  sys  0m1.652s
>  maxmem 284 MB
> 
> 
> the series + libc's malloc:
> 
>  real 0m46.221s
>  user 3m2.080s
>  sys  0m0.544s
>  maxmem 270 MB
> 
> 
> the series + jemalloc preloaded:
> 
>  real 0m40.729s
>  user 2m39.564s
>  sys  0m1.232s
>  maxmem 284 MB
> 
> 
> The series without jemalloc almost caught up with jemalloc + master.
> However, jemalloc also benefits.
> 
> Current Mesa needs 54.182s and it drops to 40.729s with my series and
> jemalloc. The total change in compile time is -25% if we incorporate
> both. Without jemalloc, the difference is only -14.7%.
> 
> With radeonsi, the improvement is slightly more than 1/2 of that
> (if you add the LLVM time). However, radeonsi also has asynchronous
> shader compilation hiding LLVM overhead in some cases, so it depends.
> 
> Drivers with faster compiler backends will benefit more than radeonsi,
> but will probably not reach -25% or -14.7% (except softpipe, which uses
> TGSI as-is).
> 
> The memory usage looks reasonable in all tested cases.
> 
> Note: One of the first patches moves memset from ralloc to rzalloc.
> I tested and fixed the GLSL source -> TGSI path, but other codepaths
> may break, and you need to use valgrind to find all uninitialized
> variables that relied on ralloc doing memset (if there are any).
> 
> You can also find it here:
> https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework
> 
> Please review.
> 
>  src/compiler/glsl/ast.h |   4 +-
>  src/compiler/glsl/ast_to_hir.cpp|   4 +-
>  src/compiler/glsl/ast_type.cpp  |  13 ++-
>  src/compiler/glsl/glcpp/glcpp-lex.l |   2 +-
>  src/compiler/glsl/glcpp/glcpp-parse.y   | 203 +-
>  src/compiler/glsl/glcpp/glcpp.h |   1 +
>  src/compiler/glsl/glsl_lexer.ll |  16 +--
>  src/compiler/glsl/glsl_parser.yy| 202 +++---
>  src/compiler/glsl/glsl_parser_extras.cpp|   6 +-
>  src/compiler/glsl/glsl_parser_extras.h  |   4 +-
>  src/compiler/glsl/glsl_symbol_table.cpp |  19 ++--
>  src/compiler/glsl/glsl_symbol_table.h   |   1 +
>  src/compiler/glsl/ir.cpp|   4 +
>  src/compiler/glsl/ir.h  |  13 ++-
>  src/compiler/glsl/link_uniform_blocks.cpp   |   2 +-
>  src/compiler/glsl/list.h|   2 +-
>  src/compiler/glsl/lower_packed_varyings.cpp |   8 +-
>  src/compiler/glsl/opt_constant_propagation.cpp  |  14 ++-
>  src/compiler/glsl/opt_copy_propagation.cpp  |   7 +-
>  src/compiler/glsl/opt_copy_propagation_elements.cpp |  19 ++--
>  src/compiler/glsl/opt_dead_code_local.cpp   |  12 ++-
>  src/compiler/glsl_types.cpp |  38 +--
>  src/compiler/glsl_types.h   |   6 +-
>  src/compiler/nir/nir.c  |   8 +-
>  src/compiler/spirv/vtn_variables.c  |   3 +-
>  src/gallium/drivers/freedreno/ir3/ir3.c |   2 +-
>  src/gallium/drivers/vc4/vc4_cl.c|   2 +-
>  src/gallium/drivers/vc4/vc4_program.c   |   2 +-
>  src/gallium/drivers/vc4/vc4_simulator.c |   5 +-
>  src/mesa/drivers/dri/i965/brw_state_batch.c |   5 +-
>  src/util/ralloc.c   | 392 ++---
>  src/util/ralloc.h   |  93 --
>  32 files changed, 782 insertions(+), 330 deletions(-)
> 
> Marek
> 

[Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation

2016-10-08 Thread Marek Olšák
Hi,

This patch series reduces the number of malloc calls in the GLSL
compiler by 63%. That leads to better compile times and less heap
thrashing.

It's done by switching memory allocations in the GLSL compiler to my
new linear allocator, which allocates out of a fixed-size buffer with
a monotonically increasing offset. If more buffers are needed, it
chains them.
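
To make the idea concrete, here is a minimal sketch of a linear (bump)
allocator with chained buffers. It is not the code from the series: the
names lin_ctx, lin_buf, lin_alloc and lin_free_all, as well as the 2048-byte
buffer size, are assumptions made up for illustration only.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer size, chosen only for this sketch. */
#define LIN_BUF_SIZE 2048

/* One fixed-size buffer; filled buffers are chained through `next`. */
struct lin_buf {
   struct lin_buf *next;   /* previously filled buffer, or NULL */
   size_t offset;          /* monotonically increasing bump offset */
   size_t size;            /* payload capacity in bytes */
   alignas(max_align_t) unsigned char data[];
};

/* Allocation context; zero-initialize it before first use. */
struct lin_ctx {
   struct lin_buf *latest; /* buffer currently being filled */
   unsigned num_mallocs;   /* number of real malloc calls, for illustration */
};

/* Round n up to the strictest fundamental alignment. */
static size_t align_up(size_t n)
{
   size_t a = alignof(max_align_t);
   return (n + a - 1) & ~(a - 1);
}

/* Hand out `size` bytes by bumping the offset of the current buffer.
 * A new buffer is malloc'ed and chained only when the current one is full.
 */
void *lin_alloc(struct lin_ctx *ctx, size_t size)
{
   assert(ctx != NULL);
   size = align_up(size);

   struct lin_buf *buf = ctx->latest;
   if (!buf || buf->offset + size > buf->size) {
      /* Oversized requests get a dedicated buffer of their own size. */
      size_t cap = size > LIN_BUF_SIZE ? size : LIN_BUF_SIZE;

      buf = malloc(sizeof(*buf) + cap);
      if (!buf)
         return NULL;
      buf->next = ctx->latest;
      buf->offset = 0;
      buf->size = cap;
      ctx->latest = buf;
      ctx->num_mallocs++;
   }

   void *ptr = buf->data + buf->offset;
   buf->offset += size;
   return ptr;
}

/* Individual allocations cannot be freed; the whole chain goes at once. */
void lin_free_all(struct lin_ctx *ctx)
{
   struct lin_buf *buf = ctx->latest;
   while (buf) {
      struct lin_buf *next = buf->next;
      free(buf);
      buf = next;
   }
   ctx->latest = NULL;
}
```

In this sketch many small allocations share one malloc call, which is
where the reduction in malloc calls comes from. The cost is that a
single allocation cannot be released on its own, so the scheme only
suits short-lived data that is freed together.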

The new allocator is used wherever short-lived allocations account
for a high number of malloc calls. The series also contains other
improvements, unrelated to the new allocator, that also improve
compile times. The results are below.

I tested my shader-db with shaders being compiled only to TGSI
(noop gallium driver).


master + libc's malloc:

 real   0m54.182s
 user   3m33.640s
 sys    0m0.620s
 maxmem 275 MB


master + jemalloc preloaded:

 real   0m45.044s
 user   2m56.356s
 sys    0m1.652s
 maxmem 284 MB


the series + libc's malloc:

 real   0m46.221s
 user   3m2.080s
 sys    0m0.544s
 maxmem 270 MB


the series + jemalloc preloaded:

 real   0m40.729s
 user   2m39.564s
 sys    0m1.232s
 maxmem 284 MB


The series without jemalloc almost caught up with jemalloc + master.
However, jemalloc also benefits.

Current Mesa needs 54.182s and it drops to 40.729s with my series and
jemalloc. The total change in compile time is -25% if we incorporate
both. Without jemalloc, the difference is only -14.7%.
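
For reference, the percentages follow from the "real" times listed
above. A small sketch of the arithmetic; the helper names
percent_change and print_summary are made up for this example:

```c
#include <assert.h>
#include <stdio.h>

/* Relative change of `after` versus `before`, in percent.
 * Negative values mean that `after` is faster.
 */
double percent_change(double before, double after)
{
   assert(before > 0.0);
   return (after - before) / before * 100.0;
}

/* Wall-clock ("real") times in seconds, copied from the results above. */
void print_summary(void)
{
   const double master_libc     = 54.182;
   const double series_libc     = 46.221;
   const double series_jemalloc = 40.729;

   printf("series + jemalloc vs. master: %.1f%%\n",
          percent_change(master_libc, series_jemalloc));
   printf("series only vs. master:       %.1f%%\n",
          percent_change(master_libc, series_libc));
}
```

The first figure comes out at about -24.8%, which is rounded to -25%
in the text, and the second at about -14.7%.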

With radeonsi, the improvement is slightly more than 1/2 of that
(if you add the LLVM time). However, radeonsi also has asynchronous
shader compilation hiding LLVM overhead in some cases, so it depends.

Drivers with faster compiler backends will benefit more than radeonsi,
but will probably not reach -25% or -14.7% (except softpipe, which uses
TGSI as-is).

The memory usage looks reasonable in all tested cases.

Note: One of the first patches moves memset from ralloc to rzalloc.
I tested and fixed the GLSL source -> TGSI path, but other codepaths
may break, and you need to use valgrind to find all uninitialized
variables that relied on ralloc doing memset (if there are any).
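
The kind of breakage to look for can be sketched as follows. This does
not use the real ralloc API: sketch_alloc and sketch_zalloc are
hypothetical stand-ins for a non-clearing and a clearing allocator, and
the 0xAA fill exists only to make the stale field visible
deterministically in this example (a real non-clearing allocator leaves
the bytes indeterminate, which is why valgrind is needed).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
   int kind;
   struct node *next;
   unsigned flags;
};

/* Stand-in for an allocator that no longer clears memory.
 * The 0xAA fill is an assumption of this sketch, not Mesa behaviour.
 */
void *sketch_alloc(size_t size)
{
   void *p = malloc(size);
   if (p)
      memset(p, 0xAA, size);
   return p;
}

/* Stand-in for the zeroing variant. */
void *sketch_zalloc(size_t size)
{
   return calloc(1, size);
}

/* Fragile: silently relied on the allocator clearing `next` and `flags`. */
struct node *node_create_fragile(int kind)
{
   struct node *n = sketch_alloc(sizeof(*n));
   if (n)
      n->kind = kind;
   return n;
}

/* Fix A: switch the call site to the zeroing allocator. */
struct node *node_create_zeroed(int kind)
{
   struct node *n = sketch_zalloc(sizeof(*n));
   if (n)
      n->kind = kind;
   return n;
}

/* Fix B: keep the cheaper allocator and initialize every field. */
struct node *node_create_explicit(int kind)
{
   struct node *n = sketch_alloc(sizeof(*n));
   if (n) {
      n->kind = kind;
      n->next = NULL;
      n->flags = 0;
   }
   return n;
}

/* Small self-check showing the difference between the three variants. */
void node_demo(void)
{
   struct node *a = node_create_fragile(1);
   struct node *b = node_create_zeroed(1);
   struct node *c = node_create_explicit(1);

   assert(a && b && c);
   assert(a->flags != 0);                  /* stale fill pattern */
   assert(b->flags == 0 && b->next == NULL);
   assert(c->flags == 0 && c->next == NULL);

   free(a);
   free(b);
   free(c);
}
```

Fix A is the smaller change at each call site. Fix B avoids the memset
cost, which is the point of moving it out of the plain allocation path
in the first place.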

You can also find it here:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework

Please review.

 src/compiler/glsl/ast.h                             |   4 +-
 src/compiler/glsl/ast_to_hir.cpp                    |   4 +-
 src/compiler/glsl/ast_type.cpp                      |  13 ++-
 src/compiler/glsl/glcpp/glcpp-lex.l                 |   2 +-
 src/compiler/glsl/glcpp/glcpp-parse.y               | 203 +-
 src/compiler/glsl/glcpp/glcpp.h                     |   1 +
 src/compiler/glsl/glsl_lexer.ll                     |  16 +--
 src/compiler/glsl/glsl_parser.yy                    | 202 +++---
 src/compiler/glsl/glsl_parser_extras.cpp            |   6 +-
 src/compiler/glsl/glsl_parser_extras.h              |   4 +-
 src/compiler/glsl/glsl_symbol_table.cpp             |  19 ++--
 src/compiler/glsl/glsl_symbol_table.h               |   1 +
 src/compiler/glsl/ir.cpp                            |   4 +
 src/compiler/glsl/ir.h                              |  13 ++-
 src/compiler/glsl/link_uniform_blocks.cpp           |   2 +-
 src/compiler/glsl/list.h                            |   2 +-
 src/compiler/glsl/lower_packed_varyings.cpp         |   8 +-
 src/compiler/glsl/opt_constant_propagation.cpp      |  14 ++-
 src/compiler/glsl/opt_copy_propagation.cpp          |   7 +-
 src/compiler/glsl/opt_copy_propagation_elements.cpp |  19 ++--
 src/compiler/glsl/opt_dead_code_local.cpp           |  12 ++-
 src/compiler/glsl_types.cpp                         |  38 +--
 src/compiler/glsl_types.h                           |   6 +-
 src/compiler/nir/nir.c                              |   8 +-
 src/compiler/spirv/vtn_variables.c                  |   3 +-
 src/gallium/drivers/freedreno/ir3/ir3.c             |   2 +-
 src/gallium/drivers/vc4/vc4_cl.c                    |   2 +-
 src/gallium/drivers/vc4/vc4_program.c               |   2 +-
 src/gallium/drivers/vc4/vc4_simulator.c             |   5 +-
 src/mesa/drivers/dri/i965/brw_state_batch.c         |   5 +-
 src/util/ralloc.c                                   | 392 ++---
 src/util/ralloc.h                                   |  93 --
 32 files changed, 782 insertions(+), 330 deletions(-)

Marek