Patches 1-5 are,
Reviewed-by: Edward O'Callaghan <funfunc...@folklore1984.net>

I think it would be reassuring if you could also do a complete before/after
piglit run, if you have not already.

On 10/08/2016 09:58 PM, Marek Olšák wrote:
> Hi,
> 
> This patch series reduces the number of malloc calls in the GLSL
> compiler by 63%. That leads to better compile times and less heap
> thrashing.
> 
> It's done by switching memory allocations in the GLSL compiler to my
> new linear allocator that allocates out of a fixed-size buffer with
> a monotonically increasing offset. If more buffers are needed, it
> chains them.
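
For anyone following along, here is a minimal sketch of the idea described above: a bump allocator over fixed-size buffers that chains a new buffer when the current one is full. This is not the code from the series. All names (lin_ctx, lin_buf, lin_alloc, lin_free_all) are hypothetical, and the 2048-byte buffer size and 8-byte alignment are assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names and sizes; not taken from src/util/ralloc.c. */
#define LIN_BUF_SIZE 2048   /* assumed fixed buffer size in bytes */
#define LIN_ALIGN    8      /* assumed allocation alignment */

struct lin_buf {
   struct lin_buf *next;    /* previously filled buffer in the chain */
   size_t offset;           /* monotonically increasing offset */
   size_t size;             /* payload capacity of this buffer */
};

struct lin_ctx {
   struct lin_buf *head;    /* buffer currently being filled */
   unsigned num_mallocs;    /* real malloc calls, counted for illustration */
};

static size_t align_up(size_t n)
{
   return (n + LIN_ALIGN - 1) & ~(size_t)(LIN_ALIGN - 1);
}

static void *lin_alloc(struct lin_ctx *ctx, size_t size)
{
   struct lin_buf *buf = ctx->head;
   size = align_up(size);

   if (!buf || buf->offset + size > buf->size) {
      /* No buffer yet, or the current one is full: chain a new one.
       * Oversized requests get a buffer large enough to hold them. */
      size_t cap = size > LIN_BUF_SIZE ? size : LIN_BUF_SIZE;
      buf = malloc(align_up(sizeof(*buf)) + cap);
      if (!buf)
         return NULL;
      buf->next = ctx->head;
      buf->offset = 0;
      buf->size = cap;
      ctx->head = buf;
      ctx->num_mallocs++;
   }

   /* The payload starts right after the (aligned) header. */
   void *ptr = (unsigned char *)buf + align_up(sizeof(*buf)) + buf->offset;
   buf->offset += size;
   return ptr;
}

/* Individual allocations are never freed; the whole chain goes at once. */
static void lin_free_all(struct lin_ctx *ctx)
{
   struct lin_buf *buf = ctx->head;
   while (buf) {
      struct lin_buf *next = buf->next;
      free(buf);
      buf = next;
   }
   ctx->head = NULL;
}
```

With these assumed sizes, many small allocations share one malloc call, which is where the reduction in malloc traffic would come from. The trade-off is that nothing is returned to the heap until the whole context is freed, so it only suits short-lived allocations.
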
> 
> The new allocator is used wherever short-lived allocations cause a
> high number of malloc calls. The series also contains other
> improvements, unrelated to the new allocator, that improve compile
> times as well. The results are below.
> 
> I tested my shader-db with shaders only being compiled to TGSI.
> (noop gallium driver)
> 
> 
> master + libc's malloc:
> 
>  real 0m54.182s
>  user 3m33.640s
>  sys  0m0.620s
>  maxmem 275 MB
> 
> 
> master + jemalloc preloaded:
> 
>  real 0m45.044s
>  user 2m56.356s
>  sys  0m1.652s
>  maxmem 284 MB
> 
> 
> the series + libc's malloc:
> 
>  real 0m46.221s
>  user 3m2.080s
>  sys  0m0.544s
>  maxmem 270 MB
> 
> 
> the series + jemalloc preloaded:
> 
>  real 0m40.729s
>  user 2m39.564s
>  sys  0m1.232s
>  maxmem 284 MB
> 
> 
> The series without jemalloc almost caught up with jemalloc + master.
> However, jemalloc also benefits.
> 
> Current Mesa needs 54.182s and it drops to 40.729s with my series and
> jemalloc. The total change in compile time is -25% if we incorporate
> both. Without jemalloc, the difference is only -14.7%.
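
The two percentages can be rechecked from the "real" times in the tables above. A small sketch (percent_change is a hypothetical helper name, not something from the series):

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent.
 * Negative values mean the run got faster. */
static double percent_change(double before, double after)
{
   return (after - before) / before * 100.0;
}

/* master + libc malloc (54.182s) vs. series + jemalloc (40.729s):
 *   percent_change(54.182, 40.729) is about -24.8
 * master + libc malloc (54.182s) vs. series + libc malloc (46.221s):
 *   percent_change(54.182, 46.221) is about -14.7
 */
```
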
> 
> With radeonsi, the improvement is slightly more than half of that
> (if you add the LLVM time). However, radeonsi also has asynchronous
> shader compilation hiding LLVM overhead in some cases, so it depends.
> 
> Drivers with faster compiler backends will benefit more than radeonsi,
> but will probably not reach -25% or -14.7% (except softpipe, which uses
> TGSI as-is).
> 
> The memory usage looks reasonable in all tested cases.
> 
> Note: One of the first patches moves memset from ralloc to rzalloc.
> I tested and fixed the GLSL source -> TGSI path, but other codepaths
> may break, and you need to use valgrind to find all uninitialized
> variables that relied on ralloc doing memset (if there are any).
> 
> You can also find it here:
> https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework
> 
> Please review.
> 
>  src/compiler/glsl/ast.h                             |   4 +-
>  src/compiler/glsl/ast_to_hir.cpp                    |   4 +-
>  src/compiler/glsl/ast_type.cpp                      |  13 ++-
>  src/compiler/glsl/glcpp/glcpp-lex.l                 |   2 +-
>  src/compiler/glsl/glcpp/glcpp-parse.y               | 203 +++++++++++++++++---------------------
>  src/compiler/glsl/glcpp/glcpp.h                     |   1 +
>  src/compiler/glsl/glsl_lexer.ll                     |  16 +--
>  src/compiler/glsl/glsl_parser.yy                    | 202 +++++++++++++++++++-------------------
>  src/compiler/glsl/glsl_parser_extras.cpp            |   6 +-
>  src/compiler/glsl/glsl_parser_extras.h              |   4 +-
>  src/compiler/glsl/glsl_symbol_table.cpp             |  19 ++--
>  src/compiler/glsl/glsl_symbol_table.h               |   1 +
>  src/compiler/glsl/ir.cpp                            |   4 +
>  src/compiler/glsl/ir.h                              |  13 ++-
>  src/compiler/glsl/link_uniform_blocks.cpp           |   2 +-
>  src/compiler/glsl/list.h                            |   2 +-
>  src/compiler/glsl/lower_packed_varyings.cpp         |   8 +-
>  src/compiler/glsl/opt_constant_propagation.cpp      |  14 ++-
>  src/compiler/glsl/opt_copy_propagation.cpp          |   7 +-
>  src/compiler/glsl/opt_copy_propagation_elements.cpp |  19 ++--
>  src/compiler/glsl/opt_dead_code_local.cpp           |  12 ++-
>  src/compiler/glsl_types.cpp                         |  38 +------
>  src/compiler/glsl_types.h                           |   6 +-
>  src/compiler/nir/nir.c                              |   8 +-
>  src/compiler/spirv/vtn_variables.c                  |   3 +-
>  src/gallium/drivers/freedreno/ir3/ir3.c             |   2 +-
>  src/gallium/drivers/vc4/vc4_cl.c                    |   2 +-
>  src/gallium/drivers/vc4/vc4_program.c               |   2 +-
>  src/gallium/drivers/vc4/vc4_simulator.c             |   5 +-
>  src/mesa/drivers/dri/i965/brw_state_batch.c         |   5 +-
>  src/util/ralloc.c                                   | 392 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  src/util/ralloc.h                                   |  93 ++++++++++++++++--
>  32 files changed, 782 insertions(+), 330 deletions(-)
> 
> Marek
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
