Re: [Mesa-dev] [PATCH V2] util/disk_cache: compress individual cache entries
On 05/04/17 08:23, Brian Paul wrote:
> On 03/04/2017 07:12 AM, Emil Velikov wrote:
>> On 2 March 2017 at 21:52, Timothy Arceri wrote:
>>> On 03/03/17 01:49, Emil Velikov wrote:
>>>> Hi Tim,
>>>>
>>>> On 2 March 2017 at 01:36, Timothy Arceri wrote:
>>>>> This reduces the cache size for Deus Ex from ~160M to ~30M for
>>>>> radeonsi.
>>>>>
>>>>> I'm also seeing the following improvements in minimum fps in the
>>>>> Shadow of Mordor benchmark:
>>>>>
>>>>> no-cache:                   ~10fps
>>>>> with-cache-no-compression:  ~15fps
>>>>> with-cache-and-compression: ~20fps
>>>>>
>>>>> Note the with cache results are from the second run after closing
>>>>> and opening the game to avoid the in-memory cache.
>>>>>
>>>>> Since we only really care about decompression I went with
>>>>> Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson
>>>>> who has benchmarked decompression speeds.
>>>>
>>>> Attempting to side-step the "which compression is best" topic, I'll
>>>> just mention:
>>>> zlib has been around for a longer time than many others, so
>>>> a) chances are smaller that vendors ship their own, but even if they do
>>>> b) the API should be stable across the system and bundled one.
>>>>
>>>> If not we can reconsider if things get hairy ;-)
>>>>
>>>> A couple of small suggestions below.
>>>>
>>>>> +ZLIB_REQUIRED=1.2.8
>>>>
>>>> Any particular reason behind this version - afaict it's released in
>>>> 2013 and I'm wondering if some distros may be slow/missing it.
>>>
>>> It's what's shipped with Fedora and therefore what I tested with. If
>>> distros are lagging behind I don't think this is a problem we need to
>>> be concerned with, it may prompt them to upgrade which I don't think
>>> is a bad thing.
>>
>> I was thinking about Debian and friends, which tend to be slower than
>> others. From what I can tell they rarely consider external factors as
>> a reason to update.
>>
>> That aside, they have 1.2.7 for "oldstable" and 1.2.8 for everything
>> else so everything's fine.
>
> FWIW, RHEL 7.2 (haven't looked at 7.3 yet) only has zlib 1.2.7. Mesa
> builds fine with 1.2.7 if I override ZLIB_CFLAGS and ZLIB_LIBS, but I
> haven't tested the shader cache. Can we lower the check to 1.2.7?
> Otherwise, I guess I could hack around it in our build script.

Hi Brian,

I meant to reply to this earlier (was reminded by recent discussion on
IRC). It should be fine to lower this to 1.2.7.

Tim

> -Brian

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
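The `zlib >= $ZLIB_REQUIRED` check discussed in this message comes down to a component-wise comparison of dotted version numbers. A minimal sketch in C, assuming purely numeric components: `version_at_least` is a hypothetical helper written for illustration and is not part of Mesa, zlib or pkg-config, whose real comparison handles more cases.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 if dotted version `have` is at least
 * `need`, comparing numeric components left to right. A missing
 * component counts as 0, so "1.2" is treated as older than "1.2.8". */
static int
version_at_least(const char *have, const char *need)
{
   for (;;) {
      char *have_end, *need_end;
      unsigned long h = strtoul(have, &have_end, 10);
      unsigned long n = strtoul(need, &need_end, 10);

      if (h != n)
         return h > n;

      /* Stop once neither side has a further ".N" component. */
      if (*have_end != '.' && *need_end != '.')
         return 1;

      have = (*have_end == '.') ? have_end + 1 : have_end;
      need = (*need_end == '.') ? need_end + 1 : need_end;
   }
}
```

Under this sketch, 1.2.7 fails a 1.2.8 requirement and passes a 1.2.7 one. Comparing numbers rather than strings matters once a component reaches two digits, as in 1.2.11.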
Re: [Mesa-dev] [PATCH V2] util/disk_cache: compress individual cache entries
On 03/04/2017 07:12 AM, Emil Velikov wrote:
> On 2 March 2017 at 21:52, Timothy Arceri wrote:
>> On 03/03/17 01:49, Emil Velikov wrote:
>>> Hi Tim,
>>>
>>> On 2 March 2017 at 01:36, Timothy Arceri wrote:
>>>> This reduces the cache size for Deus Ex from ~160M to ~30M for
>>>> radeonsi.
>>>>
>>>> I'm also seeing the following improvements in minimum fps in the
>>>> Shadow of Mordor benchmark:
>>>>
>>>> no-cache:                   ~10fps
>>>> with-cache-no-compression:  ~15fps
>>>> with-cache-and-compression: ~20fps
>>>>
>>>> Note the with cache results are from the second run after closing
>>>> and opening the game to avoid the in-memory cache.
>>>>
>>>> Since we only really care about decompression I went with
>>>> Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson
>>>> who has benchmarked decompression speeds.
>>>
>>> Attempting to side-step the "which compression is best" topic, I'll
>>> just mention:
>>> zlib has been around for a longer time than many others, so
>>> a) chances are smaller that vendors ship their own, but even if they do
>>> b) the API should be stable across the system and bundled one.
>>>
>>> If not we can reconsider if things get hairy ;-)
>>>
>>> A couple of small suggestions below.
>>>
>>>> +ZLIB_REQUIRED=1.2.8
>>>
>>> Any particular reason behind this version - afaict it's released in
>>> 2013 and I'm wondering if some distros may be slow/missing it.
>>
>> It's what's shipped with Fedora and therefore what I tested with. If
>> distros are lagging behind I don't think this is a problem we need to
>> be concerned with, it may prompt them to upgrade which I don't think
>> is a bad thing.
>
> I was thinking about Debian and friends, which tend to be slower than
> others. From what I can tell they rarely consider external factors as
> a reason to update.
>
> That aside, they have 1.2.7 for "oldstable" and 1.2.8 for everything
> else so everything's fine.

FWIW, RHEL 7.2 (haven't looked at 7.3 yet) only has zlib 1.2.7. Mesa
builds fine with 1.2.7 if I override ZLIB_CFLAGS and ZLIB_LIBS, but I
haven't tested the shader cache. Can we lower the check to 1.2.7?
Otherwise, I guess I could hack around it in our build script.

-Brian
Re: [Mesa-dev] [PATCH V2] util/disk_cache: compress individual cache entries
On 2 March 2017 at 21:52, Timothy Arceri wrote:
>
> On 03/03/17 01:49, Emil Velikov wrote:
>>
>> Hi Tim,
>>
>> On 2 March 2017 at 01:36, Timothy Arceri wrote:
>>>
>>> This reduces the cache size for Deus Ex from ~160M to ~30M for
>>> radeonsi.
>>>
>>> I'm also seeing the following improvements in minimum fps in the
>>> Shadow of Mordor benchmark:
>>>
>>> no-cache:                   ~10fps
>>> with-cache-no-compression:  ~15fps
>>> with-cache-and-compression: ~20fps
>>>
>>> Note the with cache results are from the second run after closing
>>> and opening the game to avoid the in-memory cache.
>>>
>>> Since we only really care about decompression I went with
>>> Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson
>>> who has benchmarked decompression speeds.
>>>
>> Attempting to side-step the "which compression is best" topic, I'll
>> just mention:
>> zlib has been around for a longer time than many others, so
>> a) chances are smaller that vendors ship their own, but even if they do
>> b) the API should be stable across the system and bundled one.
>>
>> If not we can reconsider if things get hairy ;-)
>>
>> A couple of small suggestions below.
>>
>>
>>> +ZLIB_REQUIRED=1.2.8
>>>
>> Any particular reason behind this version - afaict it's released in
>> 2013 and I'm wondering if some distros may be slow/missing it.
>>
>
> It's what's shipped with Fedora and therefore what I tested with. If distros
> are lagging behind I don't think this is a problem we need to be concerned
> with, it may prompt them to upgrade which I don't think is a bad thing.
>
I was thinking about Debian and friends, which tend to be slower than
others. From what I can tell they rarely consider external factors as
a reason to update.

That aside, they have 1.2.7 for "oldstable" and 1.2.8 for everything
else so everything's fine.

-Emil
Re: [Mesa-dev] [PATCH V2] util/disk_cache: compress individual cache entries
On 03/03/17 01:49, Emil Velikov wrote:
> Hi Tim,
>
> On 2 March 2017 at 01:36, Timothy Arceri wrote:
>> This reduces the cache size for Deus Ex from ~160M to ~30M for
>> radeonsi.
>>
>> I'm also seeing the following improvements in minimum fps in the
>> Shadow of Mordor benchmark:
>>
>> no-cache:                   ~10fps
>> with-cache-no-compression:  ~15fps
>> with-cache-and-compression: ~20fps
>>
>> Note the with cache results are from the second run after closing
>> and opening the game to avoid the in-memory cache.
>>
>> Since we only really care about decompression I went with
>> Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson
>> who has benchmarked decompression speeds.
>
> Attempting to side-step the "which compression is best" topic, I'll
> just mention:
> zlib has been around for a longer time than many others, so
> a) chances are smaller that vendors ship their own, but even if they do
> b) the API should be stable across the system and bundled one.
>
> If not we can reconsider if things get hairy ;-)
>
> A couple of small suggestions below.
>
>> +ZLIB_REQUIRED=1.2.8
>
> Any particular reason behind this version - afaict it's released in
> 2013 and I'm wondering if some distros may be slow/missing it.

It's what's shipped with Fedora and therefore what I tested with. If
distros are lagging behind I don't think this is a problem we need to be
concerned with, it may prompt them to upgrade which I don't think is a
bad thing.

>> @@ -36,20 +36,22 @@ libmesautil_la_CPPFLAGS = \
>>     -I$(top_srcdir)/src/mesa \
>>     -I$(top_srcdir)/src/gallium/include \
>>     -I$(top_srcdir)/src/gallium/auxiliary \
>>     $(VISIBILITY_CFLAGS) \
>>     $(MSVC2013_COMPAT_CFLAGS)
>
> Add ZLIB_CFLAGS to the above.
>
>>  libmesautil_la_SOURCES = \
>>     $(MESA_UTIL_FILES) \
>>     $(MESA_UTIL_GENERATED_FILES)
>>
>> +libmesautil_la_LIBADD = -lz
>> +
>
> Use ZLIB_LIBS here.
>
> Also do squash this hunk in. It should handle the Android builds.

Thanks!

> --- a/src/util/Android.mk
> +++ b/src/util/Android.mk
> @@ -53,6 +53,8 @@ $(LOCAL_GENERATED_SOURCES): PRIVATE_CUSTOM_TOOL = $(PRIVATE_PYTHON) $^ > $@
>  $(LOCAL_GENERATED_SOURCES): $(intermediates)/%.c: $(LOCAL_PATH)/%.py
>     $(transform-generated-source)
>
> +LOCAL_SHARED_LIBRARIES := libz
> +
>  include $(MESA_COMMON_MK)
>  include $(BUILD_STATIC_LIBRARY)
>
> @@ -88,5 +90,7 @@ $(LOCAL_GENERATED_SOURCES): PRIVATE_CUSTOM_TOOL = $(PRIVATE_PYTHON) $^ > $@
>  $(LOCAL_GENERATED_SOURCES): $(intermediates)/%.c: $(LOCAL_PATH)/%.py
>     $(transform-generated-source)
>
> +LOCAL_SHARED_LIBRARIES := libz
> +
>
> Thanks
> Emil
Re: [Mesa-dev] [PATCH V2] util/disk_cache: compress individual cache entries
With Emil's suggestions applied:

Acked-by: Marek Olšák

Whoever wants a different compression algorithm can send a patch.

Marek

On Thu, Mar 2, 2017 at 2:36 AM, Timothy Arceri wrote:
> This reduces the cache size for Deus Ex from ~160M to ~30M for
> radeonsi.
>
> I'm also seeing the following improvements in minimum fps in the
> Shadow of Mordor benchmark:
>
> no-cache:                   ~10fps
> with-cache-no-compression:  ~15fps
> with-cache-and-compression: ~20fps
>
> Note the with cache results are from the second run after closing
> and opening the game to avoid the in-memory cache.
>
> Since we only really care about decompression I went with
> Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson
> who has benchmarked decompression speeds.
>
> V2: fix pointer increments for reading/writing cache entry
>     file data.
> ---
>  configure.ac          |   4 ++
>  src/util/Makefile.am  |   2 +
>  src/util/disk_cache.c | 173 +++---
>  3 files changed, 156 insertions(+), 23 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 890a379..9fde95f 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -92,20 +92,21 @@ LIBVA_REQUIRED=0.38.0
>  VDPAU_REQUIRED=1.1
>  WAYLAND_REQUIRED=1.11
>  XCB_REQUIRED=1.9.3
>  XCBDRI2_REQUIRED=1.8
>  XCBGLX_REQUIRED=1.8.1
>  XDAMAGE_REQUIRED=1.1
>  XSHMFENCE_REQUIRED=1.1
>  XVMC_REQUIRED=1.0.6
>  PYTHON_MAKO_REQUIRED=0.8.0
>  LIBSENSORS_REQUIRED=4.0.0
> +ZLIB_REQUIRED=1.2.8
>
>  dnl LLVM versions
>  LLVM_REQUIRED_GALLIUM=3.3.0
>  LLVM_REQUIRED_OPENCL=3.6.0
>  LLVM_REQUIRED_R600=3.6.0
>  LLVM_REQUIRED_RADEONSI=3.6.0
>  LLVM_REQUIRED_RADV=3.9.0
>  LLVM_REQUIRED_SWR=3.6.0
>
>  dnl Check for progs
> @@ -777,20 +778,23 @@ darwin*)
>      AC_CHECK_FUNCS([clock_gettime], [CLOCK_LIB=],
>                     [AC_CHECK_LIB([rt], [clock_gettime], [CLOCK_LIB=-lrt],
>                                   [AC_MSG_ERROR([Could not find clock_gettime])])])
>      AC_SUBST([CLOCK_LIB])
>      ;;
>  esac
>
>  dnl See if posix_memalign is available
>  AC_CHECK_FUNC([posix_memalign], [DEFINES="$DEFINES -DHAVE_POSIX_MEMALIGN"])
>
> +dnl Check for zlib
> +PKG_CHECK_MODULES([ZLIB], [zlib >= $ZLIB_REQUIRED])
> +
>  dnl Check for pthreads
>  AX_PTHREAD
>  if test "x$ax_pthread_ok" = xno; then
>      AC_MSG_ERROR([Building mesa on this platform requires pthreads])
>  fi
>  dnl AX_PTHREADS leaves PTHREAD_LIBS empty for gcc and sets PTHREAD_CFLAGS
>  dnl to -pthread, which causes problems if we need -lpthread to appear in
>  dnl pkgconfig files. Since Android doesn't have a pthread lib, this check
>  dnl is not valid for that platform.
>  if test "x$android" = xno; then
> diff --git a/src/util/Makefile.am b/src/util/Makefile.am
> index ae50a3b..e46d893 100644
> --- a/src/util/Makefile.am
> +++ b/src/util/Makefile.am
> @@ -36,20 +36,22 @@ libmesautil_la_CPPFLAGS = \
>     -I$(top_srcdir)/src/mesa \
>     -I$(top_srcdir)/src/gallium/include \
>     -I$(top_srcdir)/src/gallium/auxiliary \
>     $(VISIBILITY_CFLAGS) \
>     $(MSVC2013_COMPAT_CFLAGS)
>
>  libmesautil_la_SOURCES = \
>     $(MESA_UTIL_FILES) \
>     $(MESA_UTIL_GENERATED_FILES)
>
> +libmesautil_la_LIBADD = -lz
> +
>  roundeven_test_LDADD = -lm
>
>  check_PROGRAMS = u_atomic_test roundeven_test
>  TESTS = $(check_PROGRAMS)
>
>  BUILT_SOURCES = $(MESA_UTIL_GENERATED_FILES)
>  CLEANFILES = $(BUILT_SOURCES)
>  EXTRA_DIST = \
>     format_srgb.py \
>     SConscript \
> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
> index f8e9948..fafd329 100644
> --- a/src/util/disk_cache.c
> +++ b/src/util/disk_cache.c
> @@ -30,20 +30,21 @@
>  #include
>  #include
>  #include
>  #include
>  #include
>  #include
>  #include
>  #include
>  #include
>  #include
> +#include "zlib.h"
>
>  #include "util/crc32.h"
>  #include "util/u_atomic.h"
>  #include "util/mesa-sha1.h"
>  #include "util/ralloc.h"
>  #include "main/errors.h"
>
>  #include "disk_cache.h"
>
>  /* Number of bits to mask off from a cache key to get an index. */
> @@ -638,30 +639,106 @@ disk_cache_remove(struct disk_cache *cache, cache_key key)
>        return;
>     }
>
>     unlink(filename);
>     free(filename);
>
>     if (sb.st_size)
>        p_atomic_add(cache->size, - sb.st_size);
>  }
>
> +/* From the zlib docs:
> + *    "If the memory is available, buffers sizes on the order of 128K or 256K
> + *    bytes should be used."
> + */
> +#define BUFSIZE 256 * 1024
> +
> +/**
> + * Compresses cache entry in memory and writes it to disk. Returns the size
> + * of the data written to disk.
> + */
> +static size_t
> +deflate_and_write_to_disk(const void *in_data, size_t in_data_size, int dest,
> +                          char *filename)
> +{
> +   unsigned char out[BUFSIZE];
> +
> +   /* allocate deflate state */
> +   z_stream strm;
> +   strm.zalloc = Z_NULL;
> +   strm.zfree = Z_NULL;
> +   strm.opaque = Z_NULL;
> +
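A side note on the `#define BUFSIZE 256 * 1024` line quoted above: in the array size `out[BUFSIZE]` the bare expression expands harmlessly, but an unparenthesised macro body can change meaning next to a higher-precedence operator. A small sketch of the difference, using hypothetical macro and function names that do not appear in the patch:

```c
#include <stddef.h>

#define BUFSIZE_BARE   256 * 1024     /* same form as in the quoted patch */
#define BUFSIZE_PAREN  (256 * 1024)   /* defensive, parenthesised form */

/* Expands to total / 256 * 1024, i.e. (total / 256) * 1024. */
static size_t
chunks_bare(size_t total)
{
   return total / BUFSIZE_BARE;
}

/* Expands to total / (256 * 1024), the intended number of full buffers. */
static size_t
chunks_paren(size_t total)
{
   return total / BUFSIZE_PAREN;
}
```

For a 1 MiB input, `chunks_paren()` yields 4 while `chunks_bare()` yields 4194304, which is why parenthesising such macros is the usual habit even when the current uses are safe.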
Re: [Mesa-dev] [PATCH V2] util/disk_cache: compress individual cache entries
Hi Tim,

On 2 March 2017 at 01:36, Timothy Arceri wrote:
> This reduces the cache size for Deus Ex from ~160M to ~30M for
> radeonsi.
>
> I'm also seeing the following improvements in minimum fps in the
> Shadow of Mordor benchmark:
>
> no-cache:                   ~10fps
> with-cache-no-compression:  ~15fps
> with-cache-and-compression: ~20fps
>
> Note the with cache results are from the second run after closing
> and opening the game to avoid the in-memory cache.
>
> Since we only really care about decompression I went with
> Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson
> who has benchmarked decompression speeds.
>
Attempting to side-step the "which compression is best" topic, I'll
just mention:
zlib has been around for a longer time than many others, so
a) chances are smaller that vendors ship their own, but even if they do
b) the API should be stable across the system and bundled one.

If not we can reconsider if things get hairy ;-)

A couple of small suggestions below.

> +ZLIB_REQUIRED=1.2.8
>
Any particular reason behind this version - afaict it's released in
2013 and I'm wondering if some distros may be slow/missing it.

> @@ -36,20 +36,22 @@ libmesautil_la_CPPFLAGS = \
>     -I$(top_srcdir)/src/mesa \
>     -I$(top_srcdir)/src/gallium/include \
>     -I$(top_srcdir)/src/gallium/auxiliary \
>     $(VISIBILITY_CFLAGS) \
>     $(MSVC2013_COMPAT_CFLAGS)
Add ZLIB_CFLAGS to the above.

>
>  libmesautil_la_SOURCES = \
>     $(MESA_UTIL_FILES) \
>     $(MESA_UTIL_GENERATED_FILES)
>
> +libmesautil_la_LIBADD = -lz
> +
Use ZLIB_LIBS here.

Also do squash this hunk in. It should handle the Android builds.

--- a/src/util/Android.mk
+++ b/src/util/Android.mk
@@ -53,6 +53,8 @@ $(LOCAL_GENERATED_SOURCES): PRIVATE_CUSTOM_TOOL = $(PRIVATE_PYTHON) $^ > $@
 $(LOCAL_GENERATED_SOURCES): $(intermediates)/%.c: $(LOCAL_PATH)/%.py
    $(transform-generated-source)

+LOCAL_SHARED_LIBRARIES := libz
+
 include $(MESA_COMMON_MK)
 include $(BUILD_STATIC_LIBRARY)

@@ -88,5 +90,7 @@ $(LOCAL_GENERATED_SOURCES): PRIVATE_CUSTOM_TOOL = $(PRIVATE_PYTHON) $^ > $@
 $(LOCAL_GENERATED_SOURCES): $(intermediates)/%.c: $(LOCAL_PATH)/%.py
    $(transform-generated-source)

+LOCAL_SHARED_LIBRARIES := libz
+

Thanks
Emil
[Mesa-dev] [PATCH V2] util/disk_cache: compress individual cache entries
This reduces the cache size for Deus Ex from ~160M to ~30M for
radeonsi.

I'm also seeing the following improvements in minimum fps in the
Shadow of Mordor benchmark:

no-cache:                   ~10fps
with-cache-no-compression:  ~15fps
with-cache-and-compression: ~20fps

Note the with cache results are from the second run after closing
and opening the game to avoid the in-memory cache.

Since we only really care about decompression I went with
Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson
who has benchmarked decompression speeds.

V2: fix pointer increments for reading/writing cache entry
    file data.
---
 configure.ac          |   4 ++
 src/util/Makefile.am  |   2 +
 src/util/disk_cache.c | 173 +++---
 3 files changed, 156 insertions(+), 23 deletions(-)

diff --git a/configure.ac b/configure.ac
index 890a379..9fde95f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -92,20 +92,21 @@ LIBVA_REQUIRED=0.38.0
 VDPAU_REQUIRED=1.1
 WAYLAND_REQUIRED=1.11
 XCB_REQUIRED=1.9.3
 XCBDRI2_REQUIRED=1.8
 XCBGLX_REQUIRED=1.8.1
 XDAMAGE_REQUIRED=1.1
 XSHMFENCE_REQUIRED=1.1
 XVMC_REQUIRED=1.0.6
 PYTHON_MAKO_REQUIRED=0.8.0
 LIBSENSORS_REQUIRED=4.0.0
+ZLIB_REQUIRED=1.2.8

 dnl LLVM versions
 LLVM_REQUIRED_GALLIUM=3.3.0
 LLVM_REQUIRED_OPENCL=3.6.0
 LLVM_REQUIRED_R600=3.6.0
 LLVM_REQUIRED_RADEONSI=3.6.0
 LLVM_REQUIRED_RADV=3.9.0
 LLVM_REQUIRED_SWR=3.6.0

 dnl Check for progs
@@ -777,20 +778,23 @@ darwin*)
     AC_CHECK_FUNCS([clock_gettime], [CLOCK_LIB=],
                    [AC_CHECK_LIB([rt], [clock_gettime], [CLOCK_LIB=-lrt],
                                  [AC_MSG_ERROR([Could not find clock_gettime])])])
     AC_SUBST([CLOCK_LIB])
     ;;
 esac

 dnl See if posix_memalign is available
 AC_CHECK_FUNC([posix_memalign], [DEFINES="$DEFINES -DHAVE_POSIX_MEMALIGN"])

+dnl Check for zlib
+PKG_CHECK_MODULES([ZLIB], [zlib >= $ZLIB_REQUIRED])
+
 dnl Check for pthreads
 AX_PTHREAD
 if test "x$ax_pthread_ok" = xno; then
     AC_MSG_ERROR([Building mesa on this platform requires pthreads])
 fi
 dnl AX_PTHREADS leaves PTHREAD_LIBS empty for gcc and sets PTHREAD_CFLAGS
 dnl to -pthread, which causes problems if we need -lpthread to appear in
 dnl pkgconfig files. Since Android doesn't have a pthread lib, this check
 dnl is not valid for that platform.
 if test "x$android" = xno; then
diff --git a/src/util/Makefile.am b/src/util/Makefile.am
index ae50a3b..e46d893 100644
--- a/src/util/Makefile.am
+++ b/src/util/Makefile.am
@@ -36,20 +36,22 @@ libmesautil_la_CPPFLAGS = \
    -I$(top_srcdir)/src/mesa \
    -I$(top_srcdir)/src/gallium/include \
    -I$(top_srcdir)/src/gallium/auxiliary \
    $(VISIBILITY_CFLAGS) \
    $(MSVC2013_COMPAT_CFLAGS)

 libmesautil_la_SOURCES = \
    $(MESA_UTIL_FILES) \
    $(MESA_UTIL_GENERATED_FILES)

+libmesautil_la_LIBADD = -lz
+
 roundeven_test_LDADD = -lm

 check_PROGRAMS = u_atomic_test roundeven_test
 TESTS = $(check_PROGRAMS)

 BUILT_SOURCES = $(MESA_UTIL_GENERATED_FILES)
 CLEANFILES = $(BUILT_SOURCES)
 EXTRA_DIST = \
    format_srgb.py \
    SConscript \
diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index f8e9948..fafd329 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -30,20 +30,21 @@
 #include
 #include
 #include
 #include
 #include
 #include
 #include
 #include
 #include
 #include
+#include "zlib.h"

 #include "util/crc32.h"
 #include "util/u_atomic.h"
 #include "util/mesa-sha1.h"
 #include "util/ralloc.h"
 #include "main/errors.h"

 #include "disk_cache.h"

 /* Number of bits to mask off from a cache key to get an index. */
@@ -638,30 +639,106 @@ disk_cache_remove(struct disk_cache *cache, cache_key key)
       return;
    }

    unlink(filename);
    free(filename);

    if (sb.st_size)
       p_atomic_add(cache->size, - sb.st_size);
 }

+/* From the zlib docs:
+ *    "If the memory is available, buffers sizes on the order of 128K or 256K
+ *    bytes should be used."
+ */
+#define BUFSIZE 256 * 1024
+
+/**
+ * Compresses cache entry in memory and writes it to disk. Returns the size
+ * of the data written to disk.
+ */
+static size_t
+deflate_and_write_to_disk(const void *in_data, size_t in_data_size, int dest,
+                          char *filename)
+{
+   unsigned char out[BUFSIZE];
+
+   /* allocate deflate state */
+   z_stream strm;
+   strm.zalloc = Z_NULL;
+   strm.zfree = Z_NULL;
+   strm.opaque = Z_NULL;
+   strm.next_in = (uint8_t *) in_data;
+   strm.avail_in = in_data_size;
+
+   int ret = deflateInit(&strm, Z_BEST_COMPRESSION);
+   if (ret != Z_OK)
+      return 0;
+
+   /* compress until end of in_data */
+   size_t compressed_size = 0;
+   int flush;
+   do {
+      int remaining = in_data_size - BUFSIZE;
+      flush = remaining > 0 ? Z_NO_FLUSH : Z_FINISH;
+      in_data_size -= BUFSIZE;
+
+      /* Run deflate() on input until the output buffer is not full (which
+       * means there is no more data to deflate).
+
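The loop in the patch decides per iteration whether to pass `Z_NO_FLUSH` or `Z_FINISH`, based on how much input is left relative to the buffer size. That decision can be sketched on its own, without zlib, so it builds anywhere. Everything below is illustrative: `SKETCH_NO_FLUSH`, `SKETCH_FINISH`, `next_chunk` and `count_chunks` are hypothetical stand-ins, and the sketch keeps the remaining size unsigned and never subtracts more than is left, which is a design choice of this sketch rather than what the patch does.

```c
#include <stddef.h>

/* Hypothetical stand-ins for zlib's Z_NO_FLUSH and Z_FINISH. */
enum sketch_flush { SKETCH_NO_FLUSH, SKETCH_FINISH };

#define SKETCH_BUFSIZE ((size_t) 256 * 1024)

/* Returns the size of the next chunk to feed to the compressor and
 * reports whether it is the final one. */
static size_t
next_chunk(size_t remaining, enum sketch_flush *flush)
{
   if (remaining > SKETCH_BUFSIZE) {
      *flush = SKETCH_NO_FLUSH;
      return SKETCH_BUFSIZE;
   }
   *flush = SKETCH_FINISH;
   return remaining;
}

/* Walks an input of `in_size` bytes chunk by chunk and returns how many
 * compressor calls would be made, the last one with the finish flag. */
static unsigned
count_chunks(size_t in_size)
{
   unsigned chunks = 0;
   enum sketch_flush flush;

   do {
      size_t n = next_chunk(in_size, &flush);
      in_size -= n;   /* n never exceeds in_size, so this cannot wrap */
      chunks++;
   } while (flush != SKETCH_FINISH);

   return chunks;
}
```

An input of exactly one buffer is handled in a single finishing call, and an empty input still produces one finishing call so the stream is terminated.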