Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On 13 January 2017 at 21:59, Jason Ekstrandwrote: > Also, something I would like to see (maybe a follow-on patch?) would a > change to the mesa internal API to be able to put the SHA context on the > stack and not need to malloc it. It's not really a memory or cycle-saving > thing so much as it leaves one fewer cleanup paths you have to worry about. > Fun thing - I've already done that with my earlier experiments. The series is at https://patchwork.freedesktop.org/series/18510/ -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Jan 18, 2017 8:07 AM, "Emil Velikov"wrote: On 18 January 2017 at 09:43, Timothy Arceri wrote: > I'm all for > importing an implementation I just wanted to be sure we make a good > choice as we will likely be stuck with it for a long time. > Would you like to give it some more testing/benchmarking, or I can consider this ack-by ;-) I shimmied through it and didn't see anything amiss. Acked-by: Jason Ekstrand ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On 18 January 2017 at 09:43, Timothy Arceriwrote: > I'm all for > importing an implementation I just wanted to be sure we make a good > choice as we will likely be stuck with it for a long time. > Would you like to give it some more testing/benchmarking, or I can consider this ack-by ;-) Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On 18 January 2017 at 09:57, Steven Newburywrote: > On Wed, 2017-01-18 at 20:43 +1100, Timothy Arceri wrote: >> On Wed, 2017-01-18 at 08:30 +, Steven Newbury wrote > [SNIP] >> > >> > Why not leave in a build time option for whichever is the fastest >> > (non- >> > OpenSSL) external SHA1 implementation but default or fallback to >> > whatever gets pulled into the tree? >> >> Because OpenSSL is not the only implementation used by games, leaving >> this option in just leaves room for future problems. I'm all for >> importing an implementation I just wanted to be sure we make a good >> choice as we will likely be stuck with it for a long time. >> > Okay, fair enough. Pressure needs to be kept on game producers not to > statically link system libraries though. I think that everyone here will agree with you. Esp. since we had similar 'fun' experiences with the C++ runtime, libudev and alike. Then again, I doubt this [mailing list] is the best medium for that. Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Wed, 2017-01-18 at 20:43 +1100, Timothy Arceri wrote: > On Wed, 2017-01-18 at 08:30 +, Steven Newbury wrote [SNIP] > > > > Why not leave in a build time option for whichever is the fastest > > (non- > > OpenSSL) external SHA1 implementation but default or fallback to > > whatever gets pulled into the tree? > > Because OpenSSL is not the only implementation used by games, leaving > this option in just leaves room for future problems. I'm all for > importing an implementation I just wanted to be sure we make a good > choice as we will likely be stuck with it for a long time. > Okay, fair enough. Pressure needs to be kept on game producers not to statically link system libraries though. SHA1 might be relatively unimportant, but it's part of a trend of importing/bundling code instead of using shared libraries that many projects find themselves pushed into to work around yet other projects doing the same thing. Mesa has always been quite good in this regard. signature.asc Description: This is a digitally signed message part ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Wed, 2017-01-18 at 08:30 +, Steven Newbury wrote: > On Mon, 2017-01-16 at 16:52 +0300, Vladislav Egorov wrote: > > > > 16.01.2017 16:13, Emil Velikov пишет: > > > Hi Vladislav, > > > > > > On 14 January 2017 at 01:50, Vladislav Egorov> > om > > > > wrote: > > > > 14.01.2017 01:45, Timothy Arceri пишет: > > > > > I'm asking for a chance to test before we jump in, its > > > > > probably > > > > > not a > > > > > big deal and I may even still be able to reduce my use of > > > > > hashing but > > > > > it would be nice to be given a few days to test and even > > > > > explore > > > > > alternatives before jumping on this implementation. > > > > > > > > A very quick and very dirty simple benchmark. I took shader- > > > > cache > > > > from > > > > github, branch shader-cache39. Then I've applied my > > > > preprocessor > > > > patch on > > > > top (because shader-cache still uses preprocessor even if the > > > > shader is > > > > cached and it was painful to see preprocessor taking more than > > > > half of the > > > > whole time). Then I've compiled it with openssl and with the > > > > Emil's patch. > > > > Full run on shader-db (300Mb+ of shaders) with shader-cache > > > > warmed up. It > > > > takes 78s, spends in libcrypto 0.27%. With OpenBSD SHA1 it runs > > > > approximately the same time, spends 0.53% in SHA1Transform() > > > > and > > > > other SHA1* > > > > functions. Subtest - 46Mb of shaders from Total War: Attila - > > > > 3.10s (for > > > > some reason, the cache works much faster on smaller subsets > > > > than > > > > on full > > > > shader-db). 1.08% were spent in libcrypto, 1.04% in > > > > sha1_block_data_order_avx2(). With OpenBSD 3.07s - 2.27% in > > > > SHA1Transform() > > > > and other SHA1* functions. > > > > > > > > > > Did you mean "shader-db" with the "shader-cache" references above > > > ? > > > I > > > cannot find any projects with the latter name on github. > > > > I meant Timothy's github: https://github.com/tarceri/Mesa > > > > > > Overall not that significant in context of shader-cache, but as > > > > expected, on > > > > Haswell it's twice slower than OpenSSL's AVX2 implementation. > > Why not leave in a build time option for whichever is the fastest > (non- > OpenSSL) external SHA1 implementation but default or fallback to > whatever gets pulled into the tree? Because OpenSSL is not the only implementation used by games, leaving this option in just leaves room for future problems. I'm all for importing an implementation I just wanted to be sure we make a good choice as we will likely be stuck with it for a long time. > That way integrators/distributions > get to decide if they want to support the extra dependency and it > removes almost all the variations and extra support machinery. > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Mon, 2017-01-16 at 16:52 +0300, Vladislav Egorov wrote: > > 16.01.2017 16:13, Emil Velikov пишет: > > Hi Vladislav, > > > > On 14 January 2017 at 01:50, Vladislav Egorov> > wrote: > > > 14.01.2017 01:45, Timothy Arceri пишет: > > > > I'm asking for a chance to test before we jump in, its probably > > > > not a > > > > big deal and I may even still be able to reduce my use of > > > > hashing but > > > > it would be nice to be given a few days to test and even > > > > explore > > > > alternatives before jumping on this implementation. > > > > > > A very quick and very dirty simple benchmark. I took shader-cache > > > from > > > github, branch shader-cache39. Then I've applied my preprocessor > > > patch on > > > top (because shader-cache still uses preprocessor even if the > > > shader is > > > cached and it was painful to see preprocessor taking more than > > > half of the > > > whole time). Then I've compiled it with openssl and with the > > > Emil's patch. > > > Full run on shader-db (300Mb+ of shaders) with shader-cache > > > warmed up. It > > > takes 78s, spends in libcrypto 0.27%. With OpenBSD SHA1 it runs > > > approximately the same time, spends 0.53% in SHA1Transform() and > > > other SHA1* > > > functions. Subtest - 46Mb of shaders from Total War: Attila - > > > 3.10s (for > > > some reason, the cache works much faster on smaller subsets than > > > on full > > > shader-db). 1.08% were spent in libcrypto, 1.04% in > > > sha1_block_data_order_avx2(). With OpenBSD 3.07s - 2.27% in > > > SHA1Transform() > > > and other SHA1* functions. > > > > > > > Did you mean "shader-db" with the "shader-cache" references above ? > > I > > cannot find any projects with the latter name on github. > > I meant Timothy's github: https://github.com/tarceri/Mesa > > > > Overall not that significant in context of shader-cache, but as > > > expected, on > > > Haswell it's twice slower than OpenSSL's AVX2 implementation. > > Why not leave in a build time option for whichever is the fastest (non- OpenSSL) external SHA1 implementation but default or fallback to whatever gets pulled into the tree? That way integrators/distributions get to decide if they want to support the extra dependency and it removes almost all the variations and extra support machinery. signature.asc Description: This is a digitally signed message part ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
16.01.2017 16:13, Emil Velikov пишет: Hi Vladislav, On 14 January 2017 at 01:50, Vladislav Egorovwrote: 14.01.2017 01:45, Timothy Arceri пишет: I'm asking for a chance to test before we jump in, its probably not a big deal and I may even still be able to reduce my use of hashing but it would be nice to be given a few days to test and even explore alternatives before jumping on this implementation. A very quick and very dirty simple benchmark. I took shader-cache from github, branch shader-cache39. Then I've applied my preprocessor patch on top (because shader-cache still uses preprocessor even if the shader is cached and it was painful to see preprocessor taking more than half of the whole time). Then I've compiled it with openssl and with the Emil's patch. Full run on shader-db (300Mb+ of shaders) with shader-cache warmed up. It takes 78s, spends in libcrypto 0.27%. With OpenBSD SHA1 it runs approximately the same time, spends 0.53% in SHA1Transform() and other SHA1* functions. Subtest - 46Mb of shaders from Total War: Attila - 3.10s (for some reason, the cache works much faster on smaller subsets than on full shader-db). 1.08% were spent in libcrypto, 1.04% in sha1_block_data_order_avx2(). With OpenBSD 3.07s - 2.27% in SHA1Transform() and other SHA1* functions. Did you mean "shader-db" with the "shader-cache" references above ? I cannot find any projects with the latter name on github. I meant Timothy's github: https://github.com/tarceri/Mesa Overall not that significant in context of shader-cache, but as expected, on Haswell it's twice slower than OpenSSL's AVX2 implementation. If I understood you correctly you're saying that despite this implementation being slower the actual total runtime remains unchanged. In the Attila case we even get a minimal improvement, although I'm inclined to discard that as within the error margin. That sounds interesting, but in all honesty I don't believe it's worth worrying unless we have usecase(s) where the runtime difference is noticeable. Thanks Emil Sure, it's just noise, it should be slower by 1%. When using hot shader cache it would be at most 1% difference, and at most ~2% difference on CPUs with hardware SHA like Goldmont. I am not sure about Vulkan. But yeah, it doesn't seem like there is something to worry about so far. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On 14 January 2017 at 06:25, Jonathan Graywrote: > On Fri, Jan 13, 2017 at 04:51:31PM +, Emil Velikov wrote: >> From: Emil Velikov >> >> At the moment we support 5+ different implementations each with varying >> amount of bugs - from thread safely problems [1], to outright broken >> implementation(s) [2] >> >> In order to accommodate these we have 150+ lines of configure script and >> extra two configure toggles. Whist an actual implementation being >> ~200loc and our current compat wrapping ~250. >> >> Let's not forget that different people use different code paths, thus >> effectively makes it harder to test and debug since the default >> implementation is automatically detected. >> >> To minimise all these lovely experiences, import the "100% Public >> Domain" OpenBSD sha1 implementation. Clearly document any changes needed >> to get building correctly, since many/most of those can be upstreamed >> making future syncs easier. > > I had feared that this would somehow collide with the symbols > in libc but it seems to build and run xorg/glxgears at least > on broadwell with i965. > Amazing, thanks for testing ! > Patches for OpenBSD go to tech@ and you should look at how portable > openssh and libressl handle systems that lack functions like > explicit_bzero, autoconf detects systems that lack functions or are > known to have broken implementations and alternate versions are > provided. Damien Miller described how this is handled for ssh in > https://www.openbsd.org/papers/portability.pdf > https://www.openbsd.org/papers/auug2005-portability/ > > The attribute could also be checked in autoconf as is already done > for various other attributes. > > Other parts seem odd, posix defines size_t as being in sys/types.h > not stddef.h for example. > > u_int* are bsd types which predate c99 types, I could see an > argument being made for changing the types there but it > would likely have to cover all the other hashes as well, > not just sha1. > Thanks for the tips Jonathan. I'll follow suite as/if we get this patch merged. To answer your question - I've opted for the C99 variants (headers/types) since we build mesa/gallium on non-POSIX platforms. With the former being almost universally supported these days. Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
Hi Vladislav, On 14 January 2017 at 01:50, Vladislav Egorovwrote: > 14.01.2017 01:45, Timothy Arceri пишет: >> >> I'm asking for a chance to test before we jump in, its probably not a >> big deal and I may even still be able to reduce my use of hashing but >> it would be nice to be given a few days to test and even explore >> alternatives before jumping on this implementation. > > A very quick and very dirty simple benchmark. I took shader-cache from > github, branch shader-cache39. Then I've applied my preprocessor patch on > top (because shader-cache still uses preprocessor even if the shader is > cached and it was painful to see preprocessor taking more than half of the > whole time). Then I've compiled it with openssl and with the Emil's patch. > Full run on shader-db (300Mb+ of shaders) with shader-cache warmed up. It > takes 78s, spends in libcrypto 0.27%. With OpenBSD SHA1 it runs > approximately the same time, spends 0.53% in SHA1Transform() and other SHA1* > functions. Subtest - 46Mb of shaders from Total War: Attila - 3.10s (for > some reason, the cache works much faster on smaller subsets than on full > shader-db). 1.08% were spent in libcrypto, 1.04% in > sha1_block_data_order_avx2(). With OpenBSD 3.07s - 2.27% in SHA1Transform() > and other SHA1* functions. > Did you mean "shader-db" with the "shader-cache" references above ? I cannot find any projects with the latter name on github. > Overall not that significant in context of shader-cache, but as expected, on > Haswell it's twice slower than OpenSSL's AVX2 implementation. If I understood you correctly you're saying that despite this implementation being slower the actual total runtime remains unchanged. In the Attila case we even get a minimal improvement, although I'm inclined to discard that as within the error margin. That sounds interesting, but in all honesty I don't believe it's worth worrying unless we have usecase(s) where the runtime difference is noticeable. Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
Yes! Simplifies current situation quite a bit and makes Vulkan Android build possible, I'll submit that if/when this one lands. There's one thing that caused problems with Android, I've marked it below .. with that removed; Acked-by: Tapani PälliOn 01/13/2017 06:51 PM, Emil Velikov wrote: From: Emil Velikov 8< + +__BEGIN_DECLS +void SHA1Init(SHA1_CTX *); +void SHA1Pad(SHA1_CTX *); +void SHA1Transform(uint32_t [5], const uint8_t [SHA1_BLOCK_LENGTH]); +void SHA1Update(SHA1_CTX *, const uint8_t *, size_t); +void SHA1Final(uint8_t [SHA1_DIGEST_LENGTH], SHA1_CTX *); +__END_DECLS __BEGIN_DECLS and __END_DECLS should be either removed or ifdeffed for Android, it does not exist. It's not used previously in Mesa so I would vote for removal. Thanks; // Tapani ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Sat, 2017-01-14 at 04:50 +0300, Vladislav Egorov wrote: > 14.01.2017 01:45, Timothy Arceri пишет: > > I'm asking for a chance to test before we jump in, its probably not > > a > > big deal and I may even still be able to reduce my use of hashing > > but > > it would be nice to be given a few days to test and even explore > > alternatives before jumping on this implementation. > > A very quick and very dirty simple benchmark. I took shader-cache > from > github, branch shader-cache39. Then I've applied my preprocessor > patch > on top (because shader-cache still uses preprocessor even if the > shader > is cached and it was painful to see preprocessor taking more than > half > of the whole time). Yeah we shouldn't need to do that, might take a look at this sometime. > Then I've compiled it with openssl and with the > Emil's patch. Full run on shader-db (300Mb+ of shaders) with > shader-cache warmed up. It takes 78s, spends in libcrypto 0.27%. > With > OpenBSD SHA1 it runs approximately the same time, spends 0.53% in > SHA1Transform() and other SHA1* functions. Subtest - 46Mb of shaders > from Total War: Attila - 3.10s (for some reason, the cache works > much > faster on smaller subsets than on full shader-db). 1.08% were spent > in > libcrypto, 1.04% in sha1_block_data_order_avx2(). With OpenBSD 3.07s > - > 2.27% in SHA1Transform() and other SHA1* functions. > > Overall not that significant in context of shader-cache, but as > expected, on Haswell it's twice slower than OpenSSL's AVX2 > implementation. Thanks for testing. That doesn't seem too bad. Also now that I think about it in the draw path we don't really hash much its just really the in-memory cache keys and a few other things which should be nice an quick. > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Fri, Jan 13, 2017 at 04:51:31PM +, Emil Velikov wrote: > From: Emil Velikov> > At the moment we support 5+ different implementations each with varying > amount of bugs - from thread safely problems [1], to outright broken > implementation(s) [2] > > In order to accommodate these we have 150+ lines of configure script and > extra two configure toggles. Whist an actual implementation being > ~200loc and our current compat wrapping ~250. > > Let's not forget that different people use different code paths, thus > effectively makes it harder to test and debug since the default > implementation is automatically detected. > > To minimise all these lovely experiences, import the "100% Public > Domain" OpenBSD sha1 implementation. Clearly document any changes needed > to get building correctly, since many/most of those can be upstreamed > making future syncs easier. I had feared that this would somehow collide with the symbols in libc but it seems to build and run xorg/glxgears at least on broadwell with i965. Patches for OpenBSD go to tech@ and you should look at how portable openssh and libressl handle systems that lack functions like explicit_bzero, autoconf detects systems that lack functions or are known to have broken implementations and alternate versions are provided. Damien Miller described how this is handled for ssh in https://www.openbsd.org/papers/portability.pdf https://www.openbsd.org/papers/auug2005-portability/ The attribute could also be checked in autoconf as is already done for various other attributes. Other parts seem odd, posix defines size_t as being in sys/types.h not stddef.h for example. u_int* are bsd types which predate c99 types, I could see an argument being made for changing the types there but it would likely have to cover all the other hashes as well, not just sha1. > > As an added bonus this will avoid all the 'fun' experiences trying to > integrate it with the Android and SCons builds. > > Bugzilla [1]: https://bugs.freedesktop.org/show_bug.cgi?id=94904 > Bugzilla [2]: https://bugs.freedesktop.org/show_bug.cgi?id=97967 > Cc: Mark Janes > Cc: Vinson Lee > Cc: Tapani P??lli > Cc: Jonathan Gray > Signed-off-by: Emil Velikov > --- > configure.ac | 161 +-- > src/compiler/glsl/tests/cache_test.c | 5 - > src/mesa/main/shaderapi.c| 6 - > src/util/Makefile.am | 3 - > src/util/Makefile.sources| 2 + > src/util/SConscript | 5 - > src/util/disk_cache.c| 4 - > src/util/disk_cache.h| 42 -- > src/util/mesa-sha1.c | 242 > +-- > src/util/sha1/README | 55 > src/util/sha1/sha1.c | 173 + > src/util/sha1/sha1.h | 47 +++ > 12 files changed, 279 insertions(+), 466 deletions(-) > create mode 100644 src/util/sha1/README > create mode 100644 src/util/sha1/sha1.c > create mode 100644 src/util/sha1/sha1.h > > diff --git a/configure.ac b/configure.ac > index 459f3e8b0a..5772b378c7 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -9,7 +9,6 @@ dnl Copyright ?? 2009-2014 Jon TURNEY > dnl Copyright ?? 2011-2012 Benjamin Franzke > dnl Copyright ?? 2008-2014 David Airlie > dnl Copyright ?? 2009-2013 Brian Paul > -dnl Copyright ?? 2003-2007 Keith Packard, Daniel Stone > dnl > dnl Permission is hereby granted, free of charge, to any person obtaining a > dnl copy of this software and associated documentation files (the > "Software"), > @@ -1432,151 +1431,6 @@ if test "x$enable_gallium_osmesa" = xyes; then > fi > fi > > -# SHA1 hashing > -AC_ARG_WITH([sha1], > - > [AS_HELP_STRING([--with-sha1=libc|libmd|libnettle|libgcrypt|libcrypto|libsha1|CommonCrypto|CryptoAPI], > -[choose SHA1 implementation])]) > -case "x$with_sha1" in > -x | xlibc | xlibmd | xlibnettle | xlibgcrypt | xlibcrypto | xlibsha1 | > xCommonCrypto | xCryptoAPI) > - ;; > -*) > -AC_MSG_ERROR([Illegal value for --with-sha1: $with_sha1]) > -esac > - > -AC_CHECK_FUNC([SHA1Init], [HAVE_SHA1_IN_LIBC=yes]) > -if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_LIBC" = xyes; then > - with_sha1=libc > -fi > -if test "x$with_sha1" = xlibc && test "x$HAVE_SHA1_IN_LIBC" != xyes; then > - AC_MSG_ERROR([sha1 in libc requested but not found]) > -fi > -if test "x$with_sha1" = xlibc; then > - AC_DEFINE([HAVE_SHA1_IN_LIBC], [1], > - [Use libc SHA1 functions]) > - SHA1_LIBS="" > -fi > -AC_CHECK_FUNC([CC_SHA1_Init], [HAVE_SHA1_IN_COMMONCRYPTO=yes]) > -if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_COMMONCRYPTO" = xyes; then > - with_sha1=CommonCrypto > -fi > -if test "x$with_sha1" = xCommonCrypto && test "x$HAVE_SHA1_IN_COMMONCRYPTO"
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
14.01.2017 01:45, Timothy Arceri пишет: I'm asking for a chance to test before we jump in, its probably not a big deal and I may even still be able to reduce my use of hashing but it would be nice to be given a few days to test and even explore alternatives before jumping on this implementation. A very quick and very dirty simple benchmark. I took shader-cache from github, branch shader-cache39. Then I've applied my preprocessor patch on top (because shader-cache still uses preprocessor even if the shader is cached and it was painful to see preprocessor taking more than half of the whole time). Then I've compiled it with openssl and with the Emil's patch. Full run on shader-db (300Mb+ of shaders) with shader-cache warmed up. It takes 78s, spends in libcrypto 0.27%. With OpenBSD SHA1 it runs approximately the same time, spends 0.53% in SHA1Transform() and other SHA1* functions. Subtest - 46Mb of shaders from Total War: Attila - 3.10s (for some reason, the cache works much faster on smaller subsets than on full shader-db). 1.08% were spent in libcrypto, 1.04% in sha1_block_data_order_avx2(). With OpenBSD 3.07s - 2.27% in SHA1Transform() and other SHA1* functions. Overall not that significant in context of shader-cache, but as expected, on Haswell it's twice slower than OpenSSL's AVX2 implementation. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
Thanks for the comments gents. This is the type of discussion I was aiming at. On 13 January 2017 at 22:45, Timothy Arceriwrote: > On Fri, 2017-01-13 at 14:32 -0800, Jason Ekstrand wrote: >> On Fri, Jan 13, 2017 at 2:18 PM, Timothy Arceri > bora.com> wrote: >> > On Fri, 2017-01-13 at 13:59 -0800, Jason Ekstrand wrote: >> > > On Fri, Jan 13, 2017 at 11:22 AM, Vladislav Egorov > > ail. >> > > com> wrote: >> > > > 13.01.2017 19:51, Emil Velikov пишет: >> > > > > From: Emil Velikov >> > > > > >> > > > > At the moment we support 5+ different implementations each >> > with >> > > > > varying >> > > > > amount of bugs - from thread safely problems [1], to outright >> > > > > broken >> > > > > implementation(s) [2] >> > > > > >> > > > > In order to accommodate these we have 150+ lines of configure >> > > > > script and >> > > > > extra two configure toggles. Whist an actual implementation >> > being >> > > > > ~200loc and our current compat wrapping ~250. >> > > >> > > Yes, this is a problem. Especially given that at least one of >> > those >> > > implementations (openssl?) is something that a certain major game >> > > distributor likes to hard-link into things causing interesting >> > and >> > > hard-to-debug problems. I am all for getting rid of the "piles >> > of >> > > different dependencies" approach. >> > > >> > > Also, something I would like to see (maybe a follow-on patch?) >> > would >> > > a change to the mesa internal API to be able to put the SHA >> > context >> > > on the stack and not need to malloc it. It's not really a memory >> > or >> > > cycle-saving thing so much as it leaves one fewer cleanup paths >> > you >> > > have to worry about. >> > > >> > > > > Let's not forget that different people use different code >> > paths, >> > > > > thus >> > > > > effectively makes it harder to test and debug since the >> > default >> > > > > implementation is automatically detected. >> > > > > >> > > > > To minimise all these lovely experiences, import the "100% >> > Public >> > > > > Domain" OpenBSD sha1 implementation. Clearly document any >> > changes >> > > > > needed >> > > > > to get building correctly, since many/most of those can be >> > > > > upstreamed >> > > > > making future syncs easier. >> > > > > >> > > > >> > > > It can hurt performance. OpenSSL implementation is optimized >> > for >> > > > all thinkable architectures and it will use hardware SHA-1 >> > > > instructions on newer CPUs. From https://github.com/openssl/ope >> > nssl >> > > > /blob/master/crypto/sha/asm/sha1-x86_64.pl : >> > > > >> > > > > Current performance is summarized in following table. Numbers >> > are >> > > > > CPU clock cycles spent to process single byte (less is >> > better). >> > > > > >> > > > >x86_64SSSE3AVX[2] >> > > > > P49.05- >> > > > > Opteron6.26- >> > > > > Core26.556.05/+8%- >> > > > > Westmere6.735.30/+27%- >> > > > > Sandy Bridge7.706.10/+26%4.99/+54% >> > > > > Ivy Bridge6.064.67/+30%4.60/+32% >> > > > > Haswell5.454.15/+31%3.57/+53% >> > > > > Skylake5.184.06/+28%3.54/+46% >> > > > > Bulldozer9.115.95/+53% >> > > > > VIA Nano9.327.15/+30% >> > > > > Atom10.39.17/+12% >> > > > > Silvermont13.1(*)9.37/+40% >> > > > > Goldmont8.136.42/+27%1.70/+380%(**) >> > > > >> > > > Quick benchmark on my Haswell of the OpenBSD implementation >> > > > compiled with GCC5 -O2: ~8 cycles per byte on 32-bit, ~7 cycles >> > per >> > > > byte on 64-bit. But Haswell is a very powerful CPU, on weaker >> > CPUs >> > > > the difference would be probably larger, especially on new CPUs >> > > > that have SHA instruction set. >> > > >> > > Thanks for the numbers. It sounds like, on Haswell, the openSSL >> > > implementation is about 2x as fast which is very useful to know. >> > > However, this isn't on a super perf-critical path. We never use >> > SHA1 >> > > on any draw-time paths; we always use a simpler hash function in >> > > those cases and reserve SHA1 for when we really don't want >> > > collisions. >> > >> > Actually the OpenGL shader cache uses it a draw time to find cached >> > variants. I looked at pulling an implementation into Mesa a while >> > ago >> > but found the perf drop wasn't worth it. >> >> Why doesn't the usual in-memory cache stand as a front-line defense? > > It does :) > >> Could you please be more specific about the perf implications >> you've seen? > > I'm asking for a chance to test before we jump in, its probably not a > big deal and I may even still be able to reduce my use of hashing but > it would be nice to be given a few days to test and even explore > alternatives before jumping on this implementation. > >> Also, which implementation were you linking to that was so much >>
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Fri, 2017-01-13 at 14:32 -0800, Jason Ekstrand wrote: > On Fri, Jan 13, 2017 at 2:18 PM, Timothy Arceribora.com> wrote: > > On Fri, 2017-01-13 at 13:59 -0800, Jason Ekstrand wrote: > > > On Fri, Jan 13, 2017 at 11:22 AM, Vladislav Egorov > ail. > > > com> wrote: > > > > 13.01.2017 19:51, Emil Velikov пишет: > > > > > From: Emil Velikov > > > > > > > > > > At the moment we support 5+ different implementations each > > with > > > > > varying > > > > > amount of bugs - from thread safely problems [1], to outright > > > > > broken > > > > > implementation(s) [2] > > > > > > > > > > In order to accommodate these we have 150+ lines of configure > > > > > script and > > > > > extra two configure toggles. Whist an actual implementation > > being > > > > > ~200loc and our current compat wrapping ~250. > > > > > > Yes, this is a problem. Especially given that at least one of > > those > > > implementations (openssl?) is something that a certain major game > > > distributor likes to hard-link into things causing interesting > > and > > > hard-to-debug problems. I am all for getting rid of the "piles > > of > > > different dependencies" approach. > > > > > > Also, something I would like to see (maybe a follow-on patch?) > > would > > > a change to the mesa internal API to be able to put the SHA > > context > > > on the stack and not need to malloc it. It's not really a memory > > or > > > cycle-saving thing so much as it leaves one fewer cleanup paths > > you > > > have to worry about. > > > > > > > > Let's not forget that different people use different code > > paths, > > > > > thus > > > > > effectively makes it harder to test and debug since the > > default > > > > > implementation is automatically detected. > > > > > > > > > > To minimise all these lovely experiences, import the "100% > > Public > > > > > Domain" OpenBSD sha1 implementation. Clearly document any > > changes > > > > > needed > > > > > to get building correctly, since many/most of those can be > > > > > upstreamed > > > > > making future syncs easier. > > > > > > > > > > > > > It can hurt performance. OpenSSL implementation is optimized > > for > > > > all thinkable architectures and it will use hardware SHA-1 > > > > instructions on newer CPUs. From https://github.com/openssl/ope > > nssl > > > > /blob/master/crypto/sha/asm/sha1-x86_64.pl : > > > > > > > > > Current performance is summarized in following table. Numbers > > are > > > > > CPU clock cycles spent to process single byte (less is > > better). > > > > > > > > > > x86_64 SSSE3 AVX[2] > > > > > P4 9.05 - > > > > > Opteron 6.26 - > > > > > Core2 6.55 6.05/+8% - > > > > > Westmere 6.73 5.30/+27% - > > > > > Sandy Bridge 7.70 6.10/+26% 4.99/+54% > > > > > Ivy Bridge 6.06 4.67/+30% 4.60/+32% > > > > > Haswell 5.45 4.15/+31% 3.57/+53% > > > > > Skylake 5.18 4.06/+28% 3.54/+46% > > > > > Bulldozer 9.11 5.95/+53% > > > > > VIA Nano 9.32 7.15/+30% > > > > > Atom 10.3 9.17/+12% > > > > > Silvermont 13.1(*) 9.37/+40% > > > > > Goldmont 8.13 6.42/+27% 1.70/+380%(**) > > > > > > > > Quick benchmark on my Haswell of the OpenBSD implementation > > > > compiled with GCC5 -O2: ~8 cycles per byte on 32-bit, ~7 cycles > > per > > > > byte on 64-bit. But Haswell is a very powerful CPU, on weaker > > CPUs > > > > the difference would be probably larger, especially on new CPUs > > > > that have SHA instruction set. > > > > > > Thanks for the numbers. It sounds like, on Haswell, the openSSL > > > implementation is about 2x as fast which is very useful to know. > > > However, this isn't on a super perf-critical path. We never use > > SHA1 > > > on any draw-time paths; we always use a simpler hash function in > > > those cases and reserve SHA1 for when we really don't want > > > collisions. > > > > Actually the OpenGL shader cache uses it a draw time to find cached > > variants. I looked at pulling an implementation into Mesa a while > > ago > > but found the perf drop wasn't worth it. > > Why doesn't the usual in-memory cache stand as a front-line defense? It does :) > Could you please be more specific about the perf implications > you've seen? I'm asking for a chance to test before we jump in, its probably not a big deal and I may even still be able to reduce my use of hashing but it would be nice to be given a few days to test and even explore alternatives before jumping on this implementation. > Also, which implementation were you linking to that was so much > faster? I didn't test the OpenBSD implementation I tried another small implementation that claimed it was fast. Pretty much any of the available libraries were much faster as you would expect from something that has been tweaked over the years. > > > I really like the idea of
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Fri, Jan 13, 2017 at 2:18 PM, Timothy Arceri < timothy.arc...@collabora.com> wrote: > On Fri, 2017-01-13 at 13:59 -0800, Jason Ekstrand wrote: > > On Fri, Jan 13, 2017 at 11:22 AM, Vladislav Egorov> com> wrote: > > > 13.01.2017 19:51, Emil Velikov пишет: > > > > From: Emil Velikov > > > > > > > > At the moment we support 5+ different implementations each with > > > > varying > > > > amount of bugs - from thread safely problems [1], to outright > > > > broken > > > > implementation(s) [2] > > > > > > > > In order to accommodate these we have 150+ lines of configure > > > > script and > > > > extra two configure toggles. Whist an actual implementation being > > > > ~200loc and our current compat wrapping ~250. > > > > Yes, this is a problem. Especially given that at least one of those > > implementations (openssl?) is something that a certain major game > > distributor likes to hard-link into things causing interesting and > > hard-to-debug problems. I am all for getting rid of the "piles of > > different dependencies" approach. > > > > Also, something I would like to see (maybe a follow-on patch?) would > > a change to the mesa internal API to be able to put the SHA context > > on the stack and not need to malloc it. It's not really a memory or > > cycle-saving thing so much as it leaves one fewer cleanup paths you > > have to worry about. > > > > > > Let's not forget that different people use different code paths, > > > > thus > > > > effectively makes it harder to test and debug since the default > > > > implementation is automatically detected. > > > > > > > > To minimise all these lovely experiences, import the "100% Public > > > > Domain" OpenBSD sha1 implementation. Clearly document any changes > > > > needed > > > > to get building correctly, since many/most of those can be > > > > upstreamed > > > > making future syncs easier. > > > > > > > > > > It can hurt performance. OpenSSL implementation is optimized for > > > all thinkable architectures and it will use hardware SHA-1 > > > instructions on newer CPUs. From https://github.com/openssl/openssl > > > /blob/master/crypto/sha/asm/sha1-x86_64.pl : > > > > > > > Current performance is summarized in following table. Numbers are > > > > CPU clock cycles spent to process single byte (less is better). > > > > > > > >x86_64SSSE3AVX[2] > > > > P49.05- > > > > Opteron6.26- > > > > Core26.556.05/+8%- > > > > Westmere6.735.30/+27%- > > > > Sandy Bridge7.706.10/+26%4.99/+54% > > > > Ivy Bridge6.064.67/+30%4.60/+32% > > > > Haswell5.454.15/+31%3.57/+53% > > > > Skylake5.184.06/+28%3.54/+46% > > > > Bulldozer9.115.95/+53% > > > > VIA Nano9.327.15/+30% > > > > Atom10.39.17/+12% > > > > Silvermont13.1(*)9.37/+40% > > > > Goldmont8.136.42/+27%1.70/+380%(**) > > > > > > Quick benchmark on my Haswell of the OpenBSD implementation > > > compiled with GCC5 -O2: ~8 cycles per byte on 32-bit, ~7 cycles per > > > byte on 64-bit. But Haswell is a very powerful CPU, on weaker CPUs > > > the difference would be probably larger, especially on new CPUs > > > that have SHA instruction set. > > > > Thanks for the numbers. It sounds like, on Haswell, the openSSL > > implementation is about 2x as fast which is very useful to know. > > However, this isn't on a super perf-critical path. We never use SHA1 > > on any draw-time paths; we always use a simpler hash function in > > those cases and reserve SHA1 for when we really don't want > > collisions. > > Actually the OpenGL shader cache uses it a draw time to find cached > variants. I looked at pulling an implementation into Mesa a while ago > but found the perf drop wasn't worth it. > Why doesn't the usual in-memory cache stand as a front-line defense? Could you please be more specific about the perf implications you've seen? Also, which implementation were you linking to that was so much faster? > I really like the idea of having an internal implementation but I don't > think we should dismiss performance so quickly it would be nice if we > could hold this off until more testing can be done. > > > That said, it's a bit more critical than Emil makes it sound. A > > typical Vulkan application may easily create 10k pipelines and each > > of those will involve hashing at least about 100B of data (not > > include the SPIR-V source). I doubt, however, that this is enough to > > really cause a problem given how much other work goes into building a > > pipeline. > > > > Unfortunately, the OpenSSL implementation, while fast, is one of the > > ones that is causing problems. One of our favorite game distributors > > likes to hard-link against openssl in some of their games and/or > > libraries (not sure which). This means that, if mesa tries to > >
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
14.01.2017 00:17, Matt Turner пишет: On Fri, Jan 13, 2017 at 1:01 PM, Vladislav Egorovwrote: 2017-01-13 22:43 GMT+03:00 Emil Velikov : On 13 January 2017 at 19:22, Vladislav Egorov wrote: 13.01.2017 19:51, Emil Velikov пишет: From: Emil Velikov At the moment we support 5+ different implementations each with varying amount of bugs - from thread safely problems [1], to outright broken implementation(s) [2] In order to accommodate these we have 150+ lines of configure script and extra two configure toggles. Whist an actual implementation being ~200loc and our current compat wrapping ~250. Let's not forget that different people use different code paths, thus effectively makes it harder to test and debug since the default implementation is automatically detected. To minimise all these lovely experiences, import the "100% Public Domain" OpenBSD sha1 implementation. Clearly document any changes needed to get building correctly, since many/most of those can be upstreamed making future syncs easier. It can hurt performance. This is not performance critical path ;-) If that ever changes we can rethink our options. Emil If it's used by shader-cache, it's certainly along the critical path. And 7-8 cycles per byte (or more than 10 cycles per byte on Atoms, Celerons and low-end AMDs) per byte of shader text is something to be considered. In comparison the entire preprocessing stage takes ~15 cycles per byte -- well, after my optimizations :) I regularly see util_hash_crc32() in perf top - because it uses inefficient table-based implementation with the same ~8 cycles per byte. Perhaps we should consider using CRC32C (for which an instruction exists in SSE 4.2 with a latency of 3 cycles) as the hashing function? http://stackoverflow.com/questions/2694740/can-one-construct-a-good-hash-function-using-crc32c-as-a-base Disregard my previous comment in part about util_hash_crc32(), it seems that my memory served me wrong, it's not anywhere near to the hottest functions in perf top. But generally speaking CRC32 from SSE4.2 can't be used as a drop-in replacement for util_hash_crc32(), because it uses a different polynomial (not to mention that it will require CPU-detection and so on). I don't know if it matters or not in that case. And if checksum/hash-function can be changed, maybe CRC32 should not be used at all (if CRC32 is used just as a non-cryptographic hash-function). There are much faster non-secure hash-functions around [1]. [1] https://github.com/rurban/smhasher#smhasher ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Fri, 2017-01-13 at 13:59 -0800, Jason Ekstrand wrote: > On Fri, Jan 13, 2017 at 11:22 AM, Vladislav Egorovcom> wrote: > > 13.01.2017 19:51, Emil Velikov пишет: > > > From: Emil Velikov > > > > > > At the moment we support 5+ different implementations each with > > > varying > > > amount of bugs - from thread safely problems [1], to outright > > > broken > > > implementation(s) [2] > > > > > > In order to accommodate these we have 150+ lines of configure > > > script and > > > extra two configure toggles. Whist an actual implementation being > > > ~200loc and our current compat wrapping ~250. > > Yes, this is a problem. Especially given that at least one of those > implementations (openssl?) is something that a certain major game > distributor likes to hard-link into things causing interesting and > hard-to-debug problems. I am all for getting rid of the "piles of > different dependencies" approach. > > Also, something I would like to see (maybe a follow-on patch?) would > a change to the mesa internal API to be able to put the SHA context > on the stack and not need to malloc it. It's not really a memory or > cycle-saving thing so much as it leaves one fewer cleanup paths you > have to worry about. > > > > Let's not forget that different people use different code paths, > > > thus > > > effectively makes it harder to test and debug since the default > > > implementation is automatically detected. > > > > > > To minimise all these lovely experiences, import the "100% Public > > > Domain" OpenBSD sha1 implementation. Clearly document any changes > > > needed > > > to get building correctly, since many/most of those can be > > > upstreamed > > > making future syncs easier. > > > > > > > It can hurt performance. OpenSSL implementation is optimized for > > all thinkable architectures and it will use hardware SHA-1 > > instructions on newer CPUs. From https://github.com/openssl/openssl > > /blob/master/crypto/sha/asm/sha1-x86_64.pl : > > > > > Current performance is summarized in following table. Numbers are > > > CPU clock cycles spent to process single byte (less is better). > > > > > > x86_64 SSSE3 AVX[2] > > > P4 9.05 - > > > Opteron 6.26 - > > > Core2 6.55 6.05/+8% - > > > Westmere 6.73 5.30/+27% - > > > Sandy Bridge 7.70 6.10/+26% 4.99/+54% > > > Ivy Bridge 6.06 4.67/+30% 4.60/+32% > > > Haswell 5.45 4.15/+31% 3.57/+53% > > > Skylake 5.18 4.06/+28% 3.54/+46% > > > Bulldozer 9.11 5.95/+53% > > > VIA Nano 9.32 7.15/+30% > > > Atom 10.3 9.17/+12% > > > Silvermont 13.1(*) 9.37/+40% > > > Goldmont 8.13 6.42/+27% 1.70/+380%(**) > > > > Quick benchmark on my Haswell of the OpenBSD implementation > > compiled with GCC5 -O2: ~8 cycles per byte on 32-bit, ~7 cycles per > > byte on 64-bit. But Haswell is a very powerful CPU, on weaker CPUs > > the difference would be probably larger, especially on new CPUs > > that have SHA instruction set. > > Thanks for the numbers. It sounds like, on Haswell, the openSSL > implementation is about 2x as fast which is very useful to know. > However, this isn't on a super perf-critical path. We never use SHA1 > on any draw-time paths; we always use a simpler hash function in > those cases and reserve SHA1 for when we really don't want > collisions. Actually the OpenGL shader cache uses it a draw time to find cached variants. I looked at pulling an implementation into Mesa a while ago but found the perf drop wasn't worth it. I really like the idea of having an internal implementation but I don't think we should dismiss performance so quickly it would be nice if we could hold this off until more testing can be done. > That said, it's a bit more critical than Emil makes it sound. A > typical Vulkan application may easily create 10k pipelines and each > of those will involve hashing at least about 100B of data (not > include the SPIR-V source). I doubt, however, that this is enough to > really cause a problem given how much other work goes into building a > pipeline. > > Unfortunately, the OpenSSL implementation, while fast, is one of the > ones that is causing problems. One of our favorite game distributors > likes to hard-link against openssl in some of their games and/or > libraries (not sure which). This means that, if mesa tries to > dynamically open libssl, you get mysterious crashes due to slight > differences between the system-installed version and the one that has > been linked into the game. This makes trying to use the OpenSSL > implementation a non-starter without being able to wholesale import > the implementation. > > Emil, I'm fine with this change. I haven't reviewed the details, but > my gut tells me we can eat the perf difference for now. Consider > that an Acked-by if you'd like but it would be good
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Fri, Jan 13, 2017 at 11:22 AM, Vladislav Egorovwrote: > 13.01.2017 19:51, Emil Velikov пишет: > >> From: Emil Velikov >> >> At the moment we support 5+ different implementations each with varying >> amount of bugs - from thread safely problems [1], to outright broken >> implementation(s) [2] >> >> In order to accommodate these we have 150+ lines of configure script and >> extra two configure toggles. Whist an actual implementation being >> ~200loc and our current compat wrapping ~250. >> > Yes, this is a problem. Especially given that at least one of those implementations (openssl?) is something that a certain major game distributor likes to hard-link into things causing interesting and hard-to-debug problems. I am all for getting rid of the "piles of different dependencies" approach. Also, something I would like to see (maybe a follow-on patch?) would a change to the mesa internal API to be able to put the SHA context on the stack and not need to malloc it. It's not really a memory or cycle-saving thing so much as it leaves one fewer cleanup paths you have to worry about. > Let's not forget that different people use different code paths, thus >> effectively makes it harder to test and debug since the default >> implementation is automatically detected. >> >> To minimise all these lovely experiences, import the "100% Public >> Domain" OpenBSD sha1 implementation. Clearly document any changes needed >> to get building correctly, since many/most of those can be upstreamed >> making future syncs easier. >> > > It can hurt performance. OpenSSL implementation is optimized for all > thinkable architectures and it will use hardware SHA-1 instructions on > newer CPUs. From https://github.com/openssl/ope > nssl/blob/master/crypto/sha/asm/sha1-x86_64.pl : > > > Current performance is summarized in following table. Numbers are > > CPU clock cycles spent to process single byte (less is better). > > > >x86_64SSSE3AVX[2] > > P49.05- > > Opteron6.26- > > Core26.556.05/+8%- > > Westmere6.735.30/+27%- > > Sandy Bridge7.706.10/+26%4.99/+54% > > Ivy Bridge6.064.67/+30%4.60/+32% > > Haswell5.454.15/+31%3.57/+53% > > Skylake5.184.06/+28%3.54/+46% > > Bulldozer9.115.95/+53% > > VIA Nano9.327.15/+30% > > Atom10.39.17/+12% > > Silvermont13.1(*)9.37/+40% > > Goldmont8.136.42/+27%1.70/+380%(**) > > Quick benchmark on my Haswell of the OpenBSD implementation compiled with > GCC5 -O2: ~8 cycles per byte on 32-bit, ~7 cycles per byte on 64-bit. But > Haswell is a very powerful CPU, on weaker CPUs the difference would be > probably larger, especially on new CPUs that have SHA instruction set. > Thanks for the numbers. It sounds like, on Haswell, the openSSL implementation is about 2x as fast which is very useful to know. However, this isn't on a super perf-critical path. We never use SHA1 on any draw-time paths; we always use a simpler hash function in those cases and reserve SHA1 for when we really don't want collisions. That said, it's a bit more critical than Emil makes it sound. A typical Vulkan application may easily create 10k pipelines and each of those will involve hashing at least about 100B of data (not include the SPIR-V source). I doubt, however, that this is enough to really cause a problem given how much other work goes into building a pipeline. Unfortunately, the OpenSSL implementation, while fast, is one of the ones that is causing problems. One of our favorite game distributors likes to hard-link against openssl in some of their games and/or libraries (not sure which). This means that, if mesa tries to dynamically open libssl, you get mysterious crashes due to slight differences between the system-installed version and the one that has been linked into the game. This makes trying to use the OpenSSL implementation a non-starter without being able to wholesale import the implementation. Emil, I'm fine with this change. I haven't reviewed the details, but my gut tells me we can eat the perf difference for now. Consider that an Acked-by if you'd like but it would be good to have someone review at least the build system stuff. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Fri, Jan 13, 2017 at 1:01 PM, Vladislav Egorovwrote: > 2017-01-13 22:43 GMT+03:00 Emil Velikov : >> >> On 13 January 2017 at 19:22, Vladislav Egorov wrote: >> > 13.01.2017 19:51, Emil Velikov пишет: >> >> >> >> From: Emil Velikov >> >> >> >> At the moment we support 5+ different implementations each with varying >> >> amount of bugs - from thread safely problems [1], to outright broken >> >> implementation(s) [2] >> >> >> >> In order to accommodate these we have 150+ lines of configure script and >> >> extra two configure toggles. Whist an actual implementation being >> >> ~200loc and our current compat wrapping ~250. >> >> >> >> Let's not forget that different people use different code paths, thus >> >> effectively makes it harder to test and debug since the default >> >> implementation is automatically detected. >> >> >> >> To minimise all these lovely experiences, import the "100% Public >> >> Domain" OpenBSD sha1 implementation. Clearly document any changes needed >> >> to get building correctly, since many/most of those can be upstreamed >> >> making future syncs easier. >> > >> > >> > It can hurt performance. >> This is not performance critical path ;-) If that ever changes we can >> rethink our options. >> >> Emil > > > If it's used by shader-cache, it's certainly along the critical path. > And 7-8 cycles per byte (or more than 10 cycles per byte on Atoms, > Celerons and low-end AMDs) per byte of shader text is something to be > considered. In comparison the entire preprocessing stage takes ~15 > cycles per byte -- well, after my optimizations :) I regularly see > util_hash_crc32() in perf top - because it uses inefficient > table-based implementation with the same ~8 cycles per byte. Perhaps we should consider using CRC32C (for which an instruction exists in SSE 4.2 with a latency of 3 cycles) as the hashing function? http://stackoverflow.com/questions/2694740/can-one-construct-a-good-hash-function-using-crc32c-as-a-base ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
I am generally in favor of this for all the reasons you've described. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
2017-01-13 22:43 GMT+03:00 Emil Velikov: > > On 13 January 2017 at 19:22, Vladislav Egorov wrote: > > 13.01.2017 19:51, Emil Velikov пишет: > >> > >> From: Emil Velikov > >> > >> At the moment we support 5+ different implementations each with varying > >> amount of bugs - from thread safely problems [1], to outright broken > >> implementation(s) [2] > >> > >> In order to accommodate these we have 150+ lines of configure script and > >> extra two configure toggles. Whist an actual implementation being > >> ~200loc and our current compat wrapping ~250. > >> > >> Let's not forget that different people use different code paths, thus > >> effectively makes it harder to test and debug since the default > >> implementation is automatically detected. > >> > >> To minimise all these lovely experiences, import the "100% Public > >> Domain" OpenBSD sha1 implementation. Clearly document any changes needed > >> to get building correctly, since many/most of those can be upstreamed > >> making future syncs easier. > > > > > > It can hurt performance. > This is not performance critical path ;-) If that ever changes we can > rethink our options. > > Emil If it's used by shader-cache, it's certainly along the critical path. And 7-8 cycles per byte (or more than 10 cycles per byte on Atoms, Celerons and low-end AMDs) per byte of shader text is something to be considered. In comparison the entire preprocessing stage takes ~15 cycles per byte -- well, after my optimizations :) I regularly see util_hash_crc32() in perf top - because it uses inefficient table-based implementation with the same ~8 cycles per byte. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On Fri, Jan 13, 2017 at 11:51 AM, Dylan Bakerwrote: > Quoting Emil Velikov (2017-01-13 08:51:31) >> From: Emil Velikov >> >> At the moment we support 5+ different implementations each with varying >> amount of bugs - from thread safely problems [1], to outright broken >> implementation(s) [2] >> >> In order to accommodate these we have 150+ lines of configure script and >> extra two configure toggles. Whist an actual implementation being >> ~200loc and our current compat wrapping ~250. >> >> Let's not forget that different people use different code paths, thus >> effectively makes it harder to test and debug since the default >> implementation is automatically detected. >> >> To minimise all these lovely experiences, import the "100% Public >> Domain" OpenBSD sha1 implementation. Clearly document any changes needed >> to get building correctly, since many/most of those can be upstreamed >> making future syncs easier. >> >> As an added bonus this will avoid all the 'fun' experiences trying to >> integrate it with the Android and SCons builds. >> >> Bugzilla [1]: https://bugs.freedesktop.org/show_bug.cgi?id=94904 >> Bugzilla [2]: https://bugs.freedesktop.org/show_bug.cgi?id=97967 >> Cc: Mark Janes >> Cc: Vinson Lee >> Cc: Tapani Pälli >> Cc: Jonathan Gray >> Signed-off-by: Emil Velikov >> --- >> configure.ac | 161 +-- >> src/compiler/glsl/tests/cache_test.c | 5 - >> src/mesa/main/shaderapi.c| 6 - >> src/util/Makefile.am | 3 - >> src/util/Makefile.sources| 2 + >> src/util/SConscript | 5 - >> src/util/disk_cache.c| 4 - >> src/util/disk_cache.h| 42 -- >> src/util/mesa-sha1.c | 242 >> +-- >> src/util/sha1/README | 55 >> src/util/sha1/sha1.c | 173 + >> src/util/sha1/sha1.h | 47 +++ >> 12 files changed, 279 insertions(+), 466 deletions(-) >> create mode 100644 src/util/sha1/README >> create mode 100644 src/util/sha1/sha1.c >> create mode 100644 src/util/sha1/sha1.h >> >> diff --git a/configure.ac b/configure.ac >> index 459f3e8b0a..5772b378c7 100644 >> --- a/configure.ac >> +++ b/configure.ac >> @@ -9,7 +9,6 @@ dnl Copyright © 2009-2014 Jon TURNEY >> dnl Copyright © 2011-2012 Benjamin Franzke >> dnl Copyright © 2008-2014 David Airlie >> dnl Copyright © 2009-2013 Brian Paul >> -dnl Copyright © 2003-2007 Keith Packard, Daniel Stone > > This change seems like a mistake? Actually no, since that line was added in commit a24bdce46 which was the import of all of the SHA1 configuration machinery from the xserver. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
Quoting Emil Velikov (2017-01-13 08:51:31) > From: Emil Velikov> > At the moment we support 5+ different implementations each with varying > amount of bugs - from thread safely problems [1], to outright broken > implementation(s) [2] > > In order to accommodate these we have 150+ lines of configure script and > extra two configure toggles. Whist an actual implementation being > ~200loc and our current compat wrapping ~250. > > Let's not forget that different people use different code paths, thus > effectively makes it harder to test and debug since the default > implementation is automatically detected. > > To minimise all these lovely experiences, import the "100% Public > Domain" OpenBSD sha1 implementation. Clearly document any changes needed > to get building correctly, since many/most of those can be upstreamed > making future syncs easier. > > As an added bonus this will avoid all the 'fun' experiences trying to > integrate it with the Android and SCons builds. > > Bugzilla [1]: https://bugs.freedesktop.org/show_bug.cgi?id=94904 > Bugzilla [2]: https://bugs.freedesktop.org/show_bug.cgi?id=97967 > Cc: Mark Janes > Cc: Vinson Lee > Cc: Tapani Pälli > Cc: Jonathan Gray > Signed-off-by: Emil Velikov > --- > configure.ac | 161 +-- > src/compiler/glsl/tests/cache_test.c | 5 - > src/mesa/main/shaderapi.c| 6 - > src/util/Makefile.am | 3 - > src/util/Makefile.sources| 2 + > src/util/SConscript | 5 - > src/util/disk_cache.c| 4 - > src/util/disk_cache.h| 42 -- > src/util/mesa-sha1.c | 242 > +-- > src/util/sha1/README | 55 > src/util/sha1/sha1.c | 173 + > src/util/sha1/sha1.h | 47 +++ > 12 files changed, 279 insertions(+), 466 deletions(-) > create mode 100644 src/util/sha1/README > create mode 100644 src/util/sha1/sha1.c > create mode 100644 src/util/sha1/sha1.h > > diff --git a/configure.ac b/configure.ac > index 459f3e8b0a..5772b378c7 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -9,7 +9,6 @@ dnl Copyright © 2009-2014 Jon TURNEY > dnl Copyright © 2011-2012 Benjamin Franzke > dnl Copyright © 2008-2014 David Airlie > dnl Copyright © 2009-2013 Brian Paul > -dnl Copyright © 2003-2007 Keith Packard, Daniel Stone This change seems like a mistake? > dnl > dnl Permission is hereby granted, free of charge, to any person obtaining a > dnl copy of this software and associated documentation files (the > "Software"), > @@ -1432,151 +1431,6 @@ if test "x$enable_gallium_osmesa" = xyes; then > fi > fi > > -# SHA1 hashing > -AC_ARG_WITH([sha1], > - > [AS_HELP_STRING([--with-sha1=libc|libmd|libnettle|libgcrypt|libcrypto|libsha1|CommonCrypto|CryptoAPI], > -[choose SHA1 implementation])]) > -case "x$with_sha1" in > -x | xlibc | xlibmd | xlibnettle | xlibgcrypt | xlibcrypto | xlibsha1 | > xCommonCrypto | xCryptoAPI) > - ;; > -*) > -AC_MSG_ERROR([Illegal value for --with-sha1: $with_sha1]) > -esac > - > -AC_CHECK_FUNC([SHA1Init], [HAVE_SHA1_IN_LIBC=yes]) > -if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_LIBC" = xyes; then > - with_sha1=libc > -fi > -if test "x$with_sha1" = xlibc && test "x$HAVE_SHA1_IN_LIBC" != xyes; then > - AC_MSG_ERROR([sha1 in libc requested but not found]) > -fi > -if test "x$with_sha1" = xlibc; then > - AC_DEFINE([HAVE_SHA1_IN_LIBC], [1], > - [Use libc SHA1 functions]) > - SHA1_LIBS="" > -fi > -AC_CHECK_FUNC([CC_SHA1_Init], [HAVE_SHA1_IN_COMMONCRYPTO=yes]) > -if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_COMMONCRYPTO" = xyes; then > - with_sha1=CommonCrypto > -fi > -if test "x$with_sha1" = xCommonCrypto && test "x$HAVE_SHA1_IN_COMMONCRYPTO" > != xyes; then > - AC_MSG_ERROR([CommonCrypto requested but not found]) > -fi > -if test "x$with_sha1" = xCommonCrypto; then > - AC_DEFINE([HAVE_SHA1_IN_COMMONCRYPTO], [1], > - [Use CommonCrypto SHA1 functions]) > - SHA1_LIBS="" > -fi > -dnl stdcall functions cannot be tested with AC_CHECK_LIB > -AC_CHECK_HEADER([wincrypt.h], [HAVE_SHA1_IN_CRYPTOAPI=yes], [], [#include > ]) > -if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_CRYPTOAPI" = xyes; then > - with_sha1=CryptoAPI > -fi > -if test "x$with_sha1" = xCryptoAPI && test "x$HAVE_SHA1_IN_CRYPTOAPI" != > xyes; then > - AC_MSG_ERROR([CryptoAPI requested but not found]) > -fi > -if test "x$with_sha1" = xCryptoAPI; then > - AC_DEFINE([HAVE_SHA1_IN_CRYPTOAPI], [1], > - [Use CryptoAPI SHA1 functions]) > - SHA1_LIBS="" > -fi > -AC_CHECK_LIB([md], [SHA1Init], [HAVE_LIBMD=yes]) > -if test "x$with_sha1" = x
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
On 13 January 2017 at 19:22, Vladislav Egorovwrote: > 13.01.2017 19:51, Emil Velikov пишет: >> >> From: Emil Velikov >> >> At the moment we support 5+ different implementations each with varying >> amount of bugs - from thread safely problems [1], to outright broken >> implementation(s) [2] >> >> In order to accommodate these we have 150+ lines of configure script and >> extra two configure toggles. Whist an actual implementation being >> ~200loc and our current compat wrapping ~250. >> >> Let's not forget that different people use different code paths, thus >> effectively makes it harder to test and debug since the default >> implementation is automatically detected. >> >> To minimise all these lovely experiences, import the "100% Public >> Domain" OpenBSD sha1 implementation. Clearly document any changes needed >> to get building correctly, since many/most of those can be upstreamed >> making future syncs easier. > > > It can hurt performance. This is not performance critical path ;-) If that ever changes we can rethink our options. Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
13.01.2017 19:51, Emil Velikov пишет: From: Emil VelikovAt the moment we support 5+ different implementations each with varying amount of bugs - from thread safely problems [1], to outright broken implementation(s) [2] In order to accommodate these we have 150+ lines of configure script and extra two configure toggles. Whist an actual implementation being ~200loc and our current compat wrapping ~250. Let's not forget that different people use different code paths, thus effectively makes it harder to test and debug since the default implementation is automatically detected. To minimise all these lovely experiences, import the "100% Public Domain" OpenBSD sha1 implementation. Clearly document any changes needed to get building correctly, since many/most of those can be upstreamed making future syncs easier. It can hurt performance. OpenSSL implementation is optimized for all thinkable architectures and it will use hardware SHA-1 instructions on newer CPUs. From https://github.com/openssl/openssl/blob/master/crypto/sha/asm/sha1-x86_64.pl : > Current performance is summarized in following table. Numbers are > CPU clock cycles spent to process single byte (less is better). > >x86_64SSSE3AVX[2] > P49.05- > Opteron6.26- > Core26.556.05/+8%- > Westmere6.735.30/+27%- > Sandy Bridge7.706.10/+26%4.99/+54% > Ivy Bridge6.064.67/+30%4.60/+32% > Haswell5.454.15/+31%3.57/+53% > Skylake5.184.06/+28%3.54/+46% > Bulldozer9.115.95/+53% > VIA Nano9.327.15/+30% > Atom10.39.17/+12% > Silvermont13.1(*)9.37/+40% > Goldmont8.136.42/+27%1.70/+380%(**) Quick benchmark on my Haswell of the OpenBSD implementation compiled with GCC5 -O2: ~8 cycles per byte on 32-bit, ~7 cycles per byte on 64-bit. But Haswell is a very powerful CPU, on weaker CPUs the difference would be probably larger, especially on new CPUs that have SHA instruction set. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD
From: Emil VelikovAt the moment we support 5+ different implementations each with varying amount of bugs - from thread safely problems [1], to outright broken implementation(s) [2] In order to accommodate these we have 150+ lines of configure script and extra two configure toggles. Whist an actual implementation being ~200loc and our current compat wrapping ~250. Let's not forget that different people use different code paths, thus effectively makes it harder to test and debug since the default implementation is automatically detected. To minimise all these lovely experiences, import the "100% Public Domain" OpenBSD sha1 implementation. Clearly document any changes needed to get building correctly, since many/most of those can be upstreamed making future syncs easier. As an added bonus this will avoid all the 'fun' experiences trying to integrate it with the Android and SCons builds. Bugzilla [1]: https://bugs.freedesktop.org/show_bug.cgi?id=94904 Bugzilla [2]: https://bugs.freedesktop.org/show_bug.cgi?id=97967 Cc: Mark Janes Cc: Vinson Lee Cc: Tapani Pälli Cc: Jonathan Gray Signed-off-by: Emil Velikov --- configure.ac | 161 +-- src/compiler/glsl/tests/cache_test.c | 5 - src/mesa/main/shaderapi.c| 6 - src/util/Makefile.am | 3 - src/util/Makefile.sources| 2 + src/util/SConscript | 5 - src/util/disk_cache.c| 4 - src/util/disk_cache.h| 42 -- src/util/mesa-sha1.c | 242 +-- src/util/sha1/README | 55 src/util/sha1/sha1.c | 173 + src/util/sha1/sha1.h | 47 +++ 12 files changed, 279 insertions(+), 466 deletions(-) create mode 100644 src/util/sha1/README create mode 100644 src/util/sha1/sha1.c create mode 100644 src/util/sha1/sha1.h diff --git a/configure.ac b/configure.ac index 459f3e8b0a..5772b378c7 100644 --- a/configure.ac +++ b/configure.ac @@ -9,7 +9,6 @@ dnl Copyright © 2009-2014 Jon TURNEY dnl Copyright © 2011-2012 Benjamin Franzke dnl Copyright © 2008-2014 David Airlie dnl Copyright © 2009-2013 Brian Paul -dnl Copyright © 2003-2007 Keith Packard, Daniel Stone dnl dnl Permission is hereby granted, free of charge, to any person obtaining a dnl copy of this software and associated documentation files (the "Software"), @@ -1432,151 +1431,6 @@ if test "x$enable_gallium_osmesa" = xyes; then fi fi -# SHA1 hashing -AC_ARG_WITH([sha1], - [AS_HELP_STRING([--with-sha1=libc|libmd|libnettle|libgcrypt|libcrypto|libsha1|CommonCrypto|CryptoAPI], -[choose SHA1 implementation])]) -case "x$with_sha1" in -x | xlibc | xlibmd | xlibnettle | xlibgcrypt | xlibcrypto | xlibsha1 | xCommonCrypto | xCryptoAPI) - ;; -*) -AC_MSG_ERROR([Illegal value for --with-sha1: $with_sha1]) -esac - -AC_CHECK_FUNC([SHA1Init], [HAVE_SHA1_IN_LIBC=yes]) -if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_LIBC" = xyes; then - with_sha1=libc -fi -if test "x$with_sha1" = xlibc && test "x$HAVE_SHA1_IN_LIBC" != xyes; then - AC_MSG_ERROR([sha1 in libc requested but not found]) -fi -if test "x$with_sha1" = xlibc; then - AC_DEFINE([HAVE_SHA1_IN_LIBC], [1], - [Use libc SHA1 functions]) - SHA1_LIBS="" -fi -AC_CHECK_FUNC([CC_SHA1_Init], [HAVE_SHA1_IN_COMMONCRYPTO=yes]) -if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_COMMONCRYPTO" = xyes; then - with_sha1=CommonCrypto -fi -if test "x$with_sha1" = xCommonCrypto && test "x$HAVE_SHA1_IN_COMMONCRYPTO" != xyes; then - AC_MSG_ERROR([CommonCrypto requested but not found]) -fi -if test "x$with_sha1" = xCommonCrypto; then - AC_DEFINE([HAVE_SHA1_IN_COMMONCRYPTO], [1], - [Use CommonCrypto SHA1 functions]) - SHA1_LIBS="" -fi -dnl stdcall functions cannot be tested with AC_CHECK_LIB -AC_CHECK_HEADER([wincrypt.h], [HAVE_SHA1_IN_CRYPTOAPI=yes], [], [#include ]) -if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_CRYPTOAPI" = xyes; then - with_sha1=CryptoAPI -fi -if test "x$with_sha1" = xCryptoAPI && test "x$HAVE_SHA1_IN_CRYPTOAPI" != xyes; then - AC_MSG_ERROR([CryptoAPI requested but not found]) -fi -if test "x$with_sha1" = xCryptoAPI; then - AC_DEFINE([HAVE_SHA1_IN_CRYPTOAPI], [1], - [Use CryptoAPI SHA1 functions]) - SHA1_LIBS="" -fi -AC_CHECK_LIB([md], [SHA1Init], [HAVE_LIBMD=yes]) -if test "x$with_sha1" = x && test "x$HAVE_LIBMD" = xyes; then - with_sha1=libmd -fi -if test "x$with_sha1" = xlibmd && test "x$HAVE_LIBMD" != xyes; then - AC_MSG_ERROR([libmd requested but not found]) -fi -if test "x$with_sha1" = xlibmd; then - AC_DEFINE([HAVE_SHA1_IN_LIBMD], [1], - [Use libmd SHA1 functions])