On Fri, Jan 13, 2017 at 2:18 PM, Timothy Arceri <
timothy.arc...@collabora.com> wrote:

> On Fri, 2017-01-13 at 13:59 -0800, Jason Ekstrand wrote:
> > On Fri, Jan 13, 2017 at 11:22 AM, Vladislav Egorov <vegorov180@gmail.
> > com> wrote:
> > > 13.01.2017 19:51, Emil Velikov пишет:
> > > > From: Emil Velikov <emil.veli...@collabora.com>
> > > >
> > > > At the moment we support 5+ different implementations each with
> > > > varying
> > > > amount of bugs - from thread safely problems [1], to outright
> > > > broken
> > > > implementation(s) [2]
> > > >
> > > > In order to accommodate these we have 150+ lines of configure
> > > > script and
> > > > extra two configure toggles. Whist an actual implementation being
> > > > ~200loc and our current compat wrapping ~250.
> >
> > Yes, this is a problem.  Especially given that at least one of those
> > implementations (openssl?) is something that a certain major game
> > distributor likes to hard-link into things causing interesting and
> > hard-to-debug problems.  I am all for getting rid of the "piles of
> > different dependencies" approach.
> >
> > Also, something I would like to see (maybe a follow-on patch?) would
> > a change to the mesa internal API to be able to put the SHA context
> > on the stack and not need to malloc it.  It's not really a memory or
> > cycle-saving thing so much as it leaves one fewer cleanup paths you
> > have to worry about.
> >
> > > > Let's not forget that different people use different code paths,
> > > > thus
> > > > effectively makes it harder to test and debug since the default
> > > > implementation is automatically detected.
> > > >
> > > > To minimise all these lovely experiences, import the "100% Public
> > > > Domain" OpenBSD sha1 implementation. Clearly document any changes
> > > > needed
> > > > to get building correctly, since many/most of those can be
> > > > upstreamed
> > > > making future syncs easier.
> > > >
> > >
> > > It can hurt performance. OpenSSL implementation is optimized for
> > > all thinkable architectures and it will use hardware SHA-1
> > > instructions on newer CPUs. From https://github.com/openssl/openssl
> > > /blob/master/crypto/sha/asm/sha1-x86_64.pl :
> > >
> > > > Current performance is summarized in following table. Numbers are
> > > > CPU clock cycles spent to process single byte (less is better).
> > > >
> > > >        x86_64        SSSE3        AVX[2]
> > > > P4        9.05        -
> > > > Opteron    6.26        -
> > > > Core2        6.55        6.05/+8%    -
> > > > Westmere    6.73        5.30/+27%    -
> > > > Sandy Bridge    7.70        6.10/+26%    4.99/+54%
> > > > Ivy Bridge    6.06        4.67/+30%    4.60/+32%
> > > > Haswell    5.45        4.15/+31%    3.57/+53%
> > > > Skylake    5.18        4.06/+28%    3.54/+46%
> > > > Bulldozer    9.11        5.95/+53%
> > > > VIA Nano    9.32        7.15/+30%
> > > > Atom        10.3        9.17/+12%
> > > > Silvermont    13.1(*)        9.37/+40%
> > > > Goldmont    8.13        6.42/+27%    1.70/+380%(**)
> > >
> > > Quick benchmark on my Haswell of the OpenBSD implementation
> > > compiled with GCC5 -O2: ~8 cycles per byte on 32-bit, ~7 cycles per
> > > byte on 64-bit. But Haswell is a very powerful CPU, on weaker CPUs
> > > the difference would be probably larger, especially on new CPUs
> > > that have SHA instruction set.
> >
> > Thanks for the numbers.  It sounds like, on Haswell, the openSSL
> > implementation is about 2x as fast which is very useful to know.
> > However, this isn't on a super perf-critical path.  We never use SHA1
> > on any draw-time paths; we always use a simpler hash function in
> > those cases and reserve SHA1 for when we really don't want
> > collisions.
>
> Actually the OpenGL shader cache uses it a draw time to find cached
> variants. I looked at pulling an implementation into Mesa a while ago
> but found the perf drop wasn't worth it.
>

Why doesn't the usual in-memory cache stand as a front-line defense?  Could
you please be more specific about the perf implications you've seen?  Also,
which implementation were you linking to that was so much faster?


> I really like the idea of having an internal implementation but I don't
> think we should dismiss performance so quickly it would be nice if we
> could hold this off until more testing can be done.
>
> >   That said, it's a bit more critical than Emil makes it sound.  A
> > typical Vulkan application may easily create 10k pipelines and each
> > of those will involve hashing at least about 100B of data (not
> > include the SPIR-V source).  I doubt, however, that this is enough to
> > really cause a problem given how much other work goes into building a
> > pipeline.
> >
> > Unfortunately, the OpenSSL implementation, while fast, is one of the
> > ones that is causing problems.  One of our favorite game distributors
> > likes to hard-link against openssl in some of their games and/or
> > libraries (not sure which).  This means that, if mesa tries to
> > dynamically open libssl, you get mysterious crashes due to slight
> > differences between the system-installed version and the one that has
> > been linked into the game.  This makes trying to use the OpenSSL
> > implementation a non-starter without being able to wholesale import
> > the implementation.
> >
> > Emil, I'm fine with this change.  I haven't reviewed the details, but
> > my gut tells me we can eat the perf difference for now.  Consider
> > that an Acked-by if you'd like but it would be good to have someone
> > review at least the build system stuff.
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to