On Wed, Jan 3, 2018 at 3:09 AM, Ian Romanick <[email protected]> wrote:
> On 01/02/2018 04:52 PM, Grazvydas Ignotas wrote:
>> On Tue, Jan 2, 2018 at 11:38 PM, Ian Romanick <[email protected]> wrote:
>>> On 12/28/2017 05:56 PM, Grazvydas Ignotas wrote:
>>>> zlib provides a faster slice-by-4 CRC32 implementation than the
>>>> traditional single byte lookup one used by mesa. As most supported
>>>> platforms now link zlib unconditionally, we can easily use it.
>>>> For small buffers the old implementation is still used as it's faster
>>>> with cold cache (first call), as indicated by some throughput
>>>> benchmarking (avg MB/s, n=100, zlib 1.2.8):
>>>>
>>>>             i5-6600K                    C2D E4500
>>>> size      mesa zlib                    mesa zlib
>>>> 4           66   43 -35% +/- 4.8%        43   22 -49% +/- 9.6%
>>>> 32         193  171 -11% +/- 5.8%       129   49 -61% +/- 7.2%
>>>> 64         256  267   4% +/- 4.1%       171   63 -63% +/- 5.4%
>>>> 128        317  389  22% +/- 5.8%       253   89 -64% +/- 4.2%
>>>> 256        364  596  63% +/- 5.6%       304  166 -45% +/- 2.8%
>>>> 512        401  838 108% +/- 5.3%       338  296 -12% +/- 3.1%
>>>> 1024       420 1036 146% +/- 7.6%       375  461  23% +/- 3.7%
>>>> 1M         443 1443 225% +/- 2.1%       403 1175 191% +/- 0.9%
>>>> 100M       448 1452 224% +/- 0.3%       406 1214 198% +/- 0.3%
>>>>
>>>> With hot cache (repeated calls) zlib almost always wins on both CPUS.
>>>> It has been verified the calculation results stay the same after this
>>>> change.
>>>>
>>>> Signed-off-by: Grazvydas Ignotas <[email protected]>
>>>> ---
>>>>  src/util/crc32.c | 13 +++++++++++++
>>>>  1 file changed, 13 insertions(+)
>>>>
>>>> diff --git a/src/util/crc32.c b/src/util/crc32.c
>>>> index f2e01c6..0cffa49 100644
>>>> --- a/src/util/crc32.c
>>>> +++ b/src/util/crc32.c
>>>> @@ -31,12 +31,20 @@
>>>>   *
>>>>   * @author Jose Fonseca
>>>>   */
>>>>
>>>>
>>>> +#ifdef HAVE_ZLIB
>>>> +#include <zlib.h>
>>>> +#endif
>>>>  #include "crc32.h"
>>>>
>>>> +/* For small buffers it's faster to avoid the library call.
>>>> + * The optimal threshold depends on CPU characteristics, it is hoped
>>>> + * the choice below is reasonable for typical modern CPU.
>>>> + */
>>>> +#define ZLIB_SIZE_THRESHOLD 64
>>>
>>> For the actual users of this function in Mesa, is it even possible to
>>> pass less than 64 bytes (I'm assuming that's the units)?
>>
>> Hmm why wouldn't it be? The unit is a byte, and you can compute CRC32
>> of a single byte.
>
> It can be done, but that's not my question.  My question is: would any
> of the existing users actually do this?  I thought the main user was the
> various disk cache / GetProgramBinary kind of things.  You won't have a
> shader binary that's a single byte... less than 64 also seems unlikely.
> Unless there are other users?

You're most likely right, I'll just drop this.

>
>>>>
>>>>  static const uint32_t
>>>>  util_crc32_table[256] = {
>>>>     0x00000000, 0x77073096, 0xee0e612c, 0x990951ba,
>>>>     0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3,
>>>> @@ -112,10 +120,15 @@ uint32_t
>>>>  util_hash_crc32(const void *data, size_t size)
>>>>  {
>>>>     const uint8_t *p = data;
>>>>     uint32_t crc = 0xffffffff;
>>>>
>>>> +#ifdef HAVE_ZLIB
>>>> +   if (size >= ZLIB_SIZE_THRESHOLD && (uInt)size == size)
>>>                                          ^^^^^^^^^^^^^^^^^^
>>> This comparison is always true (unless sizeof(uInt) != sizeof(size_t)?).
>>>  I'm not 100% sure what you're trying to accomplish here, but I think
>>> you want 'size > 0'.  I'm not familiar with this zlib function, so I
>>> don't know what it's expectations for size are.
>>
>> zlib's uInt is always 32bit while size_t is 64bit on most (all?) 64bit
>> architectures, so if someone decides to CRC32 >= 4GiB buffer, this
>> function will do the wrong thing without such check. Newer zlib has
>> crc32_z that takes size_t, but I was trying to avoid build system
>> complications of detecting that function...
>
> Ok.  That makes sense.  I'll bet this adds a warning about tautological
> compares only on 32-bit, then.

I don't seem to be getting a warning on gcc 7.2.0 or 5.4.0 as well as
clang 3.8.0 at least.

>  I was going to suggest comparing with
> 0xffffffff instead, but GCC emits worse code for that... and it probably
> still results in the warning on 32-bit.  Maybe just a comment (based on
> your reply) that describes what's happening for the next person that
> reads the code.

ok
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to