13.05.18 20:42, Christian Heimes пише:
I was against the approach a good reason. The PR adds additional CPU
instructions and changes memory access pattern in a critical path of
CPython. There is no reason to add additional overhead for the majority
of users or X86 and X86_64 architectures. The memcpy() call should only
be used on architectures that do not support unaligned memory access.
See comment https://bugs.python.org/issue28055#msg276257
At least for latest GCC, the change seems to be fine. GCC emits the same
assembly code for X86_64 before and after your change. Did you check the
output on other CPU architectures as well as clang and MSVC, too?
For the initial implementation of pyhash.c [1] I proposed a patch that
use memcpy() conditionally to avoid an overhead on Windows:
+#ifdef _MSC_VER
+ block.value = *(const Py_uhash_t*)p;
+#else
+ memcpy(block.bytes, p, SIZEOF_PY_UHASH_T);
+#endif
(and similar code for FNV).
But many developers confirmed that all modern compilers including latest
versions of MS VS optimize memcpy() with a constant size into a single
CPU instruction. Seems avoiding to use memcpy() no longer needed.
If using memcpy() adds an overhead on some platforms we can return to
using a compiler/platform depending code.
[1] https://bugs.python.org/issue19183
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com