In perl.git, the branch tonyc/133495-digest-md5 has been created
<https://perl5.git.perl.org/perl.git/commitdiff/686a07f48a98cac22ae2aac42df706b176496784?hp=0000000000000000000000000000000000000000>
at 686a07f48a98cac22ae2aac42df706b176496784 (commit)
- Log -----------------------------------------------------------------
commit 686a07f48a98cac22ae2aac42df706b176496784
Author: Tony Cook <[email protected]>
Date: Tue Oct 8 11:18:43 2019 +1100
Digest::MD5 is customized
commit 06e6ef8e32e28d977557b12e340a09bf453c69c0
Author: Tony Cook <[email protected]>
Date: Tue Oct 8 11:10:04 2019 +1100
Matt Turner is now a perl author
commit e31f3d3767e7d53f56f4bf96ac71a6e8dcec3e95
Author: Tony Cook <[email protected]>
Date: Tue Oct 8 10:41:46 2019 +1100
(perl #133495) remove probing for d_u32align
Compiler optimisation meant the probe could return the wrong result in
some cases.
This wasn't a problem on x86, but on platforms where alignment is
required it caused problems.
Strangely enough d_u32align is "define" in the win32 config files,
on any x64 system (the probe only checked on 32-bit systems), on
ARM and on the one i386 build I checked.
This should have little to no effect on performance. For example,
building:
  typedef unsigned long U64;

  U64 b64little(unsigned char *p) {
    return *p | ((U64)p[1] << 8) | ((U64)p[2] << 16) | ((U64)p[3] << 24)
      | ((U64)p[4] << 32) | ((U64)p[5] << 40) | ((U64)p[6] << 48)
      | ((U64)p[7] << 56);
  }

  U64 b64big(unsigned char *p) {
    return ((U64)*p << 56) | ((U64)p[1] << 48) | ((U64)p[2] << 40)
      | ((U64)p[3] << 32) | ((U64)p[4] << 24) | ((U64)p[5] << 16)
      | ((U64)p[6] << 8) | ((U64)p[7]);
  }

  unsigned b32little(unsigned char *p) {
    return *p | ((unsigned)p[1] << 8) | ((unsigned)p[2] << 16)
      | ((unsigned)p[3] << 24);
  }

  unsigned b32big(unsigned char *p) {
    return ((unsigned)p[0] << 24) | ((unsigned)p[1] << 16)
      | ((unsigned)p[2] << 8) | p[3];
  }
with:
gcc -O2 -S test.c
produces:
.file "test.c"
.text
.p2align 4,,15
.globl b64little
.type b64little, @function
b64little:
.LFB0:
.cfi_startproc
movq (%rdi), %rax
ret
.cfi_endproc
.LFE0:
.size b64little, .-b64little
.p2align 4,,15
.globl b64big
.type b64big, @function
b64big:
.LFB1:
.cfi_startproc
movq (%rdi), %rax
bswap %rax
ret
.cfi_endproc
.LFE1:
.size b64big, .-b64big
.p2align 4,,15
.globl b32little
.type b32little, @function
b32little:
.LFB2:
.cfi_startproc
movl (%rdi), %eax
ret
.cfi_endproc
.LFE2:
.size b32little, .-b32little
.p2align 4,,15
.globl b32big
.type b32big, @function
b32big:
.LFB3:
.cfi_startproc
movl (%rdi), %eax
bswap %eax
ret
.cfi_endproc
.LFE3:
.size b32big, .-b32big
.ident "GCC: (Debian 6.3.0-18+deb9u1) 6.3.0 20170516"
.section .note.GNU-stack,"",@progbits
so the compiler is smart enough to recognize the unaligned access code
and optimize it on platforms that support it.
MSVC doesn't seem to optimize this, but since Win32 has been built
with d_u32align=define since 2005, this change will make no
difference.
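As a rough illustration of why the byte-wise form is safe where
alignment is required, the sketch below decodes a 32-bit value starting
at an odd offset in a buffer. The helper names load32_le and load32_be
are hypothetical, modelled on b32little and b32big above; they are not
names used in the perl source.

```c
#include <stdint.h>

/* Hypothetical helper (not from the perl source): little-endian load.
 * Only single bytes are read through p, so p may have any alignment. */
static uint32_t load32_le(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper (not from the perl source): big-endian load,
 * with the same byte-at-a-time access pattern. */
static uint32_t load32_be(const unsigned char *p) {
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

Calling either helper with buf + 1 is well defined on every platform,
because no access wider than one byte is made through the pointer. The
result also does not depend on the byte order of the host.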
commit e8864dba80952684bf3afe83438d4eee0c3939a9
Author: Matt Turner <[email protected]>
Date: Wed Sep 4 21:48:56 2019 -0700
Clean up U8TO*_LE macro implementations
The code guarded by #ifndef U32_ALIGNMENT_REQUIRED attempts to optimize
byte-swapping by doing unaligned loads, but accessing data through
unaligned pointers is undefined behavior in C. Moreover, compilers are
more than capable of recognizing these open-coded byte-swap patterns and
emitting a bswap instruction, or an unaligned load instruction, or a
combined load, etc. There's no need for multiple paths to attain the
desired result.
See https://rt.perl.org/Ticket/Display.html?id=133495
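A minimal sketch of what a single, alignment-independent path can look
like, assuming a load macro and a matching store macro. The names
SKETCH_U8TO32_LE and SKETCH_U32TO8_LE are made up for this example and
are not the macros changed by the commit.

```c
#include <stdint.h>

/* Hypothetical load macro (illustration only): assembles a 32-bit
 * little-endian value from four single-byte reads. */
#define SKETCH_U8TO32_LE(p)                 \
    (  (uint32_t)(p)[0]                     \
    | ((uint32_t)(p)[1] << 8)               \
    | ((uint32_t)(p)[2] << 16)              \
    | ((uint32_t)(p)[3] << 24))

/* Hypothetical store macro (illustration only): writes v as four
 * little-endian bytes. v is copied once so it is evaluated only once. */
#define SKETCH_U32TO8_LE(p, v)                          \
    do {                                                \
        uint32_t v_ = (v);                              \
        (p)[0] = (unsigned char)( v_        & 0xff);    \
        (p)[1] = (unsigned char)((v_ >> 8)  & 0xff);    \
        (p)[2] = (unsigned char)((v_ >> 16) & 0xff);    \
        (p)[3] = (unsigned char)((v_ >> 24) & 0xff);    \
    } while (0)
```

Neither macro casts the byte pointer to a wider type, so there is no
unaligned access through a uint32_t pointer and no need for a separate
path guarded by U32_ALIGNMENT_REQUIRED.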
commit ee9ac1cd8eb988fea70841eae211b11355711416
Author: Matt Turner <[email protected]>
Date: Wed Sep 4 21:04:47 2019 -0700
Digest-MD5: Consolidate byte-swapping paths
The code guarded by #ifndef U32_ALIGNMENT_REQUIRED attempts to optimize
byte-swapping by doing unaligned loads, but accessing data through
unaligned pointers is undefined behavior in C. Moreover, compilers are
more than capable of recognizing these open-coded byte-swap patterns and
emitting a bswap instruction, or an unaligned load instruction, or a
combined load, etc. There's no need for multiple paths to attain the
desired result.
See https://rt.perl.org/Ticket/Display.html?id=133495
-----------------------------------------------------------------------
--
Perl5 Master Repository