Branch: refs/heads/smoke-me/khw-clz
Home: https://github.com/Perl/perl5
Commit: 54a86071331c42bede914b5695f13e7f1b401bc6
https://github.com/Perl/perl5/commit/54a86071331c42bede914b5695f13e7f1b401bc6
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M regcomp.c
Log Message:
-----------
regcomp.c: White-space only
My attempt to insulate the year-old commits, finally pushed as
77a6d54c0deb1165b37dcf11c21cd334ae2579bb and
403d7eb3e4320188571cf61b9dab62ff10799f49, from the leading tab removal
failed miserably.
I think it is some bug in git. Seemingly random groups of lines were
indented differently than adjacent ones.
Anyway, I spent a bunch of time sorting it all out, and this is the
result.
Commit: 6816e8ecec186d3b686e7b99240ba40cccb22eeb
https://github.com/Perl/perl5/commit/6816e8ecec186d3b686e7b99240ba40cccb22eeb
Author: Hugo van der Sanden <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M regcomp.c
Log Message:
-----------
regcomp.c: comments
Comment change suggestions from @hvds in PR #18835.
Commit: cefcda2bebf03e07db81aa2c14caeefb2370e78c
https://github.com/Perl/perl5/commit/cefcda2bebf03e07db81aa2c14caeefb2370e78c
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M perl.h
Log Message:
-----------
Fix ASSUME definition
Commit 5d5b9c460e2a06563d2b5e35a1a79991460696eb fixed the definition of
ASSUME for some purposes, but broke it on Windows. This commit should
fix that. It also adds support for clang's __builtin_assume().
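As a rough illustration of the kind of per-compiler dispatch this message describes, here is a minimal sketch. The names SKETCH_ASSUME and sketch_square_of are hypothetical, and this is not the definition in perl.h. Only clang's __builtin_assume() is named in the commit; the MSVC __assume() and GCC __builtin_unreachable() branches are assumptions added for completeness.

```c
#include <assert.h>

/* SKETCH_ASSUME is a hypothetical stand-in, not perl.h's ASSUME.
 * clang is tested first because it also defines __GNUC__. */
#if defined(__clang__)
#  define SKETCH_ASSUME(x) __builtin_assume(!!(x))
#elif defined(_MSC_VER)
#  define SKETCH_ASSUME(x) __assume(x)
#elif defined(__GNUC__)
#  define SKETCH_ASSUME(x) ((x) ? (void) 0 : __builtin_unreachable())
#else
#  define SKETCH_ASSUME(x) ((void) 0)
#endif

static const unsigned char sketch_squares[4] = { 0, 1, 4, 9 };

/* The caller promises i < 4. */
static unsigned sketch_square_of(unsigned i)
{
    assert(i < 4);          /* checked when assertions are enabled */
    SKETCH_ASSUME(i < 4);   /* optimizer hint only; no run-time check */
    return sketch_squares[i];
}
```

The hint must only ever state something that is really true: if the condition is false at run time, the behavior is undefined.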
Commit: 89a0d3b24c5c68e52ebabd65395fb891a7266f43
https://github.com/Perl/perl5/commit/89a0d3b24c5c68e52ebabd65395fb891a7266f43
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M utf8.h
Log Message:
-----------
utf8.h: Refactor UTF8_IS_INVARIANT
This commit changes nothing on ASCII platforms, but potentially saves a
branch on EBCDIC ones by using the fact that the input is a single byte;
we don't have to check if it is bigger than a byte.
Commit: 56deeb090b09aeff35be357d74e5043f9cf8653a
https://github.com/Perl/perl5/commit/56deeb090b09aeff35be357d74e5043f9cf8653a
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M utf8.h
Log Message:
-----------
Refactor UTF_START_MASK()
A slight change to this very low level macro (hence called a lot)
removes the need for a conditional, and causes it to work on single-byte
UTF-8 characters on ASCII platforms.
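The message does not show the macro body, so the following is only a sketch of how a conditional-free start-byte mask can work, assuming the mask is 0xFF shifted right by the sequence length. The function names are hypothetical and this is not the text of utf8.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask selecting the payload bits of the start
 * byte of a 'len'-byte UTF-8 sequence, with no conditional.
 *   len 1 -> 0x7F (0xxxxxxx)
 *   len 2 -> 0x3F (110xxxxx; the extra bit kept is the mandatory 0)
 *   len 3 -> 0x1F (1110xxxx; likewise) */
static uint8_t sketch_start_mask(unsigned len)
{
    assert(len >= 1 && len <= 7);
    return (uint8_t) (0xFFu >> len);
}

/* Extract the code point bits carried by the start byte. */
static uint8_t sketch_start_payload(uint8_t start_byte, unsigned len)
{
    return (uint8_t) (start_byte & sketch_start_mask(len));
}
```

For multi-byte lengths the mask is one bit wider than the payload, but that bit is always 0 in a well-formed start byte, so the result is unchanged.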
Commit: b337367206ff47ec777f970eef3d2dc1c49b902d
https://github.com/Perl/perl5/commit/b337367206ff47ec777f970eef3d2dc1c49b902d
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M inline.h
Log Message:
-----------
Perl_variant_byte_number: Generalize
The current mechanism doesn't work if the lowest bit is the one set. At
the moment that doesn't matter as we aren't looking at that bit anyway.
But a future commit will refactor things so that bit will be looked at.
So prepare for that. The new expression is simpler, besides.
Commit: d92503342d820d2a52154c0b1529e97576a5cb22
https://github.com/Perl/perl5/commit/d92503342d820d2a52154c0b1529e97576a5cb22
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M inline.h
Log Message:
-----------
Perl_variant_byte_number: Move assert()
This should be called only when it is known there is a variant byte.
The assert() previously wasn't checking that precisely.
Commit: 2ed8aa162dbb03e10c2c69b7d61b0a6e5c0e7679
https://github.com/Perl/perl5/commit/2ed8aa162dbb03e10c2c69b7d61b0a6e5c0e7679
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M perl.h
Log Message:
-----------
perl.h: Create PERL_UINTMAX_SIZE
This is the sizeof() of the widest unsigned type available on the
platform. This #define exists so that the value can be used in
preprocessor expressions.
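The reason a #define is needed is that sizeof() cannot be evaluated in an #if. A minimal sketch of the idea, assuming 8-bit bytes and using <stdint.h> limits; the SKETCH_ names are hypothetical and perl.h may well derive its value differently:

```c
#include <assert.h>
#include <stdint.h>

/* Derive the size from limits the preprocessor can compare. */
#if UINTMAX_MAX == UINT64_MAX
#  define SKETCH_UINTMAX_SIZE 8
#elif UINTMAX_MAX == UINT32_MAX
#  define SKETCH_UINTMAX_SIZE 4
#else
#  error "this sketch handles only 32- and 64-bit uintmax_t"
#endif

/* Now the size can steer conditional compilation. */
#if SKETCH_UINTMAX_SIZE >= 8
#  define SKETCH_HAS_64BIT_WORD 1
#else
#  define SKETCH_HAS_64BIT_WORD 0
#endif

/* Number of bits in the widest unsigned type (assumes 8-bit bytes). */
static unsigned sketch_word_bits(void)
{
    assert(SKETCH_UINTMAX_SIZE == sizeof(uintmax_t));
    return SKETCH_UINTMAX_SIZE * 8;
}
```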
Commit: 2f1acd3852d8645536354bd582f0cbcb2721c5c0
https://github.com/Perl/perl5/commit/2f1acd3852d8645536354bd582f0cbcb2721c5c0
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M embed.fnc
M embed.h
M globvar.sym
M inline.h
M perl.h
M proto.h
Log Message:
-----------
Create single_1bit_pos()
Given a word known to have exactly a single bit set, this function uses
deBruijn sequences to quickly calculate its position.
I took the 32 and 64 bit word versions from the internet. They differ
in how they treat a word with no set bits. But this is considered
undefined behavior, so that difference is immaterial.
Apparently figuring this out uses brute force methods, and so I decided
to live with this difference, rather than to expend the time needed to
bring them into sync.
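For readers unfamiliar with the technique, here is a sketch of the widely published 32-bit de Bruijn lookup. The function and table names are hypothetical, and this is not necessarily the code added to inline.h. Multiplying by the de Bruijn constant shifts it left by the bit position, so the top five bits of the product form a unique table index.

```c
#include <assert.h>
#include <stdint.h>

/* Lookup table paired with the de Bruijn constant 0x077CB531. */
static const uint8_t sketch_debruijn32_pos[32] = {
     0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
    31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
};

/* Position (0 = least significant) of the only set bit in 'word'.
 * The result is meaningless unless exactly one bit is set. */
static unsigned sketch_single_1bit_pos32(uint32_t word)
{
    assert(word != 0 && (word & (word - 1)) == 0);
    return sketch_debruijn32_pos[
                (uint32_t) (word * UINT32_C(0x077CB531)) >> 27];
}
```

The cost is one multiply, one shift, and one table read, independent of which bit is set.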
Commit: c30616279347670f016f332b507c3af612116252
https://github.com/Perl/perl5/commit/c30616279347670f016f332b507c3af612116252
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M regcomp.c
Log Message:
-----------
regcomp.c: Use single_1bit_pos()
This new function replaces the code in this file. The new function will
operate on 64-bit words, whereas only 32 bits are required here. But
there should be no slowdown.
Commit: dd9708f865d07c83dbc0f3ddfbf96b47df1f8587
https://github.com/Perl/perl5/commit/dd9708f865d07c83dbc0f3ddfbf96b47df1f8587
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M embed.fnc
M embed.h
M inline.h
M proto.h
Log Message:
-----------
Create and use my_ffs()
This is a function to return the position of the least significant 1 bit
in a word. In this commit it just isolates that bit and uses the
function single_1bit_pos() created 2 commits ago.
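The isolation step mentioned above can be sketched as follows. The names are hypothetical, and the position loop is only a slow stand-in for single_1bit_pos() so that the example is self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the least significant set bit: in unsigned arithmetic,
 * word & -word clears every bit above the lowest 1. */
static uint32_t sketch_isolate_lsbit(uint32_t word)
{
    return word & (uint32_t) (0u - word);
}

/* Position of the least significant set bit (0-based). */
static unsigned sketch_lsbit_pos(uint32_t word)
{
    uint32_t bit;
    unsigned pos = 0;

    assert(word != 0);
    bit = sketch_isolate_lsbit(word);

    /* Stand-in for a fast single-bit position routine. */
    while ((bit >> pos) != 1u)
        pos++;
    return pos;
}
```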
Commit: fcd133cbc3e01e9bf62efd6a4bde4d8a5de1779f
https://github.com/Perl/perl5/commit/fcd133cbc3e01e9bf62efd6a4bde4d8a5de1779f
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M embed.fnc
M embed.h
M inline.h
M proto.h
Log Message:
-----------
Create and use my_msbit_pos()
This inline function finds the bit position of the most significant 1
bit in a word. It basically separates out code from another inline
function, so that it can be called more generally.
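The message does not spell out the method, so the following only illustrates one common portable approach: smear the top set bit downward, then subtract to leave that bit alone. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return 'word' with only its most significant set bit kept. */
static uint32_t sketch_isolate_msbit32(uint32_t word)
{
    assert(word != 0);

    /* Copy the highest 1 into every lower position. */
    word |= word >> 1;
    word |= word >> 2;
    word |= word >> 4;
    word |= word >> 8;
    word |= word >> 16;

    /* All ones up to the top bit, minus all ones below it. */
    return word - (word >> 1);
}
```

The isolated bit can then be passed to a single-bit position routine such as single_1bit_pos().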
I took the liberty of clarifying the comment and explicitly #ifdef'ing
out a line of code instead of relying on the optimizer to skip it.
Commit: b4df92bfa61fca110144fd9f1c640fff1c5b91fb
https://github.com/Perl/perl5/commit/b4df92bfa61fca110144fd9f1c640fff1c5b91fb
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M perl.h
Log Message:
-----------
Check for clz clzl availability
This defines PERL_USE_CLZ() to be the proper function to call on a
system to execute the equivalent of the count leading zero instruction
on the widest machine word. It is left undefined if it couldn't find an
appropriate call.
clzll also exists, for long long, but I don't think we have long longs
different from plain long on platforms besides Windows, which doesn't
have this anyway.
Commit: d9f25f347bac70536961580bf81051568c83857d
https://github.com/Perl/perl5/commit/d9f25f347bac70536961580bf81051568c83857d
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M inline.h
Log Message:
-----------
Use 'count leading zeros', if have it
This changes two functions to use clz if available. On platforms with
that instruction, the expansion is likely to be just it and a subtract,
beating the deBruijn method previously used.
And if that instruction isn't present, the libc emulation is likely to
be as fast as possible, again beating the hand-rolled deBruijn method.
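The "clz and a subtract" shape can be sketched like this, assuming a GCC-compatible compiler for the fast path. The function name is hypothetical, and the fallback loop merely stands in for whatever slower method a platform would otherwise use.

```c
#include <assert.h>
#include <limits.h>

/* Position (0-based) of the most significant set bit of 'word'. */
static unsigned sketch_msbit_pos(unsigned long word)
{
    assert(word != 0);   /* clz of 0 is undefined */
#if defined(__GNUC__) || defined(__clang__)
    /* Highest bit index minus the count of leading zeros. */
    return (unsigned) (sizeof(word) * CHAR_BIT - 1)
         - (unsigned) __builtin_clzl(word);
#else
    {
        unsigned pos = 0;
        while ((word >>= 1) != 0)
            pos++;
        return pos;
    }
#endif
}
```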
Commit: a3c6cc360c0ceb713dc77b738721c1fd18385daf
https://github.com/Perl/perl5/commit/a3c6cc360c0ceb713dc77b738721c1fd18385daf
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M perl.h
Log Message:
-----------
Add macros to call the appropriate ffs() function
ffs finds the least significant 1 bit. POSIX defines it only for
integer arguments, but many implementations have a version for longs as
well.
This commit defines a macro for the appropriate one for this platform,
if any. Future commits will use this.
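A minimal sketch of wrapping ffs(), assuming a POSIX system where <strings.h> declares it. SKETCH_FFS and the function name are hypothetical and are not the macro this commit adds. Note that ffs() is 1-based and returns 0 when no bit is set.

```c
#include <assert.h>
#include <strings.h>   /* POSIX declares ffs() here */

/* Hypothetical wrapper macro; a real one would pick ffs(), ffsl(), ...
 * according to what the platform provides. */
#define SKETCH_FFS(x) ffs(x)

/* Convert the 1-based result of ffs() to a 0-based bit position. */
static int sketch_lsbit_pos_ffs(int word)
{
    assert(word != 0);
    return SKETCH_FFS(word) - 1;
}
```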
Commit: 44b042c5ef18f5c425034538904eeea2b87174b9
https://github.com/Perl/perl5/commit/44b042c5ef18f5c425034538904eeea2b87174b9
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M inline.h
Log Message:
-----------
Use ffs() if available
This POSIX function is likely to be faster than our hand-rolled code.
Commit: a02ef6ae3af0a4d022c79b37b449deb122768f0a
https://github.com/Perl/perl5/commit/a02ef6ae3af0a4d022c79b37b449deb122768f0a
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M globvar.sym
M utf8.h
Log Message:
-----------
Reimplement OFFUNISKIP
Now that previous commits have made it fast to find the position of the
first set bit in a word, we can use a table lookup of that to find how
many bytes the UTF-8 of that will occupy. This allows for
simplification of this function.
The table is 64 bytes long on a 64-bit machine, each use saving the need
to have 7 or so conditionals generated.
This base level function is called by higher level macros, like
UVCHR_SKIP.
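To illustrate the table-lookup idea, here is a sketch restricted to Unicode code points (at most 0x10FFFF) on an ASCII platform, so the table has 21 entries rather than one per bit of the word. All names are hypothetical, and the loop in sketch_msbit_pos is a slow stand-in for the fast routine.

```c
#include <assert.h>
#include <stdint.h>

/* UTF-8 length indexed by the position of the highest set bit. */
static const uint8_t sketch_len_by_msbit[21] = {
    1, 1, 1, 1, 1, 1, 1,   /* bits  0..6  : up to U+007F   */
    2, 2, 2, 2,            /* bits  7..10 : up to U+07FF   */
    3, 3, 3, 3, 3,         /* bits 11..15 : up to U+FFFF   */
    4, 4, 4, 4, 4          /* bits 16..20 : up to U+10FFFF */
};

/* Stand-in for a fast most-significant-bit position routine. */
static unsigned sketch_msbit_pos(uint32_t word)
{
    unsigned pos = 0;
    assert(word != 0);
    while ((word >>= 1) != 0)
        pos++;
    return pos;
}

/* Number of bytes the UTF-8 encoding of 'cp' occupies. */
static unsigned sketch_utf8_len(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    /* OR-ing in bit 0 makes cp == 0 safe without changing any length. */
    return sketch_len_by_msbit[sketch_msbit_pos(cp | 1u)];
}
```

One lookup replaces the chain of range comparisons that would otherwise be needed.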
Commit: 037bf130684644ec4ecd3259a6afe00d708880f8
https://github.com/Perl/perl5/commit/037bf130684644ec4ecd3259a6afe00d708880f8
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M embed.fnc
M proto.h
M utf8.c
Log Message:
-----------
utf8.c: Change formal parameter name to fcn
This will make the next commit easier to follow.
Commit: 65e8d608f92b18947959bc653f9db192c116fa90
https://github.com/Perl/perl5/commit/65e8d608f92b18947959bc653f9db192c116fa90
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M utf8.c
Log Message:
-----------
Refactor uvoffuni_to_utf8_flags_msgs
Having a fast UVOFFUNISKIP() allows this function to be refactored to
simplify it.
This shortchanges EBCDIC by a little. For example, it checks if a
4-byte character is above Unicode, but no 4-byte characters fit that
description in UTF-EBCDIC. If that were ever a concern, the tests for
that and for non-characters could be extracted out into static
functions, and their calls #ifdefd to the proper length case statement.
Commit: 8276b3aebc952fe3c79601c9e1eefc59732357bb
https://github.com/Perl/perl5/commit/8276b3aebc952fe3c79601c9e1eefc59732357bb
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M regcomp.h
Log Message:
-----------
regcomp.h: Add internal macro
This returns the locale bitmap field from an ANYOF node.
In this commit, it just tidies up the code, omitting lengthy casts. But
a future commit will use it on its own.
Commit: 79fe698fd8683c55b9c7fa4bd402742a263f2dc9
https://github.com/Perl/perl5/commit/79fe698fd8683c55b9c7fa4bd402742a263f2dc9
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M regexec.c
Log Message:
-----------
regexec.c: Use new bit fcns to avoid iterations
Before this commit, the code looped through a bitmap looking for a set
bit. Now that we have fast ways to find where a set bit is, use them,
and avoid the fruitless iterations.
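The difference from a bit-by-bit scan can be sketched as below: each step jumps straight to the next set bit instead of testing the zero bits in between. The function name is hypothetical, this is not the regexec.c code, and the __builtin_ctz() fast path assumes a GCC-compatible compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Position of the n-th (0-based) set bit of 'bitmap', counting from
 * the least significant end, or -1 if there are not that many. */
static int sketch_nth_set_bit(uint32_t bitmap, unsigned n)
{
    assert(n < 32);

    /* Drop the n lowest set bits; each step clears exactly one. */
    while (n-- > 0 && bitmap != 0)
        bitmap &= bitmap - 1;

    if (bitmap == 0)
        return -1;

#if defined(__GNUC__) || defined(__clang__)
    return __builtin_ctz(bitmap);   /* count trailing zeros */
#else
    {
        int pos = 0;                /* slow portable stand-in */
        while (((bitmap >> pos) & 1u) == 0)
            pos++;
        return pos;
    }
#endif
}
```

For the bitmap 0x8005, successive calls with n = 0, 1, 2 visit only bits 0, 2, and 15.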
Commit: 55c27cc58fac32756b3c211750bfbf18fba878e2
https://github.com/Perl/perl5/commit/55c27cc58fac32756b3c211750bfbf18fba878e2
Author: Karl Williamson <[email protected]>
Date: 2021-06-12 (Sat, 12 Jun 2021)
Changed paths:
M charclass_invlists.h
M lib/unicore/uni_keywords.pl
M regcharclass.h
M regen/charset_translations.pl
M uni_keywords.h
Log Message:
-----------
regen/charset_translations.pl: Use revised macros
The previous two commits have revised two UTF-8 macros. This perl file
emulates those macros; change it to use the new definitions.
Commit: 5560481701488217416df1b32a29b9a4e3092cd3
https://github.com/Perl/perl5/commit/5560481701488217416df1b32a29b9a4e3092cd3
Author: Karl Williamson <[email protected]>
Date: 2021-06-13 (Sun, 13 Jun 2021)
Changed paths:
M globvar.sym
M inline.h
M regcomp.c
M utf8.c
M utf8.h
M utfebcdic.h
Log Message:
-----------
smoke
Compare: https://github.com/Perl/perl5/compare/5e7525dc5ce0...556048170148