Branch: refs/heads/blead
Home: https://github.com/Perl/perl5
Commit: 4430315acc907d880b932bfc3e2808114f35bc95
https://github.com/Perl/perl5/commit/4430315acc907d880b932bfc3e2808114f35bc95
Author: Karl Williamson <[email protected]>
Date: 2023-12-12 (Tue, 12 Dec 2023)
Changed paths:
M inline.h
Log Message:
-----------
inline.h Move finding bit pos fallback code to earlier
Some, but certainly not all, hardware has instructions that find the
position of the highest and/or lowest 1 bit in a word. Others can count
the number of leading or trailing zeros, which leads to the same
information. On some of the platforms that do, the compiler lets C code
request that those instructions be used.
Perl has code that can take advantage of this information for some
common compilers, but importantly, has fallback code to relatively
quickly get the same results for hardware that doesn't have these
instructions, or for compilers that we don't know how to get to use
those instructions.
The instructions tend to be word-size specific, with one for a 32-bit
word and/or one for a 64-bit one. It might be that a 64-bit platform
has an instruction for only that size, and not for a 32-bit quantity.
But getting the compiler to widen a 32-bit value into a 64-bit one
enables the 64-bit instruction to be used.
This commit causes that widening to occur when the 32-bit version isn't
available, but the 64-bit one is.
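The widening described above can be sketched as follows. This is only an illustration: the function name is hypothetical (it is not one of the names used in inline.h), and it assumes a GCC-compatible compiler whose `__builtin_clzll` operates on a 64-bit `unsigned long long`. A plain loop stands in on other compilers so the sketch still builds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the actual inline.h code.
 * Returns the position (0 = least significant) of the highest set bit of
 * a nonzero 32-bit word, using only a 64-bit count-leading-zeros
 * primitive. */
static unsigned msb_pos32_via_64(uint32_t word)
{
    assert(word != 0);                 /* the primitive is undefined for 0 */
    uint64_t wide = (uint64_t) word;   /* zero-extend: upper 32 bits are 0 */
#if defined(__GNUC__) || defined(__clang__)
    /* Assumption: unsigned long long is 64 bits wide on this platform. */
    return 63u - (unsigned) __builtin_clzll(wide);
#else
    /* Portable stand-in for platforms without the builtin. */
    unsigned pos = 0;
    while (wide >>= 1)
        pos++;
    return pos;
#endif
}
```

Because the value is zero-extended, the extra 32 leading zeros are accounted for by subtracting from 63 rather than 31, and no bit isolation is needed beforehand.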
Sometimes we have available an instruction for finding the most
significant set bit, but not the least; or vice versa. The fallback
code when we don't have an appropriate instruction available for the
task at hand is to convert the input to having just a single bit set, in
the position we are looking for. This can be done with a few
bit-oriented instructions. That single bit will be both the most and
least significant bit in the word, so we can use the instruction that is
available on the box to get the desired answer.
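A minimal sketch of that single-bit isolation, using well-known bit tricks rather than the exact code in inline.h. All function names here are hypothetical, and `msb_pos32` is a plain loop standing in for whichever hardware instruction happens to be available.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the lowest set bit. Two's-complement negation flips every
 * bit above the lowest 1, so the AND leaves just that bit. */
static uint32_t isolate_lsb32(uint32_t word)
{
    return word & (~word + 1u);
}

/* Keep only the highest set bit: smear it into every lower position,
 * then subtract off everything below the top bit. */
static uint32_t isolate_msb32(uint32_t word)
{
    word |= word >> 1;
    word |= word >> 2;
    word |= word >> 4;
    word |= word >> 8;
    word |= word >> 16;
    return word - (word >> 1);
}

/* Stand-in for a "most significant bit" instruction (assumption: the
 * platform offers only this direction). */
static unsigned msb_pos32(uint32_t word)
{
    assert(word != 0);
    unsigned pos = 0;
    while (word >>= 1)
        pos++;
    return pos;
}

/* Least significant bit position obtained through the msb primitive:
 * after isolation the single remaining bit is both msb and lsb. */
static unsigned lsb_pos32_via_msb(uint32_t word)
{
    assert(word != 0);
    return msb_pos32(isolate_lsb32(word));
}
```

For the input 0x58 (binary 1011000), `isolate_lsb32` leaves 0x08 and `isolate_msb32` leaves 0x40, so the msb primitive reports position 3 for the lowest bit.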
(If neither type of instruction is available, the position can be
calculated on words containing just a single set bit by using certain
known magic numbers, de Bruijn sequences, with a table lookup and
integer multiplication and shifting.)
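The table-lookup technique can be sketched for 32-bit words as below. The constant 0x077CB531 and its table are the widely circulated ones from public-domain bit-twiddling collections; they are an assumption for illustration and are not necessarily the values used in inline.h. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Maps the top 5 bits of (single_bit * 0x077CB531) to the bit position. */
static const unsigned char debruijn_table32[32] = {
     0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
    31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
};

/* Position of the one set bit in a word that has exactly one bit set.
 * Multiplying by a power of two shifts the de Bruijn sequence left, so
 * the top 5 bits form a window that is unique for each position. */
static unsigned single_bit_pos32(uint32_t single_bit)
{
    assert(single_bit != 0 && (single_bit & (single_bit - 1u)) == 0);
    uint32_t product = (uint32_t) (single_bit * 0x077CB531u);
    return debruijn_table32[product >> 27];
}

/* Least significant set bit of any nonzero word: isolate, then look up. */
static unsigned lsb_pos32_debruijn(uint32_t word)
{
    assert(word != 0);
    return single_bit_pos32(word & (~word + 1u));
}
```

The lookup only works on single-bit inputs, which is why the isolation step has to come first when no suitable instruction exists.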
Prior to this commit, that fallback code would try widening the
parameter in preference to de Bruijn, but only after isolating a single
bit. After this commit, the widening is tried first, which is less work.
The code that would widen after converting to a single bit can never be
compiled, given the preprocessor directives added in this commit, so it
is hereby removed.
The commit removes the comment about why it doesn't use two 32-bit
halves to emulate a 64-bit word when the latter instruction is missing.
That would be even more cumbersome than I had thought due to the
possibility that all the set bits might be in just one of the halves,
and the machine instructions require a non-zero word.
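To illustrate the complication mentioned above, here is a rough sketch of what emulating a 64-bit lookup with 32-bit halves would involve. It is not code from Perl. The names are hypothetical, and `msb_pos32` is a loop standing in for a 32-bit instruction that must not be given zero.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a 32-bit "most significant bit" instruction that, like
 * the real ones, requires a nonzero operand. */
static unsigned msb_pos32(uint32_t word)
{
    assert(word != 0);
    unsigned pos = 0;
    while (word >>= 1)
        pos++;
    return pos;
}

/* Highest set bit of a nonzero 64-bit word using only the 32-bit
 * primitive. Each half must be tested before use, because all the set
 * bits may lie in just one of them. */
static unsigned msb_pos64_from_halves(uint64_t word)
{
    assert(word != 0);
    uint32_t high = (uint32_t) (word >> 32);
    uint32_t low  = (uint32_t) word;
    if (high != 0)
        return 32u + msb_pos32(high);
    return msb_pos32(low);
}
```

The extra test and branch on every call is the kind of overhead that makes this approach less attractive than the other fallbacks.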