[PATCH 0/12] GCC _BitInt support [PR102989]

Jakub Jelinek via Gcc-patches Wed, 09 Aug 2023 11:14:36 -0700

Hi!

The following patch series introduces support for C23 bit-precise integer
types.  In short, they are similar to other integral types in many ways,
they just aren't subject to integer promotions if narrower than int and
they can have much wider precisions than ordinary integer types.


This series includes and thus subsumes all so far uncommitted _BitInt related
patches.  Compared to the last posted series, there is bit-field _BitInt
support, _Atomic/stdatomic.h support, and conversions between
_Decimal{32,64,128} and _BitInt and vice versa.  Compared to what has been
posted before, that last item has a fix for the large powers of 10
computations, which with the _BitInt(575) limitation can't really be seen so
far, but I've tried to call the underlying routines with very large arrays of
limbs; in addition to that, the generated tables header has been made more
compact.  Richard's patch review feedback has been incorporated and the
series has been further split into more patches.

It is enabled only on targets which have agreed in their processor specific
ABI on how to lay those types out and pass them as function arguments/return
values, which currently is just x86-64 I believe.  It would be nice if target
maintainers helped to get agreement on psABI changes so that GCC 14 could
enable it on far more architectures than just one.

C23 says that <limits.h> defines the BITINT_MAXWIDTH macro, which is the
largest supported precision of the _BitInt types, and that it must be at
least the precision of unsigned long long (but due to lack of psABI
agreement we'll violate that on architectures which don't have the support
done yet).  The following series for the time being uses just
WIDE_INT_MAX_PRECISION as that BITINT_MAXWIDTH, with the intent to increase
it incrementally later on.  WIDE_INT_MAX_PRECISION is 575 bits on x86_64,
but will be even smaller on lots of architectures.  This is the largest
precision we can support without changes to the wide_int/widest_int
representation (to make those non-POD and allow use of some allocated buffer
rather than the included fixed size one).  Once that is overcome, there is
another internally enforced limit: INTEGER_CST in its current layout allows
at most 255 64-bit limbs, which is 16320 bits as another cap.  And if that
is overcome, then we have the limitation of TYPE_PRECISION being 16-bit, so
65535 as maximum precision.  Perhaps we could make TYPE_PRECISION dependent
on BITINT_TYPE vs. others and use 32-bit precision in that case later.
Latest Clang/LLVM I think supports on paper up to 8388608 bits, but is
hardly usable even with much shorter precisions.
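
To make the arithmetic behind those caps concrete, here is a small sketch in
C.  The names are made up for this illustration and are not GCC identifiers;
only the numbers (64-bit limbs, 255 limbs, a 16-bit precision field, 575
bits) come from the text above.

```c
#include <assert.h>

/* Hypothetical names for the limits quoted above; not GCC macros.  */
enum
{
  EX_LIMB_BITS = 64,            /* bits per limb */
  EX_MAX_CST_LIMBS = 255,       /* most limbs an INTEGER_CST can hold */
  EX_TYPE_PRECISION_BITS = 16   /* width of the TYPE_PRECISION field */
};

/* Number of 64-bit limbs needed to hold PREC bits (rounded up).  */
unsigned
ex_limbs_for (unsigned prec)
{
  assert (prec >= 1);
  return (prec + EX_LIMB_BITS - 1) / EX_LIMB_BITS;
}

/* Largest precision an INTEGER_CST could represent: 255 * 64 bits.  */
unsigned
ex_integer_cst_cap (void)
{
  return EX_MAX_CST_LIMBS * EX_LIMB_BITS;
}

/* Largest value of a 16-bit unsigned field: 2^16 - 1.  */
unsigned
ex_type_precision_cap (void)
{
  return (1u << EX_TYPE_PRECISION_BITS) - 1;
}
```

With these definitions a 575-bit value needs 9 limbs, one bit short of
filling them completely.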

Besides this hopefully temporary cap on supported precision and support
only on targets which buy into it, the support has the following limitations:

- _Complex _BitInt(N) isn't supported; again mainly because none of the psABIs
  mention how those should be passed/returned; in a limited way they are
  supported internally because the internal functions into which
  __builtin_{add,sub,mul}_overflow{,_p} are lowered return COMPLEX_TYPE as a
  hack to return 2 values without using references/pointers

- vectors of _BitInt(N) aren't supported, both because psABIs don't specify
  how that works and because I'm not really sure it would be useful given
  lack of hw support for anything but bit-precise integers with the same
  bit precision as standard integer types

Because the bit-precise types have different behavior both in the C FE
(e.g. the lack of promotion) and do or can have different behavior in type
layout and function argument passing/returning values, the patch introduces
a new integral type, BITINT_TYPE, so various spots which explicitly check
for INTEGER_TYPE and not, say, the INTEGRAL_TYPE_P macro need to be adjusted.
Also the assumption that all integral types have a scalar integer mode
is no longer true, larger BITINT_TYPEs have BLKmode.

The patch makes 4 different categories of _BitInt depending on the target hook
decisions and their precision.  The x86-64 psABI says that _BitInts which fit
into signed/unsigned char, short, int, long and long long are laid out and
passed as those types (with padding bits undefined if they don't have mode
precision).  Such smallest precision bit-precise integer types are categorized
as small; for a specific precision the target hook gives a scalar integral mode
where a single such mode contains all the bits.  Such small _BitInt types are
generally kept in the IL until expansion into RTL, with minor tweaks during
expansion to avoid relying on the padding bit values.

All larger precision _BitInt types are supposed to be handled as a structure
containing an array of limbs or so, where a limb has some integral mode (for
libgcc purposes best if it has word size) and the limbs have either little or
big endian ordering in the array.  The padding bits in the most significant
limb, if any, are either undefined or should be always sign/zero extended (but
support for this isn't in yet, we don't know if any psABI will require it).
As mentioned in some psABI proposals, while currently there is just one limb
mode, if the limb ordering would follow normal target endianness, there is
always a possibility to have two limb modes, one used for ABI purposes (in
alignment/size decisions) and another one used during the actual lowering or
in libgcc helpers.

The second _BitInt category is called medium in the series; those are _BitInt
precisions which need more than one limb, but the precision is still no larger
than TImode precision (or DImode on targets which don't support __int128).
Most arithmetic on such types can be lowered simply to casts to the
larger/equal precision {,unsigned} {long long,__int128} type, performing the
arithmetic on normal integers and then casting back.

Larger _BitInt precisions typically will have BLKmode and will be lowered in
a new bitintlower* pass right after complex lowering (for -O1+ it is shortly
after IPA) into series of operations on individual limbs.  The series talks
about large and huge _BitInts; large ones are up to one bit smaller than 4
limbs and are lowered in most places into straight line code iterating over
the limbs, and huge ones are those which use some loop to handle most of the
limbs and only handle up to 2 limbs before or after the loop.
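
The four categories can be sketched as a simple classification on the
precision.  This is only an illustration under assumed x86-64-like
parameters (64-bit limbs, 128 bits as the widest scalar integer mode); the
names are hypothetical, and in the series the thresholds come from the
target hook rather than from constants like these.

```c
#include <assert.h>

/* Hypothetical category names for this sketch.  */
enum ex_bitint_kind { EX_SMALL, EX_MEDIUM, EX_LARGE, EX_HUGE };

/* Assumed parameters: 64-bit limbs, TImode (128 bits) available.  */
enum { EX_LIMB_PREC = 64, EX_MAX_SCALAR_PREC = 128 };

enum ex_bitint_kind
ex_classify (int prec)
{
  assert (prec >= 1);
  /* Fits into a single limb-sized scalar mode.  */
  if (prec <= EX_LIMB_PREC)
    return EX_SMALL;
  /* Needs more than one limb but still fits the widest scalar mode.  */
  if (prec <= EX_MAX_SCALAR_PREC)
    return EX_MEDIUM;
  /* Up to one bit smaller than 4 limbs: straight line code.  */
  if (prec < 4 * EX_LIMB_PREC)
    return EX_LARGE;
  /* Everything else is handled with a loop over the limbs.  */
  return EX_HUGE;
}
```

Under these assumptions _BitInt(65) and _BitInt(128) would be medium,
_BitInt(129) through _BitInt(255) large, and _BitInt(256) and up huge.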

Most operations, like bitwise operations, addition, subtraction, left shift by
a constant smaller than limb precision, some casts, ==/!= comparisons and
loads/stores, are handled in a loop with 2 limbs per iteration followed by 0, 1
or 2 limbs handled after it.  These are called mergeable in the series and the
loop handles perhaps many different operations with a single use in the same bb.
Comparisons >/>=/</<= are handled, optionally together with operand casts and
loads, in one optional straight line handling of the most significant limb
(unless unsigned and precision is a multiple of limb precision) followed by a
loop handling one limb at a time from more significant down to least
significant.
Other operations like arbitrary left shifts or all right shifts are also handled
in a loop doing one limb at a time but possibly accessing some other limb.
Multiplication, division, modulo and floating point to/from _BitInt conversions
are handled using libgcc library routines.
__builtin_{add,sub}_overflow are handled similarly to addition/subtraction but
are not mergeable with anything except implicit or explicit casts/loads and
track the carry at the end.
__builtin_mul_overflow is implemented by using infinite precision library
multiplication (from range info we determine ranges of operands and use possibly
a temporary array to hold a large enough result) and then checking whether all
the remaining bits are zero resp. sign bit copies.
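
As a rough picture of what the limb-wise code does, the following sketch
adds two unsigned values stored as little-endian arrays of 64-bit limbs,
two limbs per loop iteration with a tail for the rest, and compares two
such values starting from the most significant limb.  It is hand-written C
for illustration only, with hypothetical function names; it assumes the
precision is a multiple of the limb size (so there are no padding bits)
and is not the GIMPLE the lowering pass emits.

```c
#include <stddef.h>
#include <stdint.h>

/* Add one limb with carry in; store the sum and return the carry out.  */
uint64_t
ex_add_one (uint64_t *r, uint64_t a, uint64_t b, uint64_t carry_in)
{
  uint64_t s = a + b;
  uint64_t carry = s < a;       /* wrapped around in a + b */
  uint64_t t = s + carry_in;
  carry |= t < s;               /* wrapped around adding the carry */
  *r = t;
  return carry;
}

/* r = a + b over N little-endian limbs; returns the final carry.  */
uint64_t
ex_add_limbs (uint64_t *r, const uint64_t *a, const uint64_t *b, size_t n)
{
  uint64_t carry = 0;
  size_t i = 0;
  /* Main loop: two limbs per iteration.  */
  for (; i + 2 <= n; i += 2)
    {
      carry = ex_add_one (&r[i], a[i], b[i], carry);
      carry = ex_add_one (&r[i + 1], a[i + 1], b[i + 1], carry);
    }
  /* Tail: at most one limb is left in this simplified version.  */
  if (i < n)
    carry = ex_add_one (&r[i], a[i], b[i], carry);
  return carry;
}

/* Unsigned comparison, one limb at a time from the most significant
   limb down; returns -1, 0 or 1.  */
int
ex_cmp_limbs (const uint64_t *a, const uint64_t *b, size_t n)
{
  for (size_t i = n; i-- > 0; )
    if (a[i] != b[i])
      return a[i] < b[i] ? -1 : 1;
  return 0;
}
```

The carry returned by ex_add_limbs is also the kind of information an
unsigned __builtin_add_overflow style check would look at after the last
limb.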

The libgcc library routines for multiplication, division, modulo and
conversions to/from floating point use a special calling convention, where for
each _BitInt a pointer to an array of limbs and a precision are passed.  The
precision is signed SImode: if positive, it is a known minimum precision in
bits of an unsigned operand; if negative, its absolute value is a known minimum
precision in bits of a signed operand.  That way, the compiler using e.g. range
information can already pre-reduce the precision and at runtime libgcc can
reduce it further by skipping over most significant limbs which contain just
zeros or sign bit copies.  Small _BitInt types can otherwise be passed
differently, but for passing those to the libgcc routines they need to be
forced into an array of limbs as well (typically just one or two limbs).
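
The runtime reduction described above can be sketched as follows.  The
function takes the limb pointer and the signed precision argument in the
convention just described and returns how many limbs still carry
information.  The function names, the 64-bit limb size and the
little-endian limb order are assumptions of this sketch; it does not
reproduce the actual libgcc code.

```c
#include <assert.h>
#include <stdint.h>

/* Return the most significant limb with any padding bits replaced by
   zero extension (unsigned) or sign extension (signed).  */
uint64_t
ex_top_limb (const uint64_t *limbs, int bits, int is_signed)
{
  int n = (bits + 63) / 64;
  int rem = bits % 64;
  uint64_t top = limbs[n - 1];
  if (rem != 0)
    {
      uint64_t mask = (UINT64_C (1) << rem) - 1;
      top &= mask;
      if (is_signed && ((top >> (rem - 1)) & 1))
        top |= ~mask;
    }
  return top;
}

/* PREC > 0: unsigned operand of PREC bits.
   PREC < 0: signed operand of -PREC bits.
   Returns the number of limbs left after skipping most significant
   limbs that are just zeros or sign bit copies.  */
int
ex_significant_limbs (const uint64_t *limbs, int prec)
{
  assert (prec != 0);
  int is_signed = prec < 0;
  int bits = is_signed ? -prec : prec;
  int n = (bits + 63) / 64;
  uint64_t top = ex_top_limb (limbs, bits, is_signed);

  while (n > 1)
    {
      uint64_t below = limbs[n - 2];
      /* What the top limb must look like to be redundant.  */
      uint64_t redundant = 0;
      if (is_signed && (below >> 63) != 0)
        redundant = UINT64_MAX;
      if (top != redundant)
        break;
      top = below;
      n--;
    }
  return n;
}
```

Note that the same limbs can reduce differently depending on the sign of
the precision argument: an array of all-ones limbs is -1 in a single limb
when signed, but needs every limb when unsigned.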

The whole series has been successfully bootstrapped/regtested on x86_64-linux
and i686-linux.

Jakub Jelinek (12):
  expr: Small optimization [PR102989]
  lto-streamer-in: Adjust assert [PR102989]
  phiopt: Fix phiopt ICE on vops [PR102989]
  Middle-end _BitInt support [PR102989]
  _BitInt lowering support [PR102989]
  i386: Enable _BitInt on x86-64 [PR102989]
  ubsan: _BitInt -fsanitize=undefined support [PR102989]
  libgcc: Generated tables for _BitInt <-> _Decimal* conversions [PR102989]
  libgcc _BitInt support [PR102989]
  C _BitInt support [PR102989]
  testsuite part 1 for _BitInt support [PR102989]
  testsuite part 2 for _BitInt support [PR102989]

 gcc/Makefile.in                                  |    1 
 gcc/builtins.cc                                  |    7 
 gcc/c-family/c-common.cc                         |  261 
 gcc/c-family/c-common.h                          |    2 
 gcc/c-family/c-cppbuiltin.cc                     |   23 
 gcc/c-family/c-lex.cc                            |  164 
 gcc/c-family/c-pretty-print.cc                   |   32 
 gcc/c-family/c-ubsan.cc                          |    4 
 gcc/c/c-convert.cc                               |    1 
 gcc/c/c-decl.cc                                  |  194 
 gcc/c/c-parser.cc                                |   27 
 gcc/c/c-tree.h                                   |   18 
 gcc/c/c-typeck.cc                                |  132 
 gcc/cfgexpand.cc                                 |    4 
 gcc/config/i386/i386.cc                          |   33 
 gcc/convert.cc                                   |    8 
 gcc/doc/generic.texi                             |    9 
 gcc/doc/tm.texi                                  |   15 
 gcc/doc/tm.texi.in                               |    2 
 gcc/dwarf2out.cc                                 |   43 
 gcc/expr.cc                                      |   71 
 gcc/fold-const.cc                                |   75 
 gcc/gimple-expr.cc                               |    9 
 gcc/gimple-fold.cc                               |   82 
 gcc/gimple-lower-bitint.cc                       | 6074 +++++++++++++++++++++++
 gcc/gimple-lower-bitint.h                        |   31 
 gcc/glimits.h                                    |    5 
 gcc/internal-fn.cc                               |  145 
 gcc/internal-fn.def                              |    6 
 gcc/internal-fn.h                                |    4 
 gcc/lto-streamer-in.cc                           |    2 
 gcc/match.pd                                     |    1 
 gcc/passes.def                                   |    3 
 gcc/pretty-print.h                               |   19 
 gcc/stor-layout.cc                               |   86 
 gcc/target.def                                   |   19 
 gcc/target.h                                     |   14 
 gcc/targhooks.cc                                 |    8 
 gcc/targhooks.h                                  |    1 
 gcc/testsuite/gcc.dg/atomic/stdatomic-bitint-1.c |  442 +
 gcc/testsuite/gcc.dg/atomic/stdatomic-bitint-2.c |  450 +
 gcc/testsuite/gcc.dg/bitint-1.c                  |   26 
 gcc/testsuite/gcc.dg/bitint-10.c                 |   15 
 gcc/testsuite/gcc.dg/bitint-11.c                 |    9 
 gcc/testsuite/gcc.dg/bitint-12.c                 |   31 
 gcc/testsuite/gcc.dg/bitint-13.c                 |   17 
 gcc/testsuite/gcc.dg/bitint-14.c                 |   11 
 gcc/testsuite/gcc.dg/bitint-15.c                 |   10 
 gcc/testsuite/gcc.dg/bitint-16.c                 |   31 
 gcc/testsuite/gcc.dg/bitint-17.c                 |   47 
 gcc/testsuite/gcc.dg/bitint-18.c                 |   44 
 gcc/testsuite/gcc.dg/bitint-2.c                  |  116 
 gcc/testsuite/gcc.dg/bitint-3.c                  |   40 
 gcc/testsuite/gcc.dg/bitint-4.c                  |   39 
 gcc/testsuite/gcc.dg/bitint-5.c                  |   63 
 gcc/testsuite/gcc.dg/bitint-6.c                  |   15 
 gcc/testsuite/gcc.dg/bitint-7.c                  |   16 
 gcc/testsuite/gcc.dg/bitint-8.c                  |   34 
 gcc/testsuite/gcc.dg/bitint-9.c                  |   52 
 gcc/testsuite/gcc.dg/dfp/bitint-1.c              |   98 
 gcc/testsuite/gcc.dg/dfp/bitint-2.c              |   91 
 gcc/testsuite/gcc.dg/dfp/bitint-3.c              |   98 
 gcc/testsuite/gcc.dg/dfp/bitint-4.c              |  156 
 gcc/testsuite/gcc.dg/dfp/bitint-5.c              |  159 
 gcc/testsuite/gcc.dg/dfp/bitint-6.c              |  156 
 gcc/testsuite/gcc.dg/torture/bitint-1.c          |  114 
 gcc/testsuite/gcc.dg/torture/bitint-10.c         |   38 
 gcc/testsuite/gcc.dg/torture/bitint-11.c         |   77 
 gcc/testsuite/gcc.dg/torture/bitint-12.c         |  128 
 gcc/testsuite/gcc.dg/torture/bitint-13.c         |  171 
 gcc/testsuite/gcc.dg/torture/bitint-14.c         |  140 
 gcc/testsuite/gcc.dg/torture/bitint-15.c         |  264 
 gcc/testsuite/gcc.dg/torture/bitint-16.c         |  385 +
 gcc/testsuite/gcc.dg/torture/bitint-17.c         |   82 
 gcc/testsuite/gcc.dg/torture/bitint-18.c         |  117 
 gcc/testsuite/gcc.dg/torture/bitint-19.c         |  190 
 gcc/testsuite/gcc.dg/torture/bitint-2.c          |  118 
 gcc/testsuite/gcc.dg/torture/bitint-20.c         |  190 
 gcc/testsuite/gcc.dg/torture/bitint-21.c         |  282 +
 gcc/testsuite/gcc.dg/torture/bitint-22.c         |  282 +
 gcc/testsuite/gcc.dg/torture/bitint-23.c         |  804 +++
 gcc/testsuite/gcc.dg/torture/bitint-24.c         |  804 +++
 gcc/testsuite/gcc.dg/torture/bitint-25.c         |   91 
 gcc/testsuite/gcc.dg/torture/bitint-26.c         |   66 
 gcc/testsuite/gcc.dg/torture/bitint-27.c         |  373 +
 gcc/testsuite/gcc.dg/torture/bitint-28.c         |   20 
 gcc/testsuite/gcc.dg/torture/bitint-29.c         |   24 
 gcc/testsuite/gcc.dg/torture/bitint-3.c          |  134 
 gcc/testsuite/gcc.dg/torture/bitint-30.c         |   19 
 gcc/testsuite/gcc.dg/torture/bitint-31.c         |   23 
 gcc/testsuite/gcc.dg/torture/bitint-32.c         |   24 
 gcc/testsuite/gcc.dg/torture/bitint-33.c         |   24 
 gcc/testsuite/gcc.dg/torture/bitint-34.c         |   24 
 gcc/testsuite/gcc.dg/torture/bitint-35.c         |   23 
 gcc/testsuite/gcc.dg/torture/bitint-36.c         |   23 
 gcc/testsuite/gcc.dg/torture/bitint-37.c         |   23 
 gcc/testsuite/gcc.dg/torture/bitint-38.c         |   56 
 gcc/testsuite/gcc.dg/torture/bitint-39.c         |   57 
 gcc/testsuite/gcc.dg/torture/bitint-4.c          |  134 
 gcc/testsuite/gcc.dg/torture/bitint-40.c         |   40 
 gcc/testsuite/gcc.dg/torture/bitint-41.c         |   34 
 gcc/testsuite/gcc.dg/torture/bitint-42.c         |  184 
 gcc/testsuite/gcc.dg/torture/bitint-5.c          |  359 +
 gcc/testsuite/gcc.dg/torture/bitint-6.c          |  359 +
 gcc/testsuite/gcc.dg/torture/bitint-7.c          |  386 +
 gcc/testsuite/gcc.dg/torture/bitint-8.c          |  391 +
 gcc/testsuite/gcc.dg/torture/bitint-9.c          |  391 +
 gcc/testsuite/gcc.dg/ubsan/bitint-1.c            |   49 
 gcc/testsuite/gcc.dg/ubsan/bitint-2.c            |   49 
 gcc/testsuite/gcc.dg/ubsan/bitint-3.c            |   45 
 gcc/testsuite/lib/target-supports.exp            |   27 
 gcc/tree-pass.h                                  |    3 
 gcc/tree-pretty-print.cc                         |   23 
 gcc/tree-ssa-coalesce.cc                         |  148 
 gcc/tree-ssa-live.cc                             |    8 
 gcc/tree-ssa-live.h                              |    8 
 gcc/tree-ssa-phiopt.cc                           |    1 
 gcc/tree-ssa-sccvn.cc                            |   11 
 gcc/tree-switch-conversion.cc                    |   71 
 gcc/tree.cc                                      |   67 
 gcc/tree.def                                     |    9 
 gcc/tree.h                                       |   94 
 gcc/typeclass.h                                  |    3 
 gcc/ubsan.cc                                     |   89 
 gcc/ubsan.h                                      |    3 
 gcc/varasm.cc                                    |   55 
 gcc/vr-values.cc                                 |   27 
 libcpp/expr.cc                                   |   29 
 libcpp/include/cpplib.h                          |    1 
 libgcc/Makefile.in                               |    5 
 libgcc/config/aarch64/t-softfp                   |    2 
 libgcc/config/i386/64/t-softfp                   |    2 
 libgcc/config/i386/libgcc-glibc.ver              |   10 
 libgcc/config/i386/t-softfp                      |    5 
 libgcc/config/riscv/t-softfp32                   |    6 
 libgcc/config/rs6000/t-e500v1-fp                 |    2 
 libgcc/config/rs6000/t-e500v2-fp                 |    2 
 libgcc/config/t-softfp                           |   12 
 libgcc/config/t-softfp-sfdftf                    |    1 
 libgcc/config/t-softfp-tf                        |    1 
 libgcc/libgcc-std.ver.in                         |   10 
 libgcc/libgcc2.c                                 |  681 ++
 libgcc/libgcc2.h                                 |   15 
 libgcc/soft-fp/bitint.h                          |  329 +
 libgcc/soft-fp/bitintpow10.c                     |  132 
 libgcc/soft-fp/bitintpow10.h                     | 4947 ++++++++++++++++++
 libgcc/soft-fp/fixddbitint.c                     |  205 
 libgcc/soft-fp/fixdfbitint.c                     |   71 
 libgcc/soft-fp/fixsdbitint.c                     |  196 
 libgcc/soft-fp/fixsfbitint.c                     |   71 
 libgcc/soft-fp/fixtdbitint.c                     |  242 
 libgcc/soft-fp/fixtfbitint.c                     |   81 
 libgcc/soft-fp/fixxfbitint.c                     |   82 
 libgcc/soft-fp/floatbitintbf.c                   |   59 
 libgcc/soft-fp/floatbitintdd.c                   |  264 
 libgcc/soft-fp/floatbitintdf.c                   |   64 
 libgcc/soft-fp/floatbitinthf.c                   |   59 
 libgcc/soft-fp/floatbitintsd.c                   |  235 
 libgcc/soft-fp/floatbitintsf.c                   |   59 
 libgcc/soft-fp/floatbitinttd.c                   |  271 +
 libgcc/soft-fp/floatbitinttf.c                   |   73 
 libgcc/soft-fp/floatbitintxf.c                   |   74 
 libgcc/soft-fp/op-common.h                       |   31 
 163 files changed, 26268 insertions(+), 220 deletions(-)

        Jakub
