https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #56 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 55244
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55244&action=edit
gcc14-bitint-wip-inc.patch

Incremental patch on top of the above patch.

I've tried to make some progress and implement the simplest large _BitInt
cases, &/|/^/~, but ran into a problem there: both BIT_FIELD_REF and
BIT_INSERT_EXPR disallow operating on non-mode precisions, while for _BitInt
I think it would be really useful to use them on the large/huge _BitInts
(which I will most likely force into memory during expansion).  Sure, for
huge _BitInts, what is handled in the loop will use either ARRAY_REF on
VIEW_CONVERT_EXPR for operands or TARGET_MEM_REFs on VAR_DECLs for the
results, but even for those there is in some cases the partial most
significant limb that needs to be handled separately.
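
To illustrate the partial most significant limb issue at the source level,
here is a rough sketch of ~ on an unsigned value of PREC bits kept as 64-bit
limbs, least significant limb first.  The helper name, the limb layout and
the choice to keep the padding bits zero are assumptions made only for this
example, not what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIMB_BITS 64

/* Hypothetical helper: r = ~a for an unsigned PREC-bit value stored as
   64-bit limbs, least significant limb first.  Assumes the padding bits
   above PREC in the top limb are kept zero.  */
static void
bitint_not_sketch (uint64_t *r, const uint64_t *a, unsigned prec)
{
  size_t full = prec / LIMB_BITS;   /* limbs that are entirely value bits */
  unsigned rem = prec % LIMB_BITS;  /* value bits in the partial top limb */

  assert (prec > 0);

  /* The part a loop can handle: whole limbs.  */
  for (size_t i = 0; i < full; i++)
    r[i] = ~a[i];

  /* The partial most significant limb is handled separately, so that
     only the low REM bits are inverted.  */
  if (rem)
    r[full] = ~a[full] & ((UINT64_C (1) << rem) - 1);
}
```

For PREC == 135 this gives two full limbs in the loop and a 7-bit top limb
outside of it.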

So, do you think it is ok to make an exception for
BIT_FIELD_REF/BIT_INSERT_EXPR and allow them on non-mode precision
BITINT_TYPEs (the incremental patch enables that) plus handle it during the
expansion?

Another thing: I've started to think about PLUS_EXPR/MINUS_EXPR.  We have the
__builtin_ia32_addcarryx_u64/__builtin_ia32_sbb_u64 builtins on x86-64, but
from what I can see we don't really pattern recognize even a simple
add + adc.

Given:
void
foo (unsigned long *p, unsigned long *q, unsigned long *r)
{
  unsigned long p0 = p[0], q0 = q[0];
  unsigned long p1 = p[1], q1 = q[1];
  unsigned long r0 = p0 + q0;
  unsigned long r1 = p1 + q1 + (r0 < p0);
  r[0] = r0;
  r[1] = r1;
}

void
bar (unsigned long *p, unsigned long *q, unsigned long *r)
{
  unsigned long p0 = p[0], q0 = q[0];
  unsigned long p1 = p[1], q1 = q[1];
  unsigned long p2 = p[2], q2 = q[2];
  unsigned long r0 = p0 + q0;
  unsigned long r1 = p1 + q1 + (r0 < p0);
  unsigned long r2 = p2 + q2 + (r1 < p1 || r1 < q1);
  r[0] = r0;
  r[1] = r1;
  r[2] = r2;
}

llvm seems to pattern recognize foo, but doesn't pattern recognize bar as
add; adc; adc (is that actually correct C for that though?).
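
On the parenthetical question, a quick check suggests the r1 < p1 || r1 < q1
test in bar misses one case: with p1 and q1 both all ones and a carry in of
1, the sum wraps around to all ones again, so neither comparison is true
although there is a carry out.  A small sketch of that check follows; the
function names are made up for the example, and __builtin_add_overflow is
used only as a convenient way to get the carry of each partial addition.

```c
#include <assert.h>

/* Carry out of a + b + cin, computed the way bar does it.  */
static int
carry_as_in_bar (unsigned long a, unsigned long b, unsigned long cin)
{
  unsigned long r = a + b + cin;
  return r < a || r < b;
}

/* Carry out of a + b + cin, taken from the two partial additions.
   At most one of them can overflow when cin is 0 or 1.  */
static int
carry_exact (unsigned long a, unsigned long b, unsigned long cin)
{
  unsigned long t, r;
  int c1 = __builtin_add_overflow (a, b, &t);
  int c2 = __builtin_add_overflow (t, cin, &r);
  return c1 | c2;
}

/* The case where the two disagree.  */
static void
bar_carry_counterexample (void)
{
  assert (carry_as_in_bar (~0UL, ~0UL, 1) == 0);  /* carry is missed */
  assert (carry_exact (~0UL, ~0UL, 1) == 1);
}
```

For every other input I tried by hand the two agree, so bar is almost, but
not quite, a three limb addition.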

So, shouldn't we implement clang's
https://clang.llvm.org/docs/LanguageExtensions.html#multiprecision-arithmetic-builtins
builtins (at least the __builtin_{add,sub}c{,l,ll} builtins), lower them into
ifns early (similarly to .{ADD,SUB}_OVERFLOW returning complex integer with 2
returns) and add optabs so that targets can implement those efficiently?
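
For reference, a minimal sketch of what such a builtin would compute, written
as a portable stand-in with the same argument order as clang documents for
__builtin_addcl (x, y, carryin, pointer to carryout).  The names addcl_sketch
and add_limbs_sketch are hypothetical, and this says nothing about how the
ifn or the optab would actually look.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in shaped like clang's __builtin_addcl: returns
   x + y + carryin and stores the carry out (0 or 1) in *carryout.  */
static unsigned long
addcl_sketch (unsigned long x, unsigned long y, unsigned long carryin,
              unsigned long *carryout)
{
  unsigned long t, r;
  unsigned long c1 = __builtin_add_overflow (x, y, &t);
  unsigned long c2 = __builtin_add_overflow (t, carryin, &r);
  *carryout = c1 | c2;
  return r;
}

/* r = p + q over n limbs, least significant limb first.  Returns the
   carry out of the most significant limb.  */
static unsigned long
add_limbs_sketch (unsigned long *r, const unsigned long *p,
                  const unsigned long *q, size_t n)
{
  unsigned long carry = 0;

  assert (n > 0);
  for (size_t i = 0; i < n; i++)
    r[i] = addcl_sketch (p[i], q[i], carry, &carry);
  return carry;
}
```

With n == 3 this is the chain one would hope to see turned into
add; adc; adc.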
