https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989
--- Comment #56 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 55244
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55244&action=edit
gcc14-bitint-wip-inc.patch

Incremental patch on top of the above patch.

I've tried to make some progress and implement the simplest large _BitInt cases, &/|/^/~, but ran into a problem there: both BIT_FIELD_REF and BIT_INSERT_EXPR disallow operating on non-mode precisions, while for _BitInt I think it would be really useful to use them on the large/huge _BitInts (which I will most likely force into memory during expansion). Sure, for huge _BitInts, what is handled in the loop will use either ARRAY_REF on VIEW_CONVERT_EXPR for operands or TARGET_MEM_REFs on VAR_DECLs for the results, but even for those there is in some cases the partial most significant limb, which needs to be handled separately.
So, do you think it is ok to make an exception for BIT_FIELD_REF/BIT_INSERT_EXPR and allow them on non-mode precision BITINT_TYPEs (the incremental patch enables that) plus handle it during the expansion?

Another thing: I started to think about PLUS_EXPR/MINUS_EXPR. We have the __builtin_ia32_addcarryx_u64/__builtin_ia32_sbb_u64 builtins on x86-64, but from what I can see we don't really pattern recognize even a simple add + adc.
Given:

void
foo (unsigned long *p, unsigned long *q, unsigned long *r)
{
  unsigned long p0 = p[0], q0 = q[0];
  unsigned long p1 = p[1], q1 = q[1];
  unsigned long r0 = p0 + q0;
  unsigned long r1 = p1 + q1 + (r0 < p0);
  r[0] = r0;
  r[1] = r1;
}

void
bar (unsigned long *p, unsigned long *q, unsigned long *r)
{
  unsigned long p0 = p[0], q0 = q[0];
  unsigned long p1 = p[1], q1 = q[1];
  unsigned long p2 = p[2], q2 = q[2];
  unsigned long r0 = p0 + q0;
  unsigned long r1 = p1 + q1 + (r0 < p0);
  unsigned long r2 = p2 + q2 + (r1 < p1 || r1 < q1);
  r[0] = r0;
  r[1] = r1;
  r[2] = r2;
}

llvm seems to pattern recognize foo, but doesn't pattern recognize bar as add; adc; adc (is that actually correct C for that though?).
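On the question of whether the carry expression in bar is correct C for add; adc; adc: a minimal sketch suggests it is not in one corner case. When both addends are all ones and the incoming carry is 1, the wrapped sum equals both addends, so neither comparison fires even though a carry was produced. The function names below are hypothetical and only illustrate the arithmetic; the carry-in is assumed to be 0 or 1.

```c
/* Carry-out as written for r2 in bar(): compare the final sum
   against each addend.  Hypothetical helper name. */
int
carry_out_single_compare (unsigned long p, unsigned long q,
                          unsigned long carry_in)
{
  unsigned long r = p + q + carry_in;
  return r < p || r < q;
}

/* Carry-out computed in two steps: one check for p + q and one
   for adding the incoming carry.  At most one of the two steps
   can wrap, so OR-ing the flags gives the true carry-out.
   Hypothetical helper name. */
int
carry_out_two_step (unsigned long p, unsigned long q,
                    unsigned long carry_in)
{
  unsigned long s = p + q;
  int c1 = s < p;                 /* p + q wrapped */
  unsigned long r = s + carry_in;
  int c2 = r < s;                 /* adding the carry wrapped */
  return c1 | c2;
}
```

With p = q = ~0UL and carry_in = 1 the sum wraps to ~0UL, so the single-compare form reports no carry while the two-step form reports one. In bar this case is reachable with p[0] = ~0UL, q[0] = 1 and p[1] = q[1] = ~0UL.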
So, shouldn't we implement clang's https://clang.llvm.org/docs/LanguageExtensions.html#multiprecision-arithmetic-builtins builtins (at least the __builtin_{add,sub}c{,l,ll} builtins), lower them into ifns early (similarly to .{ADD,SUB}_OVERFLOW returning complex integer with 2 returns) and add optabs so that targets can implement those efficiently?
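For reference, a rough sketch of what such a builtin computes, written as portable C. addc_ul is a hypothetical stand-in that mirrors the argument order clang documents for __builtin_addcl (x, y, carry-in, pointer to carry-out); it is not the builtin itself and says nothing about how GCC would lower it. add_limbs is likewise a hypothetical helper showing the intended add; adc; adc chain over limbs, least significant limb first, with the carry-in assumed to be 0 or 1.

```c
/* Hypothetical stand-in with the same shape as clang's
   __builtin_addcl: returns x + y + carry_in and stores the
   carry-out (0 or 1) through carry_out. */
unsigned long
addc_ul (unsigned long x, unsigned long y, unsigned long carry_in,
         unsigned long *carry_out)
{
  unsigned long s = x + y;
  unsigned long c1 = s < x;       /* x + y wrapped */
  unsigned long r = s + carry_in;
  unsigned long c2 = r < s;       /* adding the carry wrapped */
  *carry_out = c1 | c2;
  return r;
}

/* Add two n-limb numbers, least significant limb first, and
   return the carry out of the most significant limb. */
unsigned long
add_limbs (const unsigned long *p, const unsigned long *q,
           unsigned long *r, unsigned n)
{
  unsigned long carry = 0;
  for (unsigned i = 0; i < n; i++)
    r[i] = addc_ul (p[i], q[i], carry, &carry);
  return carry;
}
```

Written this way, each loop iteration corresponds to one add-with-carry step, which is the pattern an optab could let targets expand to a single adc instruction.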