On 1/15/24 06:34, Richard Biener wrote:
When the x86 backend generates code for cpymem with the rep_8byte strategy, for the 8-byte-aligned main rep movq it needs to compute an adjusted pointer to the source after doing a prologue aligning the destination. It computes that via src_ptr + (dest_ptr - orig_dest_ptr), which is perfectly fine. On RTL this is then:

    8: r134:DI=const(`g'+0x44)
    9: {r133:DI=frame:DI-0x4c;clobber flags:CC;}
      REG_UNUSED flags:CC
   56: r129:DI=const(`g'+0x4c)
   57: {r129:DI=r129:DI&0xfffffffffffffff8;clobber flags:CC;}
      REG_UNUSED flags:CC
      REG_EQUAL const(`g'+0x4c)&0xfffffffffffffff8
   58: {r118:DI=r134:DI-r129:DI;clobber flags:CC;}
      REG_DEAD r134:DI
      REG_UNUSED flags:CC
      REG_EQUAL const(`g'+0x44)-r129:DI
   59: {r119:DI=r133:DI-r118:DI;clobber flags:CC;}
      REG_DEAD r133:DI
      REG_UNUSED flags:CC

but as written find_base_term happily picks the first candidate it finds for the MINUS, which means it picks const(`g') rather than the correct frame:DI. The way find_base_term (but also the unfixed find_base_value used by init_alias_analysis to initialize REG_BASE_VALUE) performs pointer analysis isn't sound.

The following restricts the handling of multi-operand operations to the case where we know only one operand can be a pointer. This for example causes gcc.dg/tree-ssa/pr94969.c to miss some RTL PRE (I've opened PR113395 for this).

A more drastic patch, removing base_alias_check, results in only gcc.dg/guality/pr41447-1.c regressing (so testsuite coverage is bad). I've looked at gcc.dg/tree-ssa tests and mostly scheduling changes are present; the cc1plus .text size is only 230 bytes worse. With the less drastic patch below most scheduling changes are gone.

x86_64 might not be the very best target to test for impact, but test coverage on other targets is unlikely to be very much better.

Bootstrapped and tested on x86_64-unknown-linux-gnu (together with 2/2).

Jeff, can you maybe throw this on your tester?

Jakub, you did the PR64025 fix which was for a similar issue.
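To illustrate the first-candidate problem described above, here is a minimal sketch in plain C. It is a toy model and not GCC code: the expression kinds, the struct, and both search functions are hypothetical stand-ins for RTL and find_base_term. The operands of the final addition are ordered so that the symbol-based difference is visited first, and the restricted search only follows an operand when the other one is a constant integer; both choices are assumptions made for illustration and may not match how the actual patch or the actual lookup path works.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression kinds standing in for RTL codes.  */
enum kind { K_SYMBOL, K_FRAME, K_CONST_INT, K_PLUS, K_MINUS };

struct expr
{
  enum kind kind;
  const struct expr *op0;   /* Operands, used by K_PLUS and K_MINUS.  */
  const struct expr *op1;
};

/* Unsound search: return the first base found in either operand.  */
static const struct expr *
first_candidate_base (const struct expr *x)
{
  const struct expr *base;

  if (x == NULL)
    return NULL;
  switch (x->kind)
    {
    case K_SYMBOL:
    case K_FRAME:
      return x;
    case K_PLUS:
    case K_MINUS:
      base = first_candidate_base (x->op0);
      return base != NULL ? base : first_candidate_base (x->op1);
    default:
      return NULL;
    }
}

static int
is_const_int (const struct expr *x)
{
  return x != NULL && x->kind == K_CONST_INT;
}

/* Restricted search (assumed rule): only look through a binary
   operation when the other operand is a constant integer, so at
   most one operand can be a pointer.  Otherwise give up.  */
static const struct expr *
restricted_base (const struct expr *x)
{
  if (x == NULL)
    return NULL;
  switch (x->kind)
    {
    case K_SYMBOL:
    case K_FRAME:
      return x;
    case K_PLUS:
      if (is_const_int (x->op1))
        return restricted_base (x->op0);
      if (is_const_int (x->op0))
        return restricted_base (x->op1);
      return NULL;
    case K_MINUS:
      if (is_const_int (x->op1))
        return restricted_base (x->op0);
      return NULL;
    default:
      return NULL;
    }
}

/* src_ptr + (dest_ptr - orig_dest_ptr), with dest based on `g' and
   src based on the frame.  Constants and the alignment mask are
   elided; only the shape of the expression matters here.  */
static const struct expr sym_g = { K_SYMBOL, NULL, NULL };
static const struct expr frame = { K_FRAME, NULL, NULL };
static const struct expr cst = { K_CONST_INT, NULL, NULL };
static const struct expr orig_dest = { K_PLUS, &sym_g, &cst };
static const struct expr dest = { K_PLUS, &sym_g, &cst };
static const struct expr src = { K_MINUS, &frame, &cst };
static const struct expr delta = { K_MINUS, &dest, &orig_dest };
static const struct expr adjusted_src = { K_PLUS, &delta, &src };

/* Checks the behaviour described in the comments above.  */
static void
check_example (void)
{
  /* The first-candidate search reports `g' for a frame-based pointer.  */
  assert (first_candidate_base (&adjusted_src) == &sym_g);
  /* The restricted search refuses to guess when both operands may
     carry a base.  */
  assert (restricted_base (&adjusted_src) == NULL);
  /* Pointer plus or minus a constant is still resolved.  */
  assert (restricted_base (&src) == &frame);
}
```

Calling check_example () from a main function exercises all three cases. The point of the sketch is only that giving up (returning no base) is the conservative answer when more than one operand could be a pointer, at the cost of some missed disambiguation.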
No issues across the cross compilers with those two patches.

Jeff