https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97417
--- Comment #10 from Levy <admin at levyhsu dot com> ---
Created attachment 49500
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49500&action=edit
Optimzation Patch for QI/HImode(32bit) and QI/HI/SImode(64bit)

Proposing a second patch for QI/HImode (32-bit) and QI/HI/SImode (64-bit),
covering both zero-extend and subreg.

Tested with make report-gcc. Two cases failed: shorten-memrefs-5.c and
shorten-memrefs-6.c. Both failed because of this DejaGnu rule:

    /* { dg-final { scan-assembler "load1r:\n\taddi" } } */

But look at the assembly generated for the same input, which appears in both
files:

    int load1r (int *array)
    {
      int a = 0;
      a += array[200];
      a += array[201];
      a += array[202];
      a += array[203];
      return a;
    }

The current GCC RISC-V port produces:

    load1r:
            addi    a5,a0,768
            lw      a4,36(a5)
            lw      a0,32(a5)
            addw    a0,a0,a4
            lw      a4,40(a5)
            addw    a4,a4,a0
            lw      a0,44(a5)
            addw    a0,a0,a4
            ret

The patched port produces:

    load1r:
            lwu     a4,800(a0)
            lwu     a5,804(a0)
            addw    a5,a5,a4
            lwu     a4,808(a0)
            lwu     a0,812(a0)
            addw    a5,a5,a4
            addw    a0,a5,a0
            ret

The patched output no longer starts with addi, so the scan-assembler pattern
does not match. It is one instruction shorter (8 instead of 9), and it looks
correct and even faster to me.

However, we still need a test for sign extension and a performance check.
A test case and performance evaluation will be provided later (hopefully).
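
As a possible starting point for the sign-extension test mentioned above, here
is a minimal sketch. It is not part of the attached patch. It reuses load1r
from the two test files; the helpers load1r_widened, run_load1r and
check_load1r_sign_extension are hypothetical names introduced only for
illustration. The idea is to feed negative values through the loads and widen
the int result to long, so that on a 64-bit target a wrong extension in the
upper half of the return register would show up as a wrong value.

```c
#include <assert.h>
#include <limits.h>

/* Same function as in shorten-memrefs-5.c and shorten-memrefs-6.c.
   noinline keeps the loads in a separate function body. */
__attribute__ ((noinline)) int
load1r (int *array)
{
  int a = 0;
  a += array[200];
  a += array[201];
  a += array[202];
  a += array[203];
  return a;
}

/* Hypothetical helper: widen the int result to long so the extension
   of the returned value is observable where long is wider than int. */
__attribute__ ((noinline)) long
load1r_widened (int *array)
{
  return load1r (array);
}

/* Hypothetical helper: place four values at the indices load1r reads. */
long
run_load1r (int v200, int v201, int v202, int v203)
{
  static int buf[204];
  buf[200] = v200;
  buf[201] = v201;
  buf[202] = v202;
  buf[203] = v203;
  return load1r_widened (buf);
}

/* Hypothetical driver: negative inputs must yield negative results. */
void
check_load1r_sign_extension (void)
{
  assert (run_load1r (1, 2, 3, 4) == 10L);
  assert (run_load1r (-1, -1, -1, -1) == -4L);
  assert (run_load1r (INT_MIN, 0, 0, 1) == (long) INT_MIN + 1);
}
```

Calling check_load1r_sign_extension from main in an execution test would
complement the scan-assembler check; whether this is the right shape for the
testsuite is an open question.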