https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93334

            Bug ID: 93334
           Summary: -O3 generates useless code checking for overlapping
                    memset ?
           Product: gcc
           Version: 9.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nathanael.schaeffer at gmail dot com
  Target Milestone: ---

It seems that zeroing two arrays in the same loop results in poor code being
generated at -O3.
If I understand it correctly, the generated code first checks at run time
whether the arrays overlap. If they do, it falls back to simple loops instead
of calls to memset.
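
To illustrate my reading of the assembly, here is a hand-written C sketch of
the structure for the two-array case. It is not GCC output: the function name
zero_two_versioned and the exact form of the overlap test are my own
assumptions, and the real code may differ in detail.

```c
#include <string.h>

/* Hand-written sketch, NOT compiler output: a run-time overlap test
   selects between a memset path and a plain store loop. */
void zero_two_versioned(long l, double *mem, long ofs2)
{
    if (l <= 0)
        return;

    double *a = mem;
    double *b = mem + ofs2;

    /* [a, a+l) and [b, b+l) are disjoint when one range ends
       before the other one starts. */
    if (a + l <= b || b + l <= a) {
        memset(a, 0, (size_t)l * sizeof(double));
        memset(b, 0, (size_t)l * sizeof(double));
    } else {
        /* Fallback: the original loop, kept as individual stores. */
        for (long k = 0; k < l; k++) {
            a[k] = 0.0;
            b[k] = 0.0;
        }
    }
}
```

Both branches leave memory in the same state, since every store writes the
same value, which is why the check looks unnecessary to me.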

I wonder why overlapping memset calls would be an issue?
Is this some inherited behaviour from dealing with memcpy?

When four arrays are zeroed together, about 40 instructions are generated to
check for mutual overlap. This does not seem to be necessary.
Other compilers (clang, icc) don't do that.

See issue here, with assembly generated:
https://godbolt.org/z/SSWVhm

And I copy the code below for reference too:

void test_simple_code(long l, double* mem, long ofs2) {
        for (long k=0; k<l; k++) {
                        mem[k] = 0.0;
                        mem[ofs2 +k] = 0.0;
        }
}

void test_crazy_code(long l, double* mem, long ofs2, long ofs3, long ofs4) {
        for (long k=0; k<l; k++) {
                        mem[k] = 0.0;
                        mem[ofs2 +k] = 0.0;
                        mem[ofs3 +k] = 0.0;
                        mem[ofs4 +k] = 0.0;
        }
}

void test_ok_code(long l, double* mem, long ofs2, long ofs3, long ofs4) {
        for (long k=0; k<l; k++)
                        mem[k] = 0.0;
        for (long k=0; k<l; k++)
                        mem[ofs2 +k] = 0.0;
        for (long k=0; k<l; k++)
                        mem[ofs3 +k] = 0.0;
        for (long k=0; k<l; k++)
                        mem[ofs4 +k] = 0.0;
}
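
For comparison, the same effect can be written with explicit memset calls.
This is only a sketch of a possible workaround: the name test_explicit_memset
is made up for this report, and it assumes that 0.0 is represented by all-zero
bytes (true for IEEE 754 doubles) and that l * sizeof(double) fits in size_t.

```c
#include <string.h>

/* Sketch of a workaround: call memset directly for each region.
   Assumes 0.0 has an all-zero byte representation (IEEE 754). */
void test_explicit_memset(long l, double *mem, long ofs2, long ofs3, long ofs4)
{
    if (l <= 0)
        return;   /* the loops above do nothing when l <= 0 */

    size_t n = (size_t)l * sizeof(double);

    /* The regions may overlap freely: each call only writes zeros. */
    memset(mem,        0, n);
    memset(mem + ofs2, 0, n);
    memset(mem + ofs3, 0, n);
    memset(mem + ofs4, 0, n);
}
```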
