The following test case sets the floating-point control register
(FPSCR) and then executes a floating-point operation. However, it does
not work as expected with -Os, because GCC hoists the multiply above
the write to the FP control register.

Because there is no register dependence between __set_FPSCR and the
multiply, the hoisting is allowed. There is indeed a structural
dependence (the result of the multiply depends on the FPSCR contents),
but it cannot be expressed in GCC's semantics.

How about providing some kind of barrier that prevents such hoisting
from happening?

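int ftz;
float foo(float a, float b)
{
    float r;
    unsigned fpscr_orig = __get_FPSCR();
    if (ftz) {
        /* Set FPSCR bit 24 (flush-to-zero) before the multiply. */
        __set_FPSCR(fpscr_orig | 0x1000000);
        r = a * b;
    }
    else {
        /* Clear FPSCR bit 24 before the multiply. */
        __set_FPSCR(fpscr_orig & ~0x1000000);
        r = a * b;
    }
    /* Restore the original FPSCR value. */
    __set_FPSCR(fpscr_orig);
    return r;
}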
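
To make the barrier idea concrete, here is a minimal sketch of one
workaround that is possible today with GCC/Clang extended asm. It is
not an existing GCC feature and not necessarily what a proper fix would
look like. FP_BARRIER, FPSCR_FZ, foo_barrier, get_fpscr, set_fpscr and
fake_fpscr are hypothetical names introduced for illustration. The two
accessor functions are host-side stand-ins for __get_FPSCR and
__set_FPSCR so that the sketch builds anywhere. The sketch also assumes
that on a real target the FPSCR write is itself a volatile operation,
so the compiler keeps it ordered with respect to the volatile asm.

```c
/* Hypothetical stand-ins for __get_FPSCR/__set_FPSCR so the sketch
   builds on any host. On a real target, use the real intrinsics. */
static volatile unsigned fake_fpscr;

static inline unsigned get_fpscr(void) { return fake_fpscr; }
static inline void set_fpscr(unsigned v) { fake_fpscr = v; }

/* Empty volatile asm that claims to read and modify x. The compiler
   must treat the value of x after this point as produced by the asm,
   so code consuming x cannot be moved above it. (Assumption: GCC or
   Clang extended asm syntax.) */
#define FP_BARRIER(x) __asm__ volatile("" : "+m"(x) : : "memory")

#define FPSCR_FZ 0x1000000u   /* same bit as in the test case above */

int ftz;

float foo_barrier(float a, float b)
{
    float r;
    unsigned fpscr_orig = get_fpscr();

    set_fpscr(ftz ? (fpscr_orig | FPSCR_FZ) : (fpscr_orig & ~FPSCR_FZ));
    FP_BARRIER(a);            /* multiply now depends on this asm */
    r = a * b;
    FP_BARRIER(r);            /* product must exist before the restore */
    set_fpscr(fpscr_orig);
    return r;
}
```

The "+m" constraint forces the operand through memory, which costs a
store and a load. A target-specific register constraint would avoid
that, at the price of portability. The second barrier is there so that
the multiply cannot sink below the FPSCR restore either.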
