The following test case attempts to set the floating-point control register and then execute a floating-point operation. However, it doesn't work as expected with -Os, because GCC hoists the multiply above the write to the FP control register.
Since there is no register dependence between __set_FPSCR and the multiply, GCC is free to hoist it. There is a structural dependence (the multiply must run under the new FPSCR setting), but it can't be expressed in GCC's semantics. Would it be possible to provide some kind of barrier that prevents such hoisting?

    int ftz;

    float foo(float a, float b)
    {
        float r;
        unsigned fpscr_orig = __get_FPSCR();

        if (ftz) {
            __set_FPSCR(fpscr_orig | 0x1000000);
            r = a * b;
        } else {
            __set_FPSCR(fpscr_orig & ~0x1000000);
            r = a * b;
        }
        __set_FPSCR(fpscr_orig);
        return r;
    }
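For illustration, here is a rough sketch of the kind of barrier I have in mind, built from an empty volatile asm that claims to modify the multiply's operands and its result. FP_BARRIER, get_fp_control and set_fp_control are names made up for this sketch; the last two are stand-ins for __get_FPSCR/__set_FPSCR so that the snippet compiles on any target. It also assumes the real FPSCR accessors are themselves volatile, so the compiler keeps them in order relative to the volatile asm. I have not verified that this stops the hoist in every GCC version.

```c
/* Hypothetical stand-ins for __get_FPSCR/__set_FPSCR so this builds anywhere.
 * On the real target, replace them with the actual accessors. */
static volatile unsigned fake_fpscr;
static unsigned get_fp_control(void) { return fake_fpscr; }
static void set_fp_control(unsigned v) { fake_fpscr = v; }

/* Hypothetical helper: an empty volatile asm that claims to read and write x.
 * Anything computed from x afterwards depends on the asm's output, so it
 * cannot be moved above it. "+m" forces x through memory; on ARM a VFP
 * register constraint such as "+w" may be cheaper (assumption, untested). */
#define FP_BARRIER(x) __asm__ volatile("" : "+m"(x))

int ftz;

float foo(float a, float b)
{
    float r;
    unsigned fpscr_orig = get_fp_control();

    if (ftz)
        set_fp_control(fpscr_orig | 0x1000000u);
    else
        set_fp_control(fpscr_orig & ~0x1000000u);

    /* The multiply now depends on asm outputs placed after the mode switch. */
    FP_BARRIER(a);
    FP_BARRIER(b);
    r = a * b;
    /* Keep the multiply from sinking below the restore. */
    FP_BARRIER(r);

    set_fp_control(fpscr_orig);
    return r;
}
```

With the stand-ins, foo(2.0f, 3.0f) just returns 6.0f and leaves the fake register at its original value; the interesting part is only visible in the generated code on the real target. A memory clobber alone ("memory") would probably not help here, since the multiply has no memory operand.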