> We want to make sure that GCC puts things in the right order. I
> suppose that even a memory clobber is insufficient here, though.

amdgpu worked around that by using a noinline function:

https://github.com/torvalds/linux/commit/59dfb0c64d3853d20dc84f4561f28d4f5a2ddc7d#diff-a82b8ab0e6b4f9abfd3344d1427d765f

If there is something that would help, it would have to be an attribute
added to kernel_fpu_begin() in the header. Any barriers inside the
function will be invisible to code generation in other compilation
units.

-- 
Petteri

