Yury Gribov <y.gri...@samsung.com> writes:
> Richard Biener wrote:
>>>> If this behavior is not intended, what would be the best way to fix
>>>> performance? I could teach GCC to not remove constant RTXs in
>>>> flush_hash_table() but this is probably very naive and won't cover some
>>>> corner-cases.
>>>
>>> That could be a good starting point though.
>>
>> Though with modifying "machine state" you can modify constants as well, no?
>
> Valid point but this would mean relying on compiler to always load all
> constants from memory (instead of, say, generating them via movhi/movlo)
> for a piece of code which looks extremely unstable.

Right. And constant rtx codes have mode-independent semantics.
(const_int 1) is always 1, whatever a volatile asm does. Same for
const_double, symbol_ref, label_ref, etc. If a constant load is
implemented using some mode-dependent operation then it would need to be
represented as something like an unspec instead. But even then, the
result would usually be annotated with a REG_EQUAL note giving the value
of the final register result. It should be perfectly OK to reuse that
register after a volatile asm if the value in the REG_EQUAL note is
needed again.

> What is the general attitude towards volatile asm? Are people interested
> in making it more defined/performant or should we just leave this can of
> worms as is? I can try to improve generated code but my patches will be
> doomed if there is no consensus on what volatile asm actually means...

I think part of the problem is that some parts of GCC (like the one you
noted) are far more conservative than others. E.g. take:

    void foo (int x, int *y)
    {
      y[0] = x + 1;
      asm volatile ("# asm");
      y[1] = x + 1;
    }

The extra-paranoid check you pointed out means that we assume that x + 1
is no longer available after the asm for rtx-level CSE, but take the
opposite view for tree-level CSE, which happily optimises away the
second +.

Some places were (maybe still are) worried that volatile asms could
clobber any register they like. But the register allocator assumes that
registers are preserved across volatile asms unless explicitly clobbered.
And AFAIK it always has. So in the above example we get:

            addl    $1, %edi
            movl    %edi, (%rsi)
    #APP
    # 4 "/tmp/foo.c" 1
            # asm
    # 0 "" 2
    #NO_APP
            movl    %edi, 4(%rsi)
            ret

with %edi being live across the asm.

We do nothing this draconian for a normal function call, which could
easily use a volatile asm internally. IMO anything that isn't flushed
for a call shouldn't be flushed for a volatile asm either.

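For anyone who wants to try the foo example above on their own build, a
minimal sketch along the following lines should do. Two things here are
assumptions rather than part of the original test case: the asm string is
left empty so that the file assembles on targets where "#" is not a
comment character, and foo_selfcheck() is a hypothetical helper added only
to confirm that both stores see the same value.

```c
/* Same shape as the example above: x + 1 is needed on both sides of a
   volatile asm that declares no outputs, inputs or clobbers.
   Assumption: the asm string is empty (the original used "# asm").  */
void
foo (int x, int *y)
{
  y[0] = x + 1;
  __asm__ __volatile__ ("");
  y[1] = x + 1;
}

/* Hypothetical helper, not in the original: returns 1 if both stores
   wrote x + 1, whichever way the compiler chose to produce the value.  */
int
foo_selfcheck (void)
{
  int y[2] = { 0, 0 };
  foo (41, y);
  return y[0] == 42 && y[1] == 42;
}
```

Compiling this with "gcc -O2 -S" and reading the generated assembly shows
whether the second store reuses the register holding x + 1 or recomputes
the addition. Which of the two a given GCC version produces is not assumed
here.
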
One of the big grey areas is what should happen for floating-point ops
that depend on the current rounding mode. That isn't really modelled
properly yet though. Again, it affects calls as well as volatile asms.

Thanks,
Richard