Richard Sandiford wrote: > Georg-Johann Lay <a...@gjlay.de> writes: >> This patch fixes PR51374 by more strictly updating mem_last_set. >> Sloppy handling of mem_last_set can lead to error in volatile correctness >> because combine then is allowed to drag one volatile access over an other >> volatile thing (volatile access, asm volatile, unspec_volatile...). >> >> As explained in the source comment, any volatile might change memory, even a >> volatile read. > > This looks too broad to me. Volatile MEMs don't usually alias > non-volatile ones (so for example, reads from volatile memory shouldn't > affect reads from non-volatile memory).
The patch tried to fix the issue. If there is less invasive solutions, that's always preferred, of course. > What do you think about instead changing: > > /* If there are any volatile insns between INSN and I3, reject, because > they might affect machine state. */ > > for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p)) > if (INSN_P (p) && p != succ && p != succ2 && volatile_insn_p (PATTERN > (p))) > return 0; > > to: > > /* If INSN contains volatile references (specifically volatile MEMs), > we cannot combine across any other volatile references. > Even if INSN doesn't contain volatile references, any intervening > volatile insn might affect machine state. */ > > pred = volatile_refs_p (insn) ? volatile_refs_p : volatile_insn_p; > for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p)) > if (NONDEBUG_INSN_P (p) && p != succ && p != succ2 && pred (PATTERN (p))) > return 0; This would still allow a volatile_refs_p to cross a volatile_insn_p? And some lines above there is this: /* If INSN contains anything volatile, or is an `asm' (whether volatile or not), reject, unless nothing volatile comes between it and I3 */ if (GET_CODE (src) == ASM_OPERANDS || volatile_refs_p (src)) { /* Make sure neither succ nor succ2 contains a volatile reference. */ if (succ2 != 0 && volatile_refs_p (PATTERN (succ2))) return 0; if (succ != 0 && volatile_refs_p (PATTERN (succ))) return 0; /* We'll check insns between INSN and I3 below. */ } Would our change (whatever it will look like) make that code superfluous? > As you say in the PR notes, we've already handled volatile_insn_p (insn) > by this stage, so the new loop is handling: > > volatile_refs_p (insn) && !volatile_insn_p (insn) There is this part of an if-clause in can_combine_p: /* Make sure that the value that is to be substituted for the register does not use any registers whose values alter in between. However, If the insns are adjacent, a use can't cross a set even though we think it might (this can happen for a sequence of insns each setting the same destination; last_set of that register might point to a NOTE). If INSN has a REG_EQUIV note, the register is always equivalent to the memory so the substitution is valid even if there are intervening stores. Also, don't move a volatile asm or UNSPEC_VOLATILE across any other insns. */ || (! all_adjacent && (((!MEM_P (src) || ! find_reg_note (insn, REG_EQUIV, src)) && use_crosses_set_p (src, DF_INSN_LUID (insn))) || (GET_CODE (src) == ASM_OPERANDS && MEM_VOLATILE_P (src)) || GET_CODE (src) == UNSPEC_VOLATILE)) so it's not a verbatim volatile_insn_p. Maybe it's for historical reason or for some subtle differences not explained; only someone familiar with combine will know... perhaps performance issue. >> Moreover, writes of the shape (set (zero_extract (mem ... update >> mem_last_set. > > Good catch. I think we should use note_stores for added safety though, > rather than call zero_extract out as a special case. (I realise the > function already handles subregs, but still.) record_dead_and_set_regs_1 is called through note_stores so that (set (zero_extract (mem)) is handled already and thus no change to record_dead_and_set_regs_1 is needed, right? Attached you find a new, tentative patch. It resolves the issue in my small test program. However, I think someone with more insight into combine should take over the patch. Johann PR rtl-optimization/51374 * combine.c (can_combine_p): Don't allow volatile_refs_p insns to cross other volatile_refs_p insns.
Index: combine.c =================================================================== --- combine.c (revision 183695) +++ combine.c (working copy) @@ -1947,12 +1947,21 @@ can_combine_p (rtx insn, rtx i3, rtx pre && REG_P (dest) && REGNO (dest) < FIRST_PSEUDO_REGISTER) return 0; - /* If there are any volatile insns between INSN and I3, reject, because - they might affect machine state. */ + /* If INSN contains volatile references (specifically volatile MEMs), + we cannot combine across any other volatile references. + Even if INSN doesn't contain volatile references, any intervening + volatile insn might affect machine state. */ + { + int (*predicat) (const_rtx); - for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p)) - if (INSN_P (p) && p != succ && p != succ2 && volatile_insn_p (PATTERN (p))) - return 0; + predicat = volatile_refs_p (PATTERN (insn)) + ? volatile_refs_p + : volatile_insn_p; + + for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p)) + if (INSN_P (p) && p != succ && p != succ2 && predicat (PATTERN (p))) + return 0; + } /* If INSN contains an autoincrement or autodecrement, make sure that register is not used between there and I3, and not already used in