https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409

--- Comment #8 from anlauf at gcc dot gnu.org ---
The suggested optimization needs to take into account that evaluating the
temporary expression might trap, that allocatable variables might not be
allocated, etc.

If the trip count of the loop is zero, the non-hoisted variant never
evaluates the expression, so the trap etc. cannot occur.  We need to make
sure the hoisted variant does not generate failing code in that case.
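
A rough C analogue of the zero-trip-count problem, as a sketch only: the
function names and the constant 1000 are made up for illustration, and this
is not what gfortran emits.  Integer division stands in for the trapping
expression (division by zero is undefined behavior in C and typically
raises SIGFPE on x86).  The hoisted variant evaluates the invariant only
when the loop is known to run at least once.

```c
#include <stddef.h>

/* Non-hoisted: the division is evaluated only if the body executes,
   so n == 0 is safe even when y == 0. */
long sum_scaled_plain(const long *a, size_t n, long y)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * (1000 / y);
    return sum;
}

/* Hoisted: the invariant is computed once, but only under a guard
   that mirrors the loop's entry condition. */
long sum_scaled_hoisted(const long *a, size_t n, long y)
{
    long sum = 0;
    if (n > 0) {
        const long t = 1000 / y;   /* evaluated only if the loop runs */
        for (size_t i = 0; i < n; i++)
            sum += a[i] * t;
    }
    return sum;
}
```

Without the `n > 0` guard, the call with n == 0 and y == 0 would fail in
the hoisted variant although the original loop never divides.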

Similarly, for conditional code in the loop body, like

  if (cond) then
     expression1 (..., 1/y)
  else
     expression2 (..., 1/z)
  end if

where cond protects from traps even for finite trip counts, these conditions
may also need to be identified, and an appropriate block generated.
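
One possible shape of such a block, again as a hypothetical C sketch: it
assumes cond is loop-invariant, which the fragment above does not state,
and the names are invented.  The hoisted temporary is selected under the
same condition that protects each division, so only the operand that the
original loop would have used is ever evaluated.

```c
#include <stddef.h>

/* Original form: each division is protected by cond inside the loop. */
long pick_plain(const long *a, size_t n, int cond, long y, long z)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (cond)
            sum += a[i] * (1000 / y);
        else
            sum += a[i] * (1000 / z);
    }
    return sum;
}

/* Hoisted form: the guard combines the trip-count check with cond,
   so the division on the untaken branch is never executed.
   Assumes cond does not change during the loop. */
long pick_hoisted(const long *a, size_t n, int cond, long y, long z)
{
    long sum = 0;
    if (n > 0) {
        const long t = cond ? 1000 / y : 1000 / z;
        for (size_t i = 0; i < n; i++)
            sum += a[i] * t;
    }
    return sum;
}
```

If cond varies per iteration, this simple form does not apply and the
temporaries would have to stay inside the guarded branches.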

Some HPC compilers have directives (MOVE/NOMOVE) to annotate the respective
loops, and corresponding compiler options that are enabled only at aggressive
optimization levels for real-world code.

I wonder how much (or little) really needs to be done here, or if the task
can be split in a suitable way between FE and ME.

The tree-dump shows a __builtin_malloc/__builtin_free pair for the temporary
*within* the i-loop.  Would it be possible to move this *management* just
one loop level up, i.e. attach it to the outer scope, if the size of the
temporary is known to be constant (which is the case here)?
Maybe the middle end then better "sees" what can reasonably be done?
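
To illustrate the question, a hand-written C sketch (not the actual dump;
function names and the loop body are invented): the first function
allocates and frees the temporary on every i-iteration, the second moves
only the allocation management one level up and leaves the computation
in place.  It assumes the size m does not change inside the loop.

```c
#include <stddef.h>
#include <stdlib.h>

/* Temporary managed within the i-loop, as in the dump. */
double run_inner_alloc(size_t outer, size_t m)
{
    double total = 0.0;
    if (m == 0)
        return total;
    for (size_t i = 0; i < outer; i++) {
        double *tmp = malloc(m * sizeof *tmp);
        if (tmp == NULL)
            return -1.0;               /* allocation failure */
        for (size_t j = 0; j < m; j++)
            tmp[j] = (double)(i + j);
        for (size_t j = 0; j < m; j++)
            total += tmp[j];
        free(tmp);
    }
    return total;
}

/* Same computation, with malloc/free attached to the outer scope.
   The outer == 0 check keeps the zero-trip case allocation-free. */
double run_outer_alloc(size_t outer, size_t m)
{
    double total = 0.0;
    if (outer == 0 || m == 0)
        return total;
    double *tmp = malloc(m * sizeof *tmp);
    if (tmp == NULL)
        return -1.0;                   /* allocation failure */
    for (size_t i = 0; i < outer; i++) {
        for (size_t j = 0; j < m; j++)
            tmp[j] = (double)(i + j);
        for (size_t j = 0; j < m; j++)
            total += tmp[j];
    }
    free(tmp);
    return total;
}
```

With the allocation out of the way, the i-loop body contains only plain
loads and stores, which may be easier for later passes to analyze.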
