On Wed, Jun 12, 2019 at 11:00:43AM +0200, Richard Biener wrote:
> > Bootstrapped/regtested on x86_64-linux and i686-linux and furthermore tested
> > with x86_64-linux to nvptx-none offloading with the nvptx.c PR90811 fix
> > reverted to verify the overaligned variables are gone.  Ok for trunk?
> 
> Isn't there hunk missing that actually passes false?

It is that
+    add_stack_var (origvar, really_expand);
expand_one_var is called 3 times:
      expand_one_var (t, toplevel, true);
      size += expand_one_var (var, true, false);
        expand_one_var (var, true, true);
where the second one is the estimated_stack_frame_size call where it passes
down false as really_expand.  And the patch passes it down through to
align_local_variable.

> I guess only CCPs bit-value/alignment tracking sees the difference.
> 
> Note we are already aligning variables in stor-layout.c with target
> information so in some way this isn't a real fix.  Offloading as
> implemented right now really leaks target dependent decisions from
> the target to the offload target.

I was thinking about only honoring DECL_ALIGN for non-FIELD_DECLs when
streaming in ACCEL_COMPILER only if DECL_USER_ALIGN, the risk is if
we bump DECL_ALIGN somewhere pre-IPA on the host side and some optimization
actually optimizes code using that DECL_ALIGN, later on we stream offloading
out and in, downgrade DECL_ALIGN and suddenly the optimization was invalid.

> Not opposed to the change though a comment in the align_local_variable
> hunk as to why we only adjust DECL_ALIGN late would be appreciated.

Sure, that can be done:

> > --- gcc/cfgexpand.c.jj      2019-05-20 11:39:34.059816432 +0200
> > +++ gcc/cfgexpand.c 2019-06-11 11:28:08.720932421 +0200
> > @@ -361,7 +361,7 @@ static bool has_short_buffer;
> >     we can't do with expected alignment of the stack boundary.  */
> >  
> >  static unsigned int
> > -align_local_variable (tree decl)
> > +align_local_variable (tree decl, bool really_expand)
> >  {
> >    unsigned int align;
> >  
> > @@ -370,7 +370,8 @@ align_local_variable (tree decl)
> >    else
> >      {
> >        align = LOCAL_DECL_ALIGNMENT (decl);
> > -      SET_DECL_ALIGN (decl, align);

+      /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
+        That is done before IPA and could bump alignment based on host
+        backend even for offloaded code which wants different
+        LOCAL_DECL_ALIGNMENT.  */

> > +      if (really_expand)
> > +   SET_DECL_ALIGN (decl, align);
> >      }
> >    return align / BITS_PER_UNIT;
> >  }

        Jakub

Reply via email to