On Wed, Jun 12, 2019 at 11:00:43AM +0200, Richard Biener wrote: > > Bootstrapped/regtested on x86_64-linux and i686-linux and furthermore tested > > with x86_64-linux to nvptx-none offloading with the nvptx.c PR90811 fix > > reverted to verify the overaligned variables are gone. Ok for trunk? > > Isn't there hunk missing that actually passes false?
It is that + add_stack_var (origvar, really_expand); expand_one_var is called 3 times: expand_one_var (t, toplevel, true); size += expand_one_var (var, true, false); expand_one_var (var, true, true); where the second one is the estimated_stack_frame_size call where it passes down false as really_expand. And the patch passes it down through to align_local_variable. > I guess only CCPs bit-value/alignment tracking sees the difference. > > Note we are already aligning variables in stor-layout.c with target > information so in some way this isn't a real fix. Offloading as > implemented right now really leaks target dependent decisions from > the target to the offload target. I was thinking about only honoring DECL_ALIGN for non-FIELD_DECLs when streaming in ACCEL_COMPILER only if DECL_USER_ALIGN, the risk is if we bump DECL_ALIGN somewhere pre-IPA on the host side and some optimization actually optimizes code using that DECL_ALIGN, later on we stream offloading out and in, downgrade DECL_ALIGN and suddenly the optimization was invalid. > Not opposed to the change though a comment in the align_local_variable > hunk as to why we only adjust DECL_ALIGN late would be appreciated. Sure, that can be done: > > --- gcc/cfgexpand.c.jj 2019-05-20 11:39:34.059816432 +0200 > > +++ gcc/cfgexpand.c 2019-06-11 11:28:08.720932421 +0200 > > @@ -361,7 +361,7 @@ static bool has_short_buffer; > > we can't do with expected alignment of the stack boundary. */ > > > > static unsigned int > > -align_local_variable (tree decl) > > +align_local_variable (tree decl, bool really_expand) > > { > > unsigned int align; > > > > @@ -370,7 +370,8 @@ align_local_variable (tree decl) > > else > > { > > align = LOCAL_DECL_ALIGNMENT (decl); > > - SET_DECL_ALIGN (decl, align); + /* Don't change DECL_ALIGN when called from estimated_stack_frame_size. + That is done before IPA and could bump alignment based on host + backend even for offloaded code which wants different + LOCAL_DECL_ALIGNMENT. */ > > + if (really_expand) > > + SET_DECL_ALIGN (decl, align); > > } > > return align / BITS_PER_UNIT; > > } Jakub