On Tue, Apr 24, 2012 at 2:57 PM, Rong Xu <x...@google.com> wrote: > Hi, > > In split_function() (ipa-split.c), for the newly created call stmt, > its block is set to the outermost scope, i.e. > DECL_INITIAL(current_function_decl). When we inline this > partially outlined function, we create the new block based on the > block for the call stmt (in expand_call_inline()). > So the new block for the inlined body is in > parallel to the function top level scope. This bad block structure will > late result in a bad stack layout.
We do not depend on the block structure any more when dealing with stack layout for variables in GCC 4.7.0 and above. I am not saying your patch is incorrect or not needed. Just it will not have an effect on variable stack layout. Did you have a testcase for the bad stack layout issue? Or was it too hard to produce one because the size matters? > > This patch fixes the issue by setting the correct block number. It > starts with the split_point entry bb to find the block stmt in the > outlined region. The entry_bb maybe an empty BB so we need to follow > the CFG until we find a non-empty bb. > > My early patch tried to use the block info from the first non-empty > bb in the outlined regision. But that failed bootstrap as some of the > stmts (assignment stmt) do not have the block info (bug?). So in this > patch I traverse all the stmts in the bb until getting the block info. > > Tested with gcc bootstap. On which target? Thanks, Andrew Pinski > > Thanks, > > 2012-04-24 Rong Xu <x...@google.com> > > * ipa-split.c (split_function): set the correct block for the > call statement. > > Index: ipa-split.c > =================================================================== > --- ipa-split.c (revision 186634) > +++ ipa-split.c (working copy) > @@ -948,7 +948,7 @@ > int num = 0; > struct cgraph_node *node, *cur_node = cgraph_node (current_function_decl); > basic_block return_bb = find_return_bb (); > - basic_block call_bb; > + basic_block call_bb, bb; > gimple_stmt_iterator gsi; > gimple call; > edge e; > @@ -958,6 +958,7 @@ > gimple last_stmt = NULL; > unsigned int i; > tree arg; > + tree split_loc_block = NULL; > > if (dump_file) > { > @@ -1072,6 +1073,33 @@ > } > } > > + /* Find the block location of the split region. */ > + bb = split_point->entry_bb; > + do > + { > + bool found = false; > + > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > + { > + if (is_gimple_debug(gsi_stmt(gsi))) > + continue; > + if ((split_loc_block = gimple_block (gsi_stmt (gsi)))) > + { > + found = true; > + break; > + } > + } > + if (found) > + break; > + > + /* If we reach here, this bb should be an empty unconditional > + or fall-throuth branch. We continue with the succ bb. */ > + bb = single_succ (bb); > + } > + while (bb && bitmap_bit_p (split_point->split_bbs, bb->index)); > + > + gcc_assert (split_loc_block); > + > /* Now create the actual clone. */ > rebuild_cgraph_edges (); > node = cgraph_function_versioning (cur_node, NULL, NULL, args_to_skip, > @@ -1115,7 +1143,7 @@ > VEC_replace (tree, args_to_pass, i, arg); > } > call = gimple_build_call_vec (node->decl, args_to_pass); > - gimple_set_block (call, DECL_INITIAL (current_function_decl)); > + gimple_set_block (call, split_loc_block); > > /* We avoid address being taken on any variable used by split part, > so return slot optimization is always possible. Moreover this is > > -- > This patch is available for review at http://codereview.appspot.com/6111050