Re: set the correct block info for the call stmt in fnsplit (issue6111050)
We do not depend on the block structure any more when dealing with stack layout for variables in GCC 4.7.0 and above. I am not saying your patch is incorrect or not needed. Just it will not have an effect on variable stack layout. It might be worth backporting to the 4.6 branch though, these stack layout issues are very nasty. -- Eric Botcazou
Re: set the correct block info for the call stmt in fnsplit (issue6111050)
On Fri, Apr 27, 2012 at 12:50 PM, Eric Botcazou ebotca...@adacore.com wrote: We do not depend on the block structure any more when dealing with stack layout for variables in GCC 4.7.0 and above. I am not saying your patch is incorrect or not needed. Just it will not have an effect on variable stack layout. It might be worth backporting to the 4.6 branch though, these stack layout issues are very nasty. What about not setting a BLOCK on the call stmt? That said, I can't see how outlining a SESE region that starts at the beginning of the function is not conservatively using the outermost BLOCK ... so, is the bug not elsewhere? Richard. -- Eric Botcazou
Re: set the correct block info for the call stmt in fnsplit (issue6111050)
On Fri, Apr 27, 2012 at 12:50 PM, Eric Botcazou ebotca...@adacore.com wrote: We do not depend on the block structure any more when dealing with stack layout for variables in GCC 4.7.0 and above. I am not saying your patch is incorrect or not needed. Just it will not have an effect on variable stack layout. It might be worth backporting to the 4.6 branch though, these stack layout issues are very nasty. What about not setting a BLOCK on the call stmt? That said, I can't see I recall that not seetting the block did lead to ICE... how outlining a SESE region that starts at the beginning of the function is not conservatively using the outermost BLOCK ... so, is the bug not elsewhere? ... so I made the exactly same conclussion. The problem is with stack vars allocated in header that survive till the tail part? Honza Richard. -- Eric Botcazou
Re: set the correct block info for the call stmt in fnsplit (issue6111050)
On Fri, Apr 27, 2012 at 5:06 AM, Jan Hubicka hubi...@ucw.cz wrote: On Fri, Apr 27, 2012 at 12:50 PM, Eric Botcazou ebotca...@adacore.com wrote: We do not depend on the block structure any more when dealing with stack layout for variables in GCC 4.7.0 and above. I am not saying your patch is incorrect or not needed. Just it will not have an effect on variable stack layout. It might be worth backporting to the 4.6 branch though, these stack layout issues are very nasty. What about not setting a BLOCK on the call stmt? That said, I can't see I recall that not seetting the block did lead to ICE... how outlining a SESE region that starts at the beginning of the function is not conservatively using the outermost BLOCK ... so, is the bug not elsewhere? ... so I made the exactly same conclussion. The problem is with stack vars allocated in header that survive till the tail part? A SESE region does not necessary at the beginning of the function. They can be anywhere. In the example I attached, it is at the end of function. Even if the outlined region is at the beginning the function. setting the call_stmt as the outermost block is also incorrect. For c++ programs, the block for function level local variables is not DECL_INITIAL(current_function_decl). It is the subblock of DECL_INITIAL(current_function_decl). This is different from c programs. -Rong Honza Richard. -- Eric Botcazou
set the correct block info for the call stmt in fnsplit (issue6111050)
Hi, In split_function() (ipa-split.c), for the newly created call stmt, its block is set to the outermost scope, i.e. DECL_INITIAL(current_function_decl). When we inline this partially outlined function, we create the new block based on the block for the call stmt (in expand_call_inline()). So the new block for the inlined body is in parallel to the function top level scope. This bad block structure will late result in a bad stack layout. This patch fixes the issue by setting the correct block number. It starts with the split_point entry bb to find the block stmt in the outlined region. The entry_bb maybe an empty BB so we need to follow the CFG until we find a non-empty bb. My early patch tried to use the block info from the first non-empty bb in the outlined regision. But that failed bootstrap as some of the stmts (assignment stmt) do not have the block info (bug?). So in this patch I traverse all the stmts in the bb until getting the block info. Tested with gcc bootstap. Thanks, 2012-04-24 Rong Xu x...@google.com * ipa-split.c (split_function): set the correct block for the call statement. Index: ipa-split.c === --- ipa-split.c (revision 186634) +++ ipa-split.c (working copy) @@ -948,7 +948,7 @@ int num = 0; struct cgraph_node *node, *cur_node = cgraph_node (current_function_decl); basic_block return_bb = find_return_bb (); - basic_block call_bb; + basic_block call_bb, bb; gimple_stmt_iterator gsi; gimple call; edge e; @@ -958,6 +958,7 @@ gimple last_stmt = NULL; unsigned int i; tree arg; + tree split_loc_block = NULL; if (dump_file) { @@ -1072,6 +1073,33 @@ } } + /* Find the block location of the split region. */ + bb = split_point-entry_bb; + do +{ + bool found = false; + + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi)) +{ + if (is_gimple_debug(gsi_stmt(gsi))) +continue; + if ((split_loc_block = gimple_block (gsi_stmt (gsi +{ + found = true; + break; +} +} + if (found) +break; + + /* If we reach here, this bb should be an empty unconditional + or fall-throuth branch. We continue with the succ bb. */ + bb = single_succ (bb); +} + while (bb bitmap_bit_p (split_point-split_bbs, bb-index)); + + gcc_assert (split_loc_block); + /* Now create the actual clone. */ rebuild_cgraph_edges (); node = cgraph_function_versioning (cur_node, NULL, NULL, args_to_skip, @@ -1115,7 +1143,7 @@ VEC_replace (tree, args_to_pass, i, arg); } call = gimple_build_call_vec (node-decl, args_to_pass); - gimple_set_block (call, DECL_INITIAL (current_function_decl)); + gimple_set_block (call, split_loc_block); /* We avoid address being taken on any variable used by split part, so return slot optimization is always possible. Moreover this is -- This patch is available for review at http://codereview.appspot.com/6111050
Re: set the correct block info for the call stmt in fnsplit (issue6111050)
On Tue, Apr 24, 2012 at 2:57 PM, Rong Xu x...@google.com wrote: Hi, In split_function() (ipa-split.c), for the newly created call stmt, its block is set to the outermost scope, i.e. DECL_INITIAL(current_function_decl). When we inline this partially outlined function, we create the new block based on the block for the call stmt (in expand_call_inline()). So the new block for the inlined body is in parallel to the function top level scope. This bad block structure will late result in a bad stack layout. We do not depend on the block structure any more when dealing with stack layout for variables in GCC 4.7.0 and above. I am not saying your patch is incorrect or not needed. Just it will not have an effect on variable stack layout. Did you have a testcase for the bad stack layout issue? Or was it too hard to produce one because the size matters? This patch fixes the issue by setting the correct block number. It starts with the split_point entry bb to find the block stmt in the outlined region. The entry_bb maybe an empty BB so we need to follow the CFG until we find a non-empty bb. My early patch tried to use the block info from the first non-empty bb in the outlined regision. But that failed bootstrap as some of the stmts (assignment stmt) do not have the block info (bug?). So in this patch I traverse all the stmts in the bb until getting the block info. Tested with gcc bootstap. On which target? Thanks, Andrew Pinski Thanks, 2012-04-24 Rong Xu x...@google.com * ipa-split.c (split_function): set the correct block for the call statement. Index: ipa-split.c === --- ipa-split.c (revision 186634) +++ ipa-split.c (working copy) @@ -948,7 +948,7 @@ int num = 0; struct cgraph_node *node, *cur_node = cgraph_node (current_function_decl); basic_block return_bb = find_return_bb (); - basic_block call_bb; + basic_block call_bb, bb; gimple_stmt_iterator gsi; gimple call; edge e; @@ -958,6 +958,7 @@ gimple last_stmt = NULL; unsigned int i; tree arg; + tree split_loc_block = NULL; if (dump_file) { @@ -1072,6 +1073,33 @@ } } + /* Find the block location of the split region. */ + bb = split_point-entry_bb; + do + { + bool found = false; + + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi)) + { + if (is_gimple_debug(gsi_stmt(gsi))) + continue; + if ((split_loc_block = gimple_block (gsi_stmt (gsi + { + found = true; + break; + } + } + if (found) + break; + + /* If we reach here, this bb should be an empty unconditional + or fall-throuth branch. We continue with the succ bb. */ + bb = single_succ (bb); + } + while (bb bitmap_bit_p (split_point-split_bbs, bb-index)); + + gcc_assert (split_loc_block); + /* Now create the actual clone. */ rebuild_cgraph_edges (); node = cgraph_function_versioning (cur_node, NULL, NULL, args_to_skip, @@ -1115,7 +1143,7 @@ VEC_replace (tree, args_to_pass, i, arg); } call = gimple_build_call_vec (node-decl, args_to_pass); - gimple_set_block (call, DECL_INITIAL (current_function_decl)); + gimple_set_block (call, split_loc_block); /* We avoid address being taken on any variable used by split part, so return slot optimization is always possible. Moreover this is -- This patch is available for review at http://codereview.appspot.com/6111050
Re: set the correct block info for the call stmt in fnsplit (issue6111050)
please the find test case in the attachment. It shows the issue in google-4_6 branch -c -O2 -fno-strict-aliasing ss.C -fdump-rtl-expand-all in the rtl-expand dump, trianglevertices and one the gtest_ar are in the same partition. the problem is found in arm compiler, but we manager to reproduce in x86. -Rong On Tue, Apr 24, 2012 at 3:02 PM, Andrew Pinski pins...@gmail.com wrote: On Tue, Apr 24, 2012 at 2:57 PM, Rong Xu x...@google.com wrote: Hi, In split_function() (ipa-split.c), for the newly created call stmt, its block is set to the outermost scope, i.e. DECL_INITIAL(current_function_decl). When we inline this partially outlined function, we create the new block based on the block for the call stmt (in expand_call_inline()). So the new block for the inlined body is in parallel to the function top level scope. This bad block structure will late result in a bad stack layout. We do not depend on the block structure any more when dealing with stack layout for variables in GCC 4.7.0 and above. I am not saying your patch is incorrect or not needed. Just it will not have an effect on variable stack layout. Did you have a testcase for the bad stack layout issue? Or was it too hard to produce one because the size matters? This patch fixes the issue by setting the correct block number. It starts with the split_point entry bb to find the block stmt in the outlined region. The entry_bb maybe an empty BB so we need to follow the CFG until we find a non-empty bb. My early patch tried to use the block info from the first non-empty bb in the outlined regision. But that failed bootstrap as some of the stmts (assignment stmt) do not have the block info (bug?). So in this patch I traverse all the stmts in the bb until getting the block info. Tested with gcc bootstap. On which target? Thanks, Andrew Pinski Thanks, 2012-04-24 Rong Xu x...@google.com * ipa-split.c (split_function): set the correct block for the call statement. Index: ipa-split.c === --- ipa-split.c (revision 186634) +++ ipa-split.c (working copy) @@ -948,7 +948,7 @@ int num = 0; struct cgraph_node *node, *cur_node = cgraph_node (current_function_decl); basic_block return_bb = find_return_bb (); - basic_block call_bb; + basic_block call_bb, bb; gimple_stmt_iterator gsi; gimple call; edge e; @@ -958,6 +958,7 @@ gimple last_stmt = NULL; unsigned int i; tree arg; + tree split_loc_block = NULL; if (dump_file) { @@ -1072,6 +1073,33 @@ } } + /* Find the block location of the split region. */ + bb = split_point-entry_bb; + do + { + bool found = false; + + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi)) + { + if (is_gimple_debug(gsi_stmt(gsi))) + continue; + if ((split_loc_block = gimple_block (gsi_stmt (gsi + { + found = true; + break; + } + } + if (found) + break; + + /* If we reach here, this bb should be an empty unconditional + or fall-throuth branch. We continue with the succ bb. */ + bb = single_succ (bb); + } + while (bb bitmap_bit_p (split_point-split_bbs, bb-index)); + + gcc_assert (split_loc_block); + /* Now create the actual clone. */ rebuild_cgraph_edges (); node = cgraph_function_versioning (cur_node, NULL, NULL, args_to_skip, @@ -1115,7 +1143,7 @@ VEC_replace (tree, args_to_pass, i, arg); } call = gimple_build_call_vec (node-decl, args_to_pass); - gimple_set_block (call, DECL_INITIAL (current_function_decl)); + gimple_set_block (call, split_loc_block); /* We avoid address being taken on any variable used by split part, so return slot optimization is always possible. Moreover this is -- This patch is available for review at http://codereview.appspot.com/6111050 namespace testing { class Test { virtual void TestBody() = 0; }; class AssertionResult { public: AssertionResult(const AssertionResult other); operator bool() const { return success_; } private: bool success_; void operator=(AssertionResult const ); }; namespace internal { class TestFactoryBase { public: virtual ::testing::Test* CreateTest() = 0; }; template class TestClass class TestFactoryImpl : public TestFactoryBase { public: virtual ::testing::Test* CreateTest() { return new TestClass; } }; char* MakeAndRegisterTestInfo2( const char* test_case_name, TestFactoryBase* factory); AssertionResult CmpHelperEQ(const char* expected_expression, const char* actual_expression, long long expected, long long actual); AssertionResult CmpHelperNE( const char* expr, const char* expr2, long long val1, long long val2);