I found that the node weight updates on cloned nodes during ipa-cp were producing incorrect (insane) weights. Both the original and the new node weight computations used truncating divides, which lost part of the total node weight. I have fixed this by making both divides round to nearest.
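To illustrate the weight loss, here is a minimal standalone sketch, not the ipa-cp.c code itself. The helper names scale_trunc and scale_round are hypothetical, PROB_BASE stands in for REG_BR_PROB_BASE with an assumed value of 10000, and int64_t stands in for gcov_type. It assumes non-negative counts and a positive denominator.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for GCC's REG_BR_PROB_BASE; the value 10000 is an assumption
   made for this sketch.  */
#define PROB_BASE 10000

/* Scale COUNT by NUM/DEN using two truncating divides, in the style of
   the code before the patch.  Hypothetical helper.  */
static int64_t
scale_trunc (int64_t count, int64_t num, int64_t den)
{
  assert (den > 0 && count >= 0 && num >= 0);
  return count * (num * PROB_BASE / den) / PROB_BASE;
}

/* Same scaling, but each divide rounds to nearest by adding half of the
   divisor first, in the style of the patched code.  Hypothetical helper.  */
static int64_t
scale_round (int64_t count, int64_t num, int64_t den)
{
  assert (den > 0 && count >= 0 && num >= 0);
  int64_t prob = (num * PROB_BASE + den / 2) / den;
  return (count * prob + PROB_BASE / 2) / PROB_BASE;
}
```

For example, splitting an edge count of 100 between a clone and the original with new_sum = 1, remainder = 2 and orig_node_count = 3, the truncating version yields 33 and 66 (one unit of weight is lost), while the rounding version yields 33 and 67, which sums back to 100.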
Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?

2013-03-27  Teresa Johnson  <tejohn...@google.com>

	* ipa-cp.c (update_profiling_info): Perform rounding integer
	division when updating weights instead of truncating.
	(update_specialized_profile): Ditto.

Index: ipa-cp.c
===================================================================
--- ipa-cp.c	(revision 197118)
+++ ipa-cp.c	(working copy)
@@ -2588,14 +2588,18 @@ update_profiling_info (struct cgraph_node *orig_no

   for (cs = new_node->callees; cs ; cs = cs->next_callee)
     if (cs->frequency)
-      cs->count = cs->count * (new_sum * REG_BR_PROB_BASE
-                               / orig_node_count) / REG_BR_PROB_BASE;
+      cs->count = (cs->count
+                   * ((new_sum * REG_BR_PROB_BASE + orig_node_count/2)
+                      / orig_node_count)
+                   + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
     else
       cs->count = 0;

   for (cs = orig_node->callees; cs ; cs = cs->next_callee)
-    cs->count = cs->count * (remainder * REG_BR_PROB_BASE
-                             / orig_node_count) / REG_BR_PROB_BASE;
+    cs->count = (cs->count
+                 * ((remainder * REG_BR_PROB_BASE + orig_node_count/2)
+                    / orig_node_count)
+                 + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;

   if (dump_file)
     dump_profile_updates (orig_node, new_node);
@@ -2627,14 +2631,19 @@ update_specialized_profile (struct cgraph_node *ne

   for (cs = new_node->callees; cs ; cs = cs->next_callee)
     if (cs->frequency)
-      cs->count += cs->count * redirected_sum / new_node_count;
+      cs->count += (cs->count
+                    * ((redirected_sum * REG_BR_PROB_BASE
+                        + new_node_count/2) / new_node_count)
+                    + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
     else
       cs->count = 0;

   for (cs = orig_node->callees; cs ; cs = cs->next_callee)
     {
-      gcov_type dec = cs->count * (redirected_sum * REG_BR_PROB_BASE
-                                   / orig_node_count) / REG_BR_PROB_BASE;
+      gcov_type dec = (cs->count
+                       * ((redirected_sum * REG_BR_PROB_BASE
+                           + orig_node_count/2) / orig_node_count)
+                       + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
       if (dec < cs->count)
         cs->count -= dec;
       else

--
This patch is available for review at
http://codereview.appspot.com/7812053