On Mon, Sep 19, 2016 at 11:22:27AM +0200, Jan Hubicka wrote:
> > On Mon, Sep 19, 2016 at 2:48 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
> > > Hi,
> > > this is the testsuite-compensation patch that I committed after
> > > re-testing on x86_64-linux.
> > >
> > > Other placements of early_thread_jumps do not work very well (at
> > > least in the current implementation). Putting it before forwprop
> > > disables about 15% of threadings. Placing it after DCE means the
> > > inliner does not see much of the benefit, because threading requires
> > > a cleanup propagation+DCE after itself. So unless we extend the
> > > threader to be smarter or add an extra DCE cleanup, I think this is
> > > the best placement.
> >
> > This caused (another) 3-4% degradation in coremarks on ThunderX.
>
> Hmm, this is interesting. The patch should have "fixed" the previous
> degradation by making the profile correct (the backward threader still
> does not update it, but because most threading now happens early and
> the profile is built afterwards, this should be less of an issue). I am
> now looking into the profile update issues and will try to check why
> coremarks degrades again.
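(For reference, the placement being discussed corresponds roughly to the
following shape in passes.def. This is only a sketch of the ordering
implied above -- after forwprop, before DCE -- with the intervening
passes elided, not the literal file contents:

	  NEXT_PASS (pass_forwprop);
	  NEXT_PASS (pass_early_thread_jumps);
	  ...
	  NEXT_PASS (pass_cd_dce);
)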
But the early threader is running with speed_p set to false (the second
parameter to find_jump_threads_backwards):

unsigned int
pass_early_thread_jumps::execute (function *fun)
{
  /* Try to thread each block with more than one successor.  */
  basic_block bb;
  FOR_EACH_BB_FN (bb, fun)
    {
      if (EDGE_COUNT (bb->succs) > 1)
	find_jump_threads_backwards (bb, false);
    }
  thread_through_all_blocks (true);
  return 0;
}

So even though profile information is ignored, we behave as though we
are compiling for size and won't thread. The relevant check in
profitable_jump_thread_path is:

  if (speed_p && optimize_edge_for_speed_p (taken_edge))
    {
      <snip>
    }
  else if (n_insns > 1)
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file, "FSM jump-thread path not considered: "
		 "duplication of %i insns is needed and optimizing for size.\n",
		 n_insns);
      path->pop ();
      return NULL;
    }

With speed_p false, the first condition can never hold, so any path
whose duplication would cost more than one instruction is rejected.
Changing false to true in the execute hook above looks like it enables
some of the threading we're relying on here; a sketch of that change
follows my signature.

Thanks,
James
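P.S. For concreteness, an untested sketch of the flip described above
(context lines copied from the execute hook quoted earlier; this is
illustrative only, not a submitted patch):

   FOR_EACH_BB_FN (bb, fun)
     {
       if (EDGE_COUNT (bb->succs) > 1)
-	find_jump_threads_backwards (bb, false);
+	find_jump_threads_backwards (bb, true);
     }

With that flip, the early pass would actually reach the
optimize_edge_for_speed_p test in profitable_jump_thread_path rather
than unconditionally falling through to the size-optimized rejection.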