http://llvm.org/bugs/show_bug.cgi?id=3120
Bob Wilson <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #9 from Bob Wilson <[email protected]>  2009-11-25 11:46:01 ---
svn r89865 enables tail duplication for indirect branches on x86, and with
that change, the performance is much better*:

32-bit:
  Switched interpreter        240
  Threaded interpreter        144
  Recursive interpreter       821
  Closure-based interpreter  1106

64-bit:
  Switched interpreter        584
  Threaded interpreter        136
  Recursive interpreter       733
  Closure-based interpreter   508

* The 64-bit "switched interpreter" result here is bad because of a code
alignment problem. The function is 16-byte aligned, but the performance
suffers badly if it is not placed on a 32-byte aligned boundary. I think this
may be due to the branch predictor fetching 32-byte aligned data. If I
manually increase the alignment of that function, the runtime drops from 584
to 291, which is better than gcc's 335 but still not as good as the 240 I
measured before. That manual alignment change also increases the "threaded
interpreter" runtime slightly, from 136 to 144. I will file a separate PR for
the alignment issue.
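
For readers who want to try the alignment experiment described above, here is a minimal sketch of one way to raise a function's alignment by hand. The comment does not say how the alignment was changed, so this is an assumption: it uses the GCC/Clang `aligned` function attribute. The opcodes and the names `run_switched` and `run_switched_is_aligned32` are hypothetical and are not taken from the benchmark in this PR.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for illustration only. */
enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* A tiny switch-dispatched interpreter loop.
   aligned(32) asks GCC/Clang to place the function entry on at least a
   32-byte boundary; noinline keeps it a separate function so the
   placement is observable. */
__attribute__((aligned(32), noinline))
int run_switched(const unsigned char *code)
{
    int acc = 0;
    for (size_t pc = 0;; ++pc) {
        switch (code[pc]) {
        case OP_INC:    acc += 1; break;
        case OP_DEC:    acc -= 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return -1;  /* unknown opcode */
        }
    }
}

/* Returns nonzero if the entry address of run_switched is a multiple of 32. */
int run_switched_is_aligned32(void)
{
    return ((uintptr_t)&run_switched % 32) == 0;
}
```

Calling `run_switched` on the byte sequence `OP_INC, OP_INC, OP_DOUBLE, OP_DEC, OP_HALT` returns 3. The attribute only controls where the function starts; it does not align the individual branch targets inside the loop, so it may not reproduce the timings quoted above on other hardware.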
