Hi,

> Here's one more small performance patch for x86-64 fast trace: a slightly
> lighter getcontext.

For completeness, I should mention that I also tested with ".p2align 2" and ".p2align 4" placed right before ".global _Ux86_64_getcontext_trace". The results became somewhat noisy, but curiously all the aligned versions were slightly but systematically slower than the unaligned one (by ~1-2%). With the patch as posted the function is definitely unaligned: it sits at offset 0x4e09 into the shared library in my case.

I wonder whether I have started hitting cache-collision type effects, and whether the measurements are becoming sensitive to the exact tests I am using. I'd be interested to hear what others see, provided anyone else cares at this level of detail.

Regards,
Lassi
