Re: [Pixman] [PATCH 0/4] New fast paths and Raspberry Pi 1 benchmarking

Ben Avison Thu, 20 Aug 2015 13:32:54 -0700

On Thu, 20 Aug 2015 19:34:37 +0100, Bill Spitzak <spit...@gmail.com> wrote:

> Could this be whether some "bad" instruction ends up next to or split
> by a cache line boundary? That would produce a random-looking plot,
> though it really is a plot of the location of the bad instructions in
> the measured function.
>
> If this really is a problem then the ideal fix is for the compiler to
> insert NOP instructions in order to move the bad instructions away from
> the locations that make them bad. Yike.


Thought of that, tried it, still baffled by the results. In other words,
merely ensuring instructions retained the same alignment to cache lines
wasn't enough to ensure reproducibility - it could only be achieved by
ensuring the same absolute address (which isn't an option with shared
libraries in the presence of ASLR).

My best theory at the moment is that the branch predictor in the ARM11
uses a hash of both the source and destination addresses of a branch to
choose the index in the predictor cache. Because it's a direct-mapped
cache, any collisions due to the branch moving to a different address can
have major effects on very tight loops like src_8888_8888.
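
To illustrate the theory above, here is a rough model in Python. It is
only a sketch: the hash function, the 128-entry table size, the loop
layout and the second, non-moving branch are all assumptions made up for
the example, not the documented behaviour of the ARM11 predictor. It
moves a tight loop in steps of one 32-byte cache line, so its alignment
to cache lines never changes, and prints the predictor slot the loop
branch would land in.

```python
CACHE_LINE = 32   # bytes per cache line
ENTRIES = 128     # assumed number of direct-mapped predictor entries


def predictor_index(src, dst):
    """Hypothetical hash: XOR the word addresses of the branch and its
    target, then keep the low bits as the direct-mapped index."""
    return ((src >> 2) ^ (dst >> 2)) & (ENTRIES - 1)


# Hypothetical branch elsewhere in the program that does not move.
OTHER_BRANCH = (0x80E0, 0x8000)


def loop_branch(base):
    """Hypothetical tight loop: a branch at base+0x40 back to base+0x20."""
    return (base + 0x40, base + 0x20)


def collides(base):
    """True if the loop branch shares a predictor slot with OTHER_BRANCH."""
    return predictor_index(*loop_branch(base)) == predictor_index(*OTHER_BRANCH)


if __name__ == "__main__":
    # Every base is a multiple of CACHE_LINE, so alignment is identical.
    for base in range(0x10000, 0x10100, CACHE_LINE):
        idx = predictor_index(*loop_branch(base))
        print(f"base={base:#x} index={idx:3d} collides={collides(base)}")
```

In this model the index changes from one base to the next even though
the alignment is always the same, and the loop branch only shares a slot
with the other branch at base 0x10040. That is the kind of dependence on
absolute address, rather than alignment, that I'm describing.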

Ben
_______________________________________________
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman
