On Fri, Oct 14, 2011 at 6:53 AM, Julian Brown <jul...@codesourcery.com> wrote:
> On Wed, 28 Sep 2011 14:33:17 +0100
> Ramana Radhakrishnan <ramana.radhakrish...@linaro.org> wrote:
>
>> On 6 May 2011 14:13, Julian Brown <jul...@codesourcery.com> wrote:
>> > Hi,
>> >
>> > This is the second of two patches to add unaligned-access support to
>> > the ARM backend. It builds on the first patch to provide support for
>> > unaligned accesses when expanding block moves (i.e. for builtin
>> > memcpy operations). It makes some effort to use load/store multiple
>> > instructions where appropriate (when accessing sufficiently-aligned
>> > source or destination addresses), and also makes some effort to
>> > generate fast code (for -O1/2/3) or small code (for -Os), though
>> > some of the heuristics may need tweaking still
>>
>> Sorry it's taken me a while to get around to this one. Do you know
>> what difference this makes to performance on some standard benchmarks
>> on let's say an A9 and an M4 as I see that this gets triggered only
>> when we have less than 64 bytes to copy. ?
>
> No, sorry, I don't have any benchmark results available at present. I
> think we'd have to have terrifically bad luck for it to be a
> performance degradation, though...
I've backported the unaligned struct and memcpy patches to our
4.6-based compilers and benchmarked them. The worst result is a 0.84 %
drop in performance and the best a 7.17 % improvement, with a geomean
of 0.18 %. This was in a Cortex-A9 NEON -O3 configuration. The results
are accurate to within 0.1 %.

I'll send you and Ramana the raw results privately.

-- Michael