On Fri, Oct 14, 2011 at 6:53 AM, Julian Brown <jul...@codesourcery.com> wrote:
> On Wed, 28 Sep 2011 14:33:17 +0100
> Ramana Radhakrishnan <ramana.radhakrish...@linaro.org> wrote:
>
>> On 6 May 2011 14:13, Julian Brown <jul...@codesourcery.com> wrote:
>> > Hi,
>> >
>> > This is the second of two patches to add unaligned-access support to
>> > the ARM backend. It builds on the first patch to provide support for
>> > unaligned accesses when expanding block moves (i.e. for builtin
>> > memcpy operations). It makes some effort to use load/store multiple
>> > instructions where appropriate (when accessing sufficiently-aligned
>> > source or destination addresses), and also makes some effort to
>> > generate fast code (for -O1/2/3) or small code (for -Os), though
>> > some of the heuristics may need tweaking still.
>>
>> Sorry it's taken me a while to get around to this one. Do you know
>> what difference this makes to performance on some standard benchmarks
>> on, let's say, an A9 and an M4? I see that this gets triggered only
>> when we have less than 64 bytes to copy.
>
> No, sorry, I don't have any benchmark results available at present. I
> think we'd have to have terrifically bad luck for it to be a
> performance degradation, though...

I've backported the unaligned struct and memcpy patches to our
4.6-based compilers and benchmarked them.  The worst result is a
0.84 % drop in performance and the best a 7.17 % improvement, with a
geomean of 0.18 %.  This was in a Cortex-A9 NEON -O3 configuration.
The results are accurate to within 0.1 %.

I'll send you and Ramana the raw results privately.

-- Michael
