Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.

Yury Sidorov Thu, 28 Feb 2008 02:57:40 -0800

From: "Daniël Mantione" <[EMAIL PROTECTED]>

> On Thursday 28 February 2008 09:16, Daniël Mantione wrote:
>
>> Memory access. What happens is that the non-packed version causes
>> more cache misses.
>
> Please elaborate. If the (unaligned) data is crossing a cache-line, thus
> causing two full cache-line reads, I'd understand that, but once it's
> in the cache, it wouldn't matter anymore?
Yes, but if you have an array of them (as we have in this case),
considerably more of these records will fit in the cache. Therefore you
will have considerably less cache misses. This becomes even more serious
when the processor in question does not have prefetching; in such case,
traversing the array will cause cache miss after cache miss, a smaller
array will then have less of these misses.

You are right. An array of packed records is a bit more efficient than an
array of non-packed records, at least on modern x86 CPUs.
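To put rough numbers on the cache-density argument, here is a minimal sketch in C (it is not taken from the patch). The two struct layouts are hypothetical stand-ins for a Pascal `record` and `packed record` with the same fields; the 64-byte cache line and the GCC/Clang `packed` attribute are assumptions, and the helper name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, common on x86 but not universal. */
#define CACHE_LINE_BYTES 64

/* Naturally aligned layout: the compiler pads after `flag` so that
   `value` starts on a 4-byte boundary. Hypothetical record. */
struct rec_plain {
    uint8_t  flag;
    uint32_t value;
};

/* Packed layout (GCC/Clang attribute): no padding, so `value` sits at
   offset 1 and is misaligned. Roughly what `packed record` does. */
struct __attribute__((packed)) rec_packed {
    uint8_t  flag;
    uint32_t value;
};

/* Cache lines touched when walking `count` contiguous elements of
   `elem_size` bytes, assuming the array starts on a line boundary. */
size_t cache_lines_needed(size_t elem_size, size_t count) {
    assert(elem_size > 0);
    size_t bytes = elem_size * count;
    return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

Under these assumptions, on a typical x86 ABI the packed struct is 5 bytes and the padded one 8, so an array of 1000 elements spans 79 cache lines instead of 125. That is the "fewer misses" effect; it says nothing about the cost of the misaligned field access itself.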


I ran some benchmarks and got the following on a Core Duo:
2070ms - for non-packed
1910ms - for packed

But for CPUs which do not support misaligned data access, packed records
are speed killers and should be used only as a last resort.
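As a rough illustration of why this is expensive, here is a sketch in C of what a portable access to a misaligned 32-bit field looks like. The function names are hypothetical. On targets without hardware support for misaligned access, a compiler typically has to expand this kind of access into several byte loads or stores, where an aligned field needs only one instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may be misaligned.
   memcpy makes no alignment assumption about `p`. */
uint32_t load_u32_unaligned(const unsigned char *p) {
    assert(p != NULL);
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to an address that may be misaligned. */
void store_u32_unaligned(unsigned char *p, uint32_t v) {
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

Every access to a misaligned field of a packed record pays this kind of cost on such CPUs, which can easily outweigh the cache savings.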

Also, if a record is not an element of a large array, it is better to
declare it as non-packed on all CPUs.

Yury.
_______________________________________________

fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
