Don wrote:
The next D2 runtime will include my cache-size detection code. This makes it possible to write a cache-aware memcpy, using (for example) non-temporal writes when the arrays being copied exceed the size of the largest cache.
In my tests, it gives a speed-up of approximately 2X in such cases.
The downside is that it's a fair bit of work to implement, and it only affects extremely large arrays, so I fear it's basically irrelevant (it probably won't help arrays smaller than 32 KB). Do people actually copy megabyte-sized arrays?
Is it worth spending any more time on it?
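
For readers unfamiliar with the technique, here is a rough sketch of what a cache-aware copy along these lines might look like. It is written in C with SSE2 intrinsics, not taken from the D runtime. The function name cache_aware_memcpy and the fixed 8 MB threshold are assumptions made for illustration; a real implementation would use the cache size detected at run time and would likely be tuned per CPU.

```c
#include <emmintrin.h>  /* SSE2: _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical threshold: assumed size of the largest cache, in bytes.
   A real implementation would query this at run time instead. */
#define ASSUMED_LARGEST_CACHE_BYTES ((size_t)8 * 1024 * 1024)

/* Illustrative name, not an existing library function. */
void *cache_aware_memcpy(void *dst, const void *src, size_t n)
{
    /* Copies that fit in cache go through the ordinary memcpy. */
    if (n < ASSUMED_LARGEST_CACHE_BYTES)
        return memcpy(dst, src, n);

    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Copy a short head so the destination becomes 16-byte aligned,
       which _mm_stream_si128 requires. */
    size_t head = (16 - ((uintptr_t)d & 15)) & 15;
    memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;

    /* Main loop: unaligned loads, non-temporal (cache-bypassing) stores. */
    size_t blocks = n / 16;
    for (size_t i = 0; i < blocks; i++) {
        __m128i v = _mm_loadu_si128((const __m128i *)(s + 16 * i));
        _mm_stream_si128((__m128i *)(d + 16 * i), v);
    }

    /* Make the streaming stores globally visible before returning. */
    _mm_sfence();

    /* Copy whatever is left after the last full 16-byte block. */
    size_t done = blocks * 16;
    memcpy(d + done, s + done, n - done);
    return dst;
}
```

The point of the non-temporal stores is that the destination data is not pulled into the cache, so a copy larger than the cache does not evict everything else on its way through. Below the threshold the ordinary memcpy is used unchanged.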

I don't use large arrays very often, and when I do, I avoid copying them if I can. Usually I keep catenating to an array until a certain point, after which I only ever need to read from it, so no copying is ever necessary. I would therefore rarely, if ever, benefit from this.
