bearophile wrote:
Don:
But still, cache effects are more important than instruction scheduling
in 99% of cases.
I agree.
CPUs have prefetching instructions, but D doesn't expose them as intrinsics.
Making those instructions a bit more visible at a higher level could be useful today.
One problem with that is that the prefetching instructions are
vendor-specific. Also, it's quite difficult to use them correctly. If
you put them in the wrong place, or use them too much, they slow your
code down.
Since D is a systems language, another possible idea is to partially unveil what lies beneath the
"array as random-access memory" illusion. Because of the memory hierarchy, array access
times vary a lot depending on which level of the memory pyramid your data is stored
in (http://dotnetperls.com/memory-hierarchy). This is why numeric algorithms that work
on large arrays now benefit so much from tiling. The Chapel language has language-level support
for a high-level specification of tilings, while Fortran compilers perform some limited
forms of tiling by themselves.
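
To make "tiling" concrete, here is a minimal sketch of a tiled (blocked) matrix transpose in D. It is only an illustration, not code from any library: the name transposeTiled and the default tile size of 32 are assumptions made for the example, and the best tile size depends on the cache of the machine it runs on.

```d
import std.algorithm.comparison : min;

/// Transposes an n-by-n row-major matrix `src` into `dst`, one tile at a time.
/// `transposeTiled` is a hypothetical name; `tile` is a compile-time parameter,
/// so several tile sizes can be instantiated and benchmarked side by side.
void transposeTiled(size_t tile = 32)(const(double)[] src, double[] dst, size_t n)
{
    assert(src.length == n * n && dst.length == n * n);

    // Walk the matrix tile by tile, so the rows of `src` and the columns of
    // `dst` touched by the inner loops stay within a small memory footprint.
    for (size_t ii = 0; ii < n; ii += tile)
        for (size_t jj = 0; jj < n; jj += tile)
        {
            // Clamp the tile at the matrix edge when n is not a multiple of tile.
            immutable iEnd = min(ii + tile, n);
            immutable jEnd = min(jj + tile, n);
            foreach (i; ii .. iEnd)
                foreach (j; jj .. jEnd)
                    dst[j * n + i] = src[i * n + j];
        }
}

void main()
{
    enum n = 100;
    auto a = new double[n * n];
    foreach (k, ref x; a)
        x = k;
    auto b = new double[n * n];

    transposeTiled!16(a, b, n);

    // Check the result against the definition of a transpose.
    foreach (i; 0 .. n)
        foreach (j; 0 .. n)
            assert(b[j * n + i] == a[i * n + j]);
}
```

The naive version (two plain loops over i and j) gives the same result. The tiled one only changes the order in which the elements are visited, which is the whole point: the arithmetic is identical, the memory access pattern is not.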
I think it is impossible to be a modern systems language without some
support for the memory hierarchy.
I think we'll be able to take advantage of D's awesome metaprogramming
to support cache-aware algorithms. As a first step, I added cache size
determination to core.cpuid some time ago. We still have a long way to go.
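
As a rough sketch of how that cache information can be read, the program below lists the data caches reported by core.cpuid and derives a candidate tile size from the L1 size. It assumes a druntime recent enough to offer the dataCaches property (older releases exposed the same data as datacache). The helper tileFor and its "two square tiles in half the cache" rule are assumptions made up for this example, not something core.cpuid provides.

```d
import core.cpuid : dataCaches;
import std.stdio : writefln;

/// Hypothetical heuristic: the largest power-of-two tile edge such that two
/// square tiles of `elemSize`-byte elements fit in half of `cacheBytes`.
size_t tileFor(size_t cacheBytes, size_t elemSize)
{
    size_t t = 1;
    // Keep doubling while the next size up would still fit.
    while (2 * (2 * t) * (2 * t) * elemSize <= cacheBytes / 2)
        t *= 2;
    return t;
}

void main()
{
    const caches = dataCaches;

    foreach (level, cache; caches[])
    {
        // Levels that do not exist are reported with size == size_t.max.
        if (cache.size == size_t.max)
            continue;
        writefln("L%s data cache: %s KiB, %s-byte lines, associativity %s",
                 level + 1, cache.size, cache.lineSize, cache.associativity);
    }

    // core.cpuid reports sizes in kilobytes; guard against a missing L1 entry.
    immutable l1Bytes = caches[0].size == size_t.max ? 0 : caches[0].size * 1024;
    writefln("candidate tile edge for doubles: %s", tileFor(l1Bytes, double.sizeof));
}
```

Since the cache size is only known at run time while a template tile size is fixed at compile time, one plausible design is to instantiate a tiled loop for a handful of tile sizes and pick among them at startup based on a value like this one.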