On Tuesday, 14 April 2015 at 02:44:15 UTC, Laeeth Isharc wrote:
thanks for the links and colour, Walter and HST
But at the end of the day, the programmer has to know how to write cache-efficient code. No matter how the language/compiler tries to be smart and do the Right Thing(tm), poorly-laid-out data is poorly-laid-out data, and you're gonna incur cache misses all over the place. Cache-unfriendly algorithms are cache-unfriendly algorithms, and no smart language design / smart optimizer is gonna fix that for you. You have to know how to work with the modern cache hierarchies, how to lay out data for efficient access, and how to write cache-friendly algorithms.
While Phobos is making good progress at being allocation-free, it still has a ways to go. And it doesn't help that the current D GC isn't that great when you do have to allocate -- I've managed to get 30-40% performance improvements just by turning off the default collection schedule and triggering collections myself at more strategic intervals.
Would love to see an article sometime on efficient programming in D - both cache efficiency and how to make the GC your friend. (I get the basic idea of data-driven design, but not yet the subtleties of cache-efficient code, and I am sure many other newcomers to D must be in a similar position.)
I found the same thing as you describe with a monster CSV import (files are daily, but data needs to be organized by symbol to be useful).
Not having to box things is a big win IMO, though. Boxing of POD types in Java/C# just screams "inefficient" to me... can you imagine all that extra, needless indirection wreaking havoc on the CPU cache and cache predictions?
There was an interesting post on Lambda the Ultimate by Mike Pall (sp? the LuaJIT guy) in which he said certain design decisions in Python meant it would be much harder to ever make Python fast, and one of the PyPy guys agreed with him. (It was more than just boxing.)
I am not in favour of extrapolating trends mindlessly, but I wonder what the world looks like in five or ten years should the gap between processor performance and memory latency continue to widen at similar rates, given continued growth in data set sizes.
Laeeth.