Hello bearophile,
So I think a structure closer to reality will be the same, with units that don't share a single memory and that send messages to each other. But the units themselves will be composed of several cores that share a single memory (plus caches), and each core will have advanced SIMD instructions (see the AVX instructions, which also perform a kind of vectorized 'if').
I'm no expert, but that sounds a lot like what I know about current-generation supercomputers.
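To make the layout concrete, here is a rough toy model in Python, not a description of any real machine. Everything in it is an assumption made for illustration: the names `simd_select`, `core`, `unit`, `run` and `LANES` are hypothetical, threads stand in for both cores and units, and a queue stands in for the interconnect. The point is only the three levels: units exchange messages, cores inside a unit share that unit's memory, and each core applies a per-lane select, which is roughly what a vectorized 'if' amounts to.

```python
import queue
import threading

LANES = 8  # assumed SIMD width: 8 single-precision floats per 256-bit register


def simd_select(mask, a, b):
    """Model of a vectorized 'if': per lane, take a where mask is true, else b."""
    return [x if m else y for m, x, y in zip(mask, a, b)]


def core(shared, start):
    """One core: clamp negatives to zero in its own slice of the unit's memory."""
    chunk = shared[start:start + LANES]
    mask = [v < 0.0 for v in chunk]  # the comparison produces a lane mask
    shared[start:start + LANES] = simd_select(mask, [0.0] * len(chunk), chunk)


def unit(data, outbox):
    """One unit: private memory, shared by its cores, result sent as a message."""
    shared = list(data)  # private copy: nothing is shared with other units
    cores = [threading.Thread(target=core, args=(shared, start))
             for start in range(0, len(shared), LANES)]
    for c in cores:
        c.start()
    for c in cores:
        c.join()
    outbox.put(sum(shared))  # the only way data leaves the unit


def run(inputs):
    """Start one unit per input list and collect one message from each."""
    outbox = queue.Queue()
    units = [threading.Thread(target=unit, args=(d, outbox)) for d in inputs]
    for u in units:
        u.start()
    for u in units:
        u.join()
    return sorted(outbox.get() for _ in inputs)
```

Calling `run` with a few lists of floats returns one clamped sum per unit. Both branches of the 'if' are computed for every lane and the mask picks the result, which is why this style only pays off when both sides are cheap.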
-- ... <IXOYE><
