On Sat, Aug 14, 2010 at 10:50 AM, orthochronous <[email protected]> wrote:

> Additionally the
> NEON unit on ARM uses only the L2 cache, requiring explicitly making
> the L1 cache coherent with L2 before accessing any of the data in the
> main part of the CPU...
>

Good to know that ARM is consistent at *something*. They're developing deep
expertise at fucking up concurrency. Weak memory consistency, weak cache
coherency, and (until recently) a non-antialiased virtual cache.

Could somebody *please* form a chip company that has *architects* on staff!


> This is a reasonable design for multimedia, where most of the time the
> scalar and SIMD data-sets don't overlap..
>

No, it isn't, but that's a long discussion. What you're describing used
to be known as software scheduled [V]LIW. The ideas go back to the Yale
Bulldog machine, and without exception, every attempt to make them perform
well from compiler-generated code has failed for reasons that, in hindsight,
were completely inevitable. The biggest of those were irregularities of the
ISA w.r.t. pipeline function completion or memory coherency.

shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
