On Sat, Aug 14, 2010 at 10:50 AM, orthochronous <[email protected]> wrote:

> Additionally the
> NEON unit on ARM uses only the L2 cache, requiring explicitly making
> the L1 cache coherent with L2 before accessing any of the data in the
> main part of the CPU...

Good to know that ARM is consistent at *something*. They're developing deep
expertise at fucking up concurrency. Weak memory consistency, weak cache
coherency, and (until recently) a non-antialiased virtual cache. Could
somebody *please* form a chip company that has *architects* on staff!

> This is a reasonable design for multimedia, where most of the time the
> scalar and SIMD data-sets don't overlap.

No, it isn't, but that's a long discussion. What you're describing used to
be known as software-scheduled [V]LIW. The ideas go back to the Yale Bulldog
machine, and without exception, every attempt to make them perform well from
compiler-generated code has failed for reasons that, in hindsight, were
completely inevitable. The biggest of those were irregularities of the ISA
w.r.t. pipeline function completion or memory coherency.

shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
