On Fri, 15 Apr 2016 18:54:12 +0000, jmh530 <john.michael.h...@gmail.com> wrote:
> On Tuesday, 12 April 2016 at 10:55:18 UTC, xenon325 wrote:
>> Have you seen GCC's function multiversioning [1]?
>
> I've been thinking about the gcc multiversioning since you
> mentioned it previously.
>
> I keep thinking about how the optimal algorithm for something
> like matrix multiplication depends on the size of the matrices.
>
> For instance, you might do something for very small matrices that
> just relies on one processor, then you add in SIMD as the size
> grows, then you add in multiple CPUs, then you add in the GPU (or
> maybe you add the GPU before the CPUs), then you add in multiple
> computers.

GCC only targets one architecture at a time. As long as that is the
case, there is little point in contemplating how it would handle
multiple architectures and network traffic. :)

CPUs run the bulk of code, from booting over the kernel and drivers to
applications, and there will always be something that can be optimized
if it is statically known that a certain instruction set is supported.
To pick up your matrix example, imagine OpenGL code that works with
some 4x4 matrices that are in no direct relation to each other. The
GPU is only good at bulk processing, which doesn't apply here. So you
need the general-purpose processor, and you benefit from knowing that
some SSE level is supported.

In general, when you have to make many quick decisions on small
amounts of data, the GPU and networking are out of the question.

-- 
Marco