On Wednesday, October 15, 2014 8:59:38 AM UTC-4, Erik Schnetter wrote:
>
> Modern x86 CPUs handle floats at about twice the speed as doubles. A
> floating-point instruction usually takes one cycle, and each
> instruction can execute multiple operations due to vectorization. With
> doubles, you can have 4 operations per instruction, and with floats,
> you can have 8 operations per instruction.
>
That assumes that everything obtains optimal SIMD vectorization, which is usually false.
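
To put a rough number on that, here is a minimal sketch. It assumes 256-bit SIMD registers (which gives the 4 doubles / 8 floats per instruction quoted above) and a simple Amdahl-style cost model in which the non-vectorized part of the work costs the same for both types. The function name `float_speedup` and the model itself are illustrative assumptions, not measurements of any real CPU.

```python
# Lane counts for a 256-bit SIMD register, matching the quoted figures.
REGISTER_BITS = 256
DOUBLE_LANES = REGISTER_BITS // 64  # 4 doubles per instruction
FLOAT_LANES = REGISTER_BITS // 32   # 8 floats per instruction


def float_speedup(vectorized_fraction):
    """Estimated float-over-double speedup when only part of the work vectorizes.

    Hypothetical Amdahl-style model: scalar work costs the same for both
    types; vectorized work is divided by the lane count.
    """
    if not 0.0 <= vectorized_fraction <= 1.0:
        raise ValueError("vectorized_fraction must be in [0, 1]")
    scalar = 1.0 - vectorized_fraction
    time_double = scalar + vectorized_fraction / DOUBLE_LANES
    time_float = scalar + vectorized_fraction / FLOAT_LANES
    return time_double / time_float


# Print the estimated speedup for a few vectorized fractions.
for fraction in (1.0, 0.9, 0.5, 0.0):
    print(f"{fraction:.0%} vectorized: {float_speedup(fraction):.2f}x")
```

Under this model the 2x figure only appears when all of the work vectorizes; if half of it stays scalar, the estimate drops to roughly 1.11x, and with no vectorization there is no difference at all.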
