Modern x86 CPUs handle floats about twice as fast as doubles. A floating-point instruction can usually be issued every cycle, and thanks to vectorization each instruction can perform several operations at once: 4 per instruction with doubles, and 8 per instruction with floats. The L1 cache bandwidth is matched to this, so that CPU speed and L1 cache bandwidth both peak at the same throughput.
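To make the 4-versus-8 arithmetic concrete, here is a small sketch in Python. It assumes 256-bit vector registers (as in AVX), which is my reading of the numbers above rather than something stated outright. The function name and the constant are made up for illustration.

```python
# Sketch: how many elements fit in one vector register.
# Assumption: 256-bit registers (AVX). Other x86 vector extensions
# use other widths, so the register width is a parameter.

def lanes_per_register(register_bits: int, element_bits: int) -> int:
    """Return how many elements of the given width fit in one register."""
    if element_bits <= 0 or register_bits % element_bits != 0:
        raise ValueError("register width must be a multiple of element width")
    return register_bits // element_bits

ASSUMED_REGISTER_BITS = 256  # assumed vector width, not taken from a datasheet

double_lanes = lanes_per_register(ASSUMED_REGISTER_BITS, 64)  # 64-bit doubles
float_lanes = lanes_per_register(ASSUMED_REGISTER_BITS, 32)   # 32-bit floats

print("doubles per instruction:", double_lanes)
print("floats per instruction:", float_lanes)
print("float/double ratio:", float_lanes // double_lanes)
```

The ratio of 2 only bounds the peak rate for code that actually vectorizes; scalar code, or code limited by something other than arithmetic and L1 bandwidth, need not see it.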
-erik

On Wed, Oct 15, 2014 at 4:51 AM, Tamas Papp <[email protected]> wrote:
> Just out of curiosity, can you please post some benchmarks of Float32 vs
> Float in Julia for your algorithm when you finish what you are working
> on?
>
> My experience on modern x86 architectures is that CPU handles both at
> the approximately same speeds, and when I have big matrices the speed
> benefit of single float comes from using less memory, but that is not
> worth dealing with the subtle problems that come from loss of precision
> (in particular, it is very easy to run into conditioning problems with
> single float).
>
> Best,
>
> Tamas
>
>
> On Wed, Oct 15 2014, Hubert Soyer <[email protected]> wrote:
>
>> Sorry for coming back to this so late. I forgot to subscribe to get updates
>> via email and thought nobody had replied yet.
>> Thank you for all the comments, I think I get the point.
>>
>> I am coming from a python background and was and am still working with a
>> module called Theano that provides automatic differentiation and is used a
>> lot for neural networks.
>> This module offers an environment variable (floatX) that lets me specify
>> whether I want to use Float64 or Float32 all the way.
>>
>> My workflow would then be:
>> Prototype with Float32 to get the speed benefit.
>> When I want to have a more serious look at my results, I switch to Float64
>> to be safe.
>>
>> So I thought I'd ask this question and if it turns out that there is a
>> switch like that in Julia, I could just use it.
>> I do understand that this was just a convenient "hack" for me and it will
>> definitely work without that functionality.
>> But in case it existed but just wasn't documented, I thought I'd ask.
>>
>> Thank you a lot for your comments, Pontus' solution seems to cover my use
>> case just fine, I think I'll go with that.
>>
>> Best,
>>
>> Hubert
>

--
Erik Schnetter <[email protected]>
http://www.perimeterinstitute.ca/personal/eschnetter/
