Modern x86 CPUs handle floats about twice as fast as doubles. A
floating-point instruction can usually be issued once per cycle, and
each instruction can perform multiple operations thanks to
vectorization. With 256-bit AVX registers, that is 4 operations per
instruction for doubles and 8 for floats. The L1 cache bandwidth is
matched to this, so that both CPU speed and L1 cache bandwidth peak
out at the same throughput.
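
To make the arithmetic concrete, here is a rough sketch in Python. It
assumes 256-bit vector registers (as in AVX); the names REGISTER_BITS,
lanes and bytes_per_instruction are made up for illustration. It only
counts elements per instruction and does not measure anything.

```python
# Assumption: 256-bit SIMD registers (e.g. AVX). Other widths give
# other lane counts, but the float/double ratio stays 2:1.
REGISTER_BITS = 256

# Element widths in bits for IEEE single and double precision.
ELEMENT_BITS = {"float32": 32, "float64": 64}


def lanes(dtype, register_bits=REGISTER_BITS):
    """Number of elements one vector instruction operates on."""
    return register_bits // ELEMENT_BITS[dtype]


def bytes_per_instruction(dtype, register_bits=REGISTER_BITS):
    """Bytes of operand data one vector instruction touches."""
    return lanes(dtype, register_bits) * ELEMENT_BITS[dtype] // 8


for dtype in ELEMENT_BITS:
    print(dtype, lanes(dtype), "lanes,",
          bytes_per_instruction(dtype), "bytes per instruction")
```

Both types move the same number of bytes per instruction, so the same
cache bandwidth feeds twice as many floats as doubles.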

-erik

On Wed, Oct 15, 2014 at 4:51 AM, Tamas Papp <[email protected]> wrote:
> Just out of curiosity, can you please post some benchmarks of Float32 vs
> Float64 in Julia for your algorithm when you finish what you are working
> on?
>
> My experience on modern x86 architectures is that the CPU handles both at
> approximately the same speed, and when I have big matrices the speed
> benefit of single float comes from using less memory, but that is not
> worth dealing with the subtle problems that come from loss of precision
> (in particular, it is very easy to run into conditioning problems with
> single float).
>
> Best,
>
> Tamas
>
>
> On Wed, Oct 15 2014, Hubert Soyer <[email protected]> wrote:
>
>> Sorry for coming back to this so late. I forgot to subscribe to get updates
>> via email and thought nobody had replied yet.
>> Thank you for all the comments, I think I get the point.
>>
>> I am coming from a python background and was and am still working with a
>> module called Theano that provides automatic differentiation and is used a
>> lot for neural networks.
>> This module offers an environment variable (floatX) that lets me specify
>> whether I want to use Float64 or Float32 all the way.
>>
>> My workflow would then be:
>> Prototype with Float32 to get the speed benefit.
>> When I want to have a more serious look at my results, I switch to Float64
>> to be safe.
>>
>> So I thought I'd ask this question and if it turns out that there is a
>> switch like that in Julia, I could just use it.
>> I do understand that this was just a convenient "hack" for me and it will
>> definitely work without that functionality.
>> But in case it existed but just wasn't documented, I thought I'd ask.
>>
>> Thank you a lot for your comments, Pontus' solution seems to cover my use
>> case just fine, I think I'll go with that.
>>
>> Best,
>>
>> Hubert
>



-- 
Erik Schnetter <[email protected]>
http://www.perimeterinstitute.ca/personal/eschnetter/
