Hi Robert, On Sonntag, 17. Dezember 2017 10:03:22 CET Robert Helling wrote: > Hi, > > > On 16. Dec 2017, at 20:29, Lubomir I. Ivanov <[email protected]> wrote: > > > > so yes, "float" is like a non-default sized type e.g. "short", yet > > one's variable might fit in a "short" and it's perfectly fine to use > > the non-default type. > > while nowadays we don't have much of a size restrain using the smaller > > type, it could be a good indication that we don't actually need the > > whole 64bit precision for this variable. > > the same way one could use a "char" vs an "int". > > > > on the other hand, if this causes more trouble than benefits, it's > > probably best to just convert all the calculations to be with > > "doubles", have consistency and silence the -Wall warnings. > > from my very limited understanding: RAM is indeed big enough, so there is no > size constraint coming from there. But what is actually finite is CPU cache > size. So when the doubling in memory footprint makes the difference for > some calculation to fit in the cache vs the CPU has to fetch the data from > RAM several times the performance will hugely differ. Of course I have no > idea at all if this is any concern of us (not having done any significant > performance tuning ever). Plus there are these vector instructions where > the CPU can do a number of floating point instructions on different data in > parallel: There the number of things done at once is twice for float than > for double (since twice as many floats fit in the CPU). Again no idea at > all if the compiler generates this parallelism for us at any critical > point.
This is of course all correct, but none of the proposed changes are anywhere close to in a tight loop. In some cases it's only a local variable which is not even spilled to RAM. I tried to look at the assembler output and the first thing I noted is that we don't compile with -O2? Is that on purpose? Anyway, with -O3, the first case in plannernotes.h produces better code without casting to float. You could of course get the same result by replacing 1000.0 with 1000.0f, but this seems rather obscure. I removed two cases concerning binary representation from the patch and made a PR. Perhaps we get Linus to have a look at it. Berthold _______________________________________________ subsurface mailing list [email protected] http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface
