On Fri, 19 Dec 2008 03:48:44 -0800, Bill Baxter <[email protected]> wrote:
On Fri, Dec 19, 2008 at 8:16 PM, Sergey Gromov <[email protected]>
wrote:
Fri, 19 Dec 2008 03:56:37 +0000 (UTC), dsimcha wrote:
Does anyone know of a decent guide that has information on what types
of
optimizations compilers typically perform and what they aren't capable
of
performing automatically? I figure it would be useful to know
something like
this because, when micro-optimizing performance-critical code, it's
silly to
do a low-level optimization that makes the code less readable if the
compiler's already probably doing it. On the other hand, it would be
nice to
know exactly what optimizations (besides the obvious stuff like
high-level
design and algorithm optimizations) the compiler can't possibly be
sufficiently smart to do, so I can spend time looking for
opportunities to do
those.
I'm afraid that your only option is to look at the assembly output and
then trick the compiler into making it better. Very subtle changes
make difference sometimes and I doubt you can pick this knowledge from
any manual.
Sure but there are some things that aren't so subtle that the
optimizer probably gets right all the time, or some that it gets right
none of the time.
For instance if I need to divide more than one number by the same
thing I usually do
auto invx = 1.0/x;
first *= invx;
second *= invx;
It would be nice to know that I was wasting my time and could just
write first/=invx; second/=invx and it would be optimized to one
divide.
I haven't seen I compiler yet to do that optimization and DMD 1.038
doesn't:
foreach(ref val; array) val /= x;
is ten times slower than
foreach(ref val; array) val *= invx;
on a 100,000 element array.
Another thing I do a lot is worry about whether to introduce a new
variable just to make the naming of a variable more accurate. Like
doing this:
float sum = 0; foreach(x; nums) sum+=x; float avg = sum /
nums.length;
Instead of the slightly less explicit, but potentially less stack using:
float avg = 0; foreach(x; nums) avg+=x; avg/=nums.length;
I bet the compiler is smart enough to eliminate 'avg' if I don't use
'sum' afterwards, but I'm not sure.
Most compilers do this with single variables, but if avg was a struct then
the later is better than the former on at least one compiler I used (nvcc,
an Open64 derivative) and the difference is important on GPUs at least.
Anyway I don't think it makes
enough difference to my program's speed to be worth investigating.
But still it would be nice to know whether or not *in general* that
little micro-optimization is useless or not.
--bb
My general rule of thumb is than the optimizer is pretty good within a
single code block and less so across blocks. And there's really no
substitute for profiling and testing the user-optimization to see if it's
actually worthwhile.