The more you optimize, the better the odds you slow your program down. It is one of the paradoxes of engineering: optimization adds instructions and often data. Over time, then, the code and data you add by "optimizing" increase cache pressure and slow the whole thing down.
C++ compilers inline a lot because microbenchmarks improve, but inline every modest function in a big program and you make the binary much bigger and blow the i-cache. -rob
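
To make the inlining point concrete, here is a minimal sketch, not a benchmark. It assumes GCC or Clang, whose `always_inline` and `noinline` attributes are used below; the helper names, the call sites, and the mixing constant are all hypothetical and chosen only for illustration. The same modest function is written twice: one copy is forced into every caller, the other stays a single shared body that callers reach with a call instruction.

```cpp
#include <cstdint>

// Assumption: GCC/Clang attribute syntax. Other compilers fall back to
// plain `inline` / no attribute, which only hints at the behavior.
#if defined(__GNUC__) || defined(__clang__)
#define FORCE_INLINE inline __attribute__((always_inline))
#define NO_INLINE __attribute__((noinline))
#else
#define FORCE_INLINE inline
#define NO_INLINE
#endif

// Hypothetical "modest" helper: a few shifts and one multiply.
// Forced inline: its body is copied into every call site.
FORCE_INLINE std::uint32_t mix_inlined(std::uint32_t x) {
    x ^= x >> 16;
    x *= 0x45d9f3bu;  // arbitrary odd constant, illustration only
    x ^= x >> 16;
    return x;
}

// Same logic, but kept as one shared body in the binary.
NO_INLINE std::uint32_t mix_shared(std::uint32_t x) {
    x ^= x >> 16;
    x *= 0x45d9f3bu;
    x ^= x >> 16;
    return x;
}

// Hypothetical call sites. Each use of mix_inlined adds another copy of
// the body to the caller; each use of mix_shared adds only a call.
std::uint32_t site_a(std::uint32_t v) { return mix_inlined(v) + 1u; }
std::uint32_t site_b(std::uint32_t v) { return mix_inlined(v) ^ 7u; }
std::uint32_t site_c(std::uint32_t v) { return mix_shared(v) + 1u; }
std::uint32_t site_d(std::uint32_t v) { return mix_shared(v) ^ 7u; }
```

With three instructions' worth of work the difference is negligible, and the inlined version may well win a microbenchmark. The concern above is about what happens when the same choice is repeated across thousands of call sites in a large program; comparing the compiled size of the callers in each style is one way to see how the duplication adds up.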
