Neil Hodgson wrote:
> Robert Roessler:
>> For me, *every* [MS] compiler upgrade in the past has seen REDUCTIONS
>> in code size as the code generation has gotten smarter (v4 > v5 > v6 >
>> vs2003 < vs2005). Grrr. This is with the same options (primarily
>> Lua, static linking, smallest code size and limited inlining... and
>> [formerly] PII/PIII code generation).
>>
>> The two biggest obvious "official" changes are the new "safe"
>> libraries, which of course are larger - not unreasonable. A trivial
>> app which uses printf and atoi went from 28.5 KB to 47 KB - again,
>> statically linked. So we get an idea that using some of the most
>> common portions of the CRT will add an extra ~20 KB.
>>
>> BUT an 83 KB increase on a 458 KB app???
> Bytes aren't as precious as they used to be, especially on disk.
> This may or may not translate into a change in in-memory size,
> although since it's a proportional change rather than just an
> addition, it probably does carry over into memory. Safety is one of
> the better reasons for allowing some expansion, but there may also be
> some more default inlining of functions or loop unrolling, which can
> increase bulk in pursuit of speed.
So vs2005 is what you use for builds of Scintilla/SciTE, and your
results are consistent with mine? It is tempting to take the position
that SciTE's bloat under the new order of things is no worse than any
other app...
... BUT bloat, especially of this magnitude, still can cause major
performance grief - because of extra paging. ONE [formerly unneeded]
pagein can sure wipe out a lot of "gain" from some unrolled loops (or
whatever is causing an overall [non-library] increase in size)... :(
>> The other change (relating to code generation/size) is that the
>> intended target architecture model can no longer be specified... this
>> may be particularly unfortunate, since my own timing tests on my Core
>> 2 Duo were showing [as expected] that the older PII/PIII model was a
>> better match performance-wise than the P4/Athlon models.
> You do get to choose whether to use SSE or SSE2. I expect it just
> became too much work to tailor optimization for multiple archaic
> chips.
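[For reference, the switches being discussed - from memory, so check
the cl.exe docs for your exact version:]

```shell
REM VC6/VS2003 could bias code generation toward a CPU family:
REM   /G5 (Pentium)   /G6 (PPro/PII/PIII)   /G7 (P4/Athlon)   /GB (blend)
REM VS2005 drops those; the only architecture knob left is the
REM vector instruction set:
cl /O1 /MT /arch:SSE  app.c    REM assume SSE (PIII and later)
cl /O1 /MT /arch:SSE2 app.c    REM assume SSE2 (P4/Athlon 64 and later)
cl /O1 /MT app.c               REM plain x86, no vector ISA assumed
```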
Of course you can select the desired vector instruction set (or none)
- but the good old G6 and friends options are gone. Archaic? To
anticipate the next round(s) of this discussion, I would say
"scheduling for max overlap between memory accesses and execution
still matters!" and you would say "but we have monster smart caches
now" and then we would both say "the gains from optimal scheduling
pale against the losses from cache misses anyway". Sigh. I suppose
you're right. ;)
Robert Roessler
[EMAIL PROTECTED]
http://www.rftp.com
_______________________________________________
Scite-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scite-interest