The following is the O2 size data from SPEC2k. Note that with push/pop,
it is always a net win (negative delta) in terms of total binary or
total loadable section size; only mcf is unchanged.
thanks,
David
                  .text   .eh_frame   Total_binary
vortex-move      440252       40796         584066
vortex-push      415436       57452         575906
delta             -5.6%       40.8%        -1.397%

twolf-move       169324       10748         223521
twolf-push       168876       11124         223449
delta             -0.3%        3.5%        -0.032%

gzip-move         30668        3652         374399
gzip-push         30524        3740         374343
delta             -0.5%        2.4%        -0.015%

bzip2-move        22748        3196         111616
bzip2-push        22636        3284         111592
delta             -0.5%        2.8%        -0.022%

vpr-move         104684        9380         147378
vpr-push         104236        9788         147338
delta             -0.4%        4.3%        -0.027%

mcf-move           8444        1244          26760
mcf-push           8444        1244          26760
delta              0.0%        0.0%         0.000%

cc1-move        1093964       90772        1576994
cc1-push        1078988      104068        1575314
delta             -1.4%       14.6%        -0.107%

crafty-move      130556        5508        1256037
crafty-push      130236        5772        1255981
delta             -0.2%        4.8%        -0.004%

eon-move         333660       33220         516491
eon-push         330140       35812         515555
delta             -1.1%        7.8%        -0.181%

gap-move         404092       46732        1457735
gap-push         396012       53180        1456103
delta             -2.0%       13.8%        -0.112%

perlbmk-move     456572       45324         618585
perlbmk-push     449516       52340         618545
delta             -1.5%       15.5%        -0.006%

parser-move       81244       15788         334003
parser-push       80684       16332         333987
delta             -0.7%        3.4%        -0.005%
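
For anyone who wants to re-check the delta rows, here is a minimal sketch
in Python. It assumes the deltas are computed as (push - move) / move and
that .text plus .eh_frame stand in for the loadable sections; the names
SIZES, pct_delta and summarize are made up for illustration, and only
three of the benchmarks from the table above are copied in.

```python
# Sizes copied from the table above: (.text, .eh_frame, Total_binary).
# Only a subset of benchmarks is included; the dictionary layout is an
# illustrative assumption, not part of the original measurement scripts.
SIZES = {
    "vortex": {"move": (440252, 40796, 584066), "push": (415436, 57452, 575906)},
    "cc1": {"move": (1093964, 90772, 1576994), "push": (1078988, 104068, 1575314)},
    "mcf": {"move": (8444, 1244, 26760), "push": (8444, 1244, 26760)},
}


def pct_delta(move, push):
    """Relative change from the move variant to the push variant, in percent."""
    return 100.0 * (push - move) / move


def summarize(name):
    """Return (.text %, .eh_frame %, total %, net change of .text + .eh_frame)."""
    move, push = SIZES[name]["move"], SIZES[name]["push"]
    text, eh_frame, total = (pct_delta(m, p) for m, p in zip(move, push))
    # Net change across the two sections listed in the table.
    loadable = (push[0] + push[1]) - (move[0] + move[1])
    return text, eh_frame, total, loadable


for name in SIZES:
    text, eh_frame, total, loadable = summarize(name)
    print(f"{name:8s} .text {text:+.1f}%  .eh_frame {eh_frame:+.1f}%  "
          f"total {total:+.3f}%  .text+.eh_frame {loadable:+d}")
```

The last column is the point of the exercise: the .eh_frame growth is
smaller in absolute terms than the .text shrinkage, so the sum of the two
still goes down (or stays flat for mcf).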
On Tue, Dec 11, 2012 at 9:14 AM, Xinliang David Li <[email protected]> wrote:
> On Tue, Dec 11, 2012 at 1:49 AM, Richard Biener
> <[email protected]> wrote:
>> On Mon, Dec 10, 2012 at 10:07 PM, Mike Stump <[email protected]> wrote:
>>> On Dec 10, 2012, at 12:42 PM, Xinliang David Li <[email protected]> wrote:
>>>> I have not measured the CFI size impact -- but conceivably it should
>>>> be larger -- which is unfortunate.
>>>
>>> Code speed and size are preferable to optimizing dwarf size… :-) I'd let
>>> dwarf 5 fix it!
>>
>> Well, unlike debug info, CFI data has to be in memory to make
>> unwinding work.
>> These days most Linux distributions enable asynchronous unwind tables, so any
>> size savings due to shorter push/pop prologue/epilogue sequences have to be
>> offset against the increase in CFI data. I'm not sure there is really a
>> speed difference between both variants (well, maybe due to better icache
>> footprint of the push/pop variant).
>
> Yes, for large applications, this can be crucial to performance.
>
>>
>> That said - I'd prefer to have more data on this before making the switch for
>> the generic model. What was your original motivation? Just "theory" or was
>> it a real case?
>
> 1) some of the very large internal apps I measured benefit from this
> change (in terms of performance)
> 2) both ICC and LLVM do the same.
>
> I have already committed the patch. I will find some time to collect
> more size data and post it later.
>
> thanks,
>
> David
>
>
>>
>> Thanks,
>> Richard.