+++ Matt Sealey [2010-07-15 16:39 -0500]:
> Hi Paul
>
> Please understand we know what we're talking about here :D

And so does Paul :-)

> In summary:
>
> soft
> * FPU is all emulated. FPU work is done in integer registers.
>
> softfp+vfp
> * actual FPU used, FPU argument passing done in integer registers due
>   to the soft/softfp EABI spec. your 10x speedup is here and comes from
>   using the FPU instead of emulating it
> * You can use NEON here but you still are limited to passing float
>   arguments in integer registers per the ABI
> * Each register transfer from integer to float register costs about 20 cycles
> * Boost in performance from using the FPU or NEON instead of emulation
> * Hidden performance penalty from the register transfers
> * Compatible with the above - soft and softfp code can be mixed
>
>
> hard+vfp
> * actual FPU is used in the same way
> * actual FPU code does not run faster
> * Boost in performance from using the FPU or NEON is the same
> * No hidden performance penalty
> * Completely incompatible ABI with the two above - no code mixing.
>
>
> That is what we're proposing.

Thanks for that clear and concise summary.

> This, coupled with the benefits of
> compiling for an improved ISA (ARMv7-A instead of ARMv4)

armel (Debian) is actually v4t. v4 was not supported (too hard, only
Strongarm thus disenfranchised)

> I am fairly sure (oh you did!) find a contrived benchmark to show that
> some code is faster on softfp in some cases, but taking a holistic
> approach I find it hard to believe that every time a floating point
> function is called across any of 20,000 packages possibly running on a
> system in a Debian port, that you will be able to benchmark a
> softfp+vfp system running faster than a hard+vfp one,

This remains a crucial question. If Paul is right then maybe it doesn't
actually make as much difference as you think. If we have both of these
builds then it shouldn't be too hard to measure their relative
performances.
Ubuntu's existing armel flavour (softfp+vfp+thumb2 (v6.5), I think) is
close to the necessary direct comparison with your hardfp+vfp+ARM port.
The ARM/thumb2 thing clouds the waters somewhat, and a genuinely
equivalent comparison would be good. (I was under the impression that
thumb2 was usually faster in practice - you may want to build for that
in fact? Although I note the PB is not convinced on that point.)

> Anyway I think everyone is agreed on that it should be done, just not the
> name..

Well, right at the beginning of this discussion a number of people said
'I'd like to see the numbers'. That remains true, and the results will
help determine what flavours are worth maintaining in the long term. But
of course in order to do that someone has to build the necessary
flavours/ports, so yes please, go for it. We await results with bated
breath.

We need some good benchmarks too.
https://blueprints.launchpad.net/ubuntu/+spec/arm-m-ui-and-test-heads
is relevant to that I guess.

Wookey
-- 
Principal hats:  Linaro, Emdebian, Wookware, Balloonboard, ARM
http://wookware.org/
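
A rough way to see why the softfp penalty could matter a lot or hardly at
all is to put numbers on it. The sketch below is only a back-of-envelope
model, not a measurement: it takes the "about 20 cycles per transfer"
figure quoted above at face value, and the function name, the
transfers-per-call count and the work-per-call values are all hypothetical
parameters chosen for illustration.

```python
def softfp_overhead(work_cycles, transfers_per_call, cycles_per_transfer=20):
    """Fraction of per-call time spent moving values between integer and
    float registers, under a simple additive cost model (an assumption).

    cycles_per_transfer defaults to the ~20 cycle figure quoted in the
    thread; it is not a measured value.
    """
    transfer_cycles = transfers_per_call * cycles_per_transfer
    return transfer_cycles / (work_cycles + transfer_cycles)


if __name__ == "__main__":
    # Hypothetical workloads: from tiny leaf functions to heavy ones.
    for work in (10, 100, 1000, 10000):
        frac = softfp_overhead(work, transfers_per_call=2)
        print(f"{work:>6} cycles of work per call: {frac:.1%} spent on transfers")
```

Under this model the penalty dominates for small, frequently called float
functions and fades for functions that do a lot of work per call, which is
why real benchmarks of both builds are still needed to settle the question.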

