I've tried to measure the difference with a simple C program that simulates both behaviors. I'm no compiler writer and I didn't disassemble the compiled code, so I'm not sure that this really proves anything, but in case somebody is interested, here are my results:
- Both test cases iterate over the numbers from 2 to 100000, looking for primes with a deliberately naive algorithm.
- Neither test case keeps temporary variables on the stack; both use a malloc'd structure instead.
- Test A accesses the struct's fields directly.
- Test B accesses them indirectly, by adding to the struct's pointer a static global variable that holds the offset of the field.
- Both tests were run with no compiler optimization (-O0) and with maximum compiler optimization (-O3 -fomit-frame-pointer).

The results are:

Machine 1: Intel Celeron M @ 1.5GHz, x86-32, GCC 4.1.3 20070929 (prerelease)
  Test A (unoptimized): 5466612us
  Test B (unoptimized): 11096380us
  Slowdown: 102.9800%
  Test A (optimized): 5280693us
  Test B (optimized): 5704486us
  Slowdown: 8.0200%

Machine 2: Intel Pentium 4 @ 1.7GHz, x86-32, GCC 4.1.2 20061115 (prerelease)
  Test A (unoptimized): 11228890us
  Test B (unoptimized): 17972084us
  Slowdown: 60.0500%
  Test A (optimized): 15903029us
  Test B (optimized): 16032732us
  Slowdown: 0.8100%

Machine 3: Intel Pentium Dual-Core E2180 @ 2.0GHz, x86-64, GCC 4.2.3
  Test A (unoptimized): 3646244us
  Test B (unoptimized): 5469573us
  Slowdown: 50.0000%
  Test A (optimized): 3680493us
  Test B (optimized): 3630129us
  Slowdown: -1.3700%

Machine 4: Intel Core 2 Duo E6600 @ 2.4GHz, x86-64, GCC 4.2.3
  Test A (unoptimized): 3066969us
  Test B (unoptimized): 4653554us
  Slowdown: 51.7300%
  Test A (optimized): 3132884us
  Test B (optimized): 3070829us
  Slowdown: -1.9900%

I've attached a tarball holding the program and test script. I hope it's of some use.

--
Saso

Richard Frith-Macdonald wrote:
>
> On 31 May 2008, at 16:21, David Chisnall wrote:
>>
>> The advantages of this would be:
>>
>> - No code using GNUstep or other frameworks compiled with clang/LLVM
>> (which we are almost in a position to do) would break if it inherited
>> from a class whose layout changed.
>>
>> - No ABI breakage would be needed - code compiled with GCC would
>> still work on the modified runtime, although the existing constraints
>> on modification would still apply.
>>
>> The disadvantages are:
>>
>> - Currently ivar accesses on most platforms will be a single load /
>> store instruction in an indirect addressing mode with a constant
>> offset embedded in the instruction. This would add another load and
>> addition to every ivar access.
>>
>> - The extra work that the runtime would do would increase load times
>> slightly.
>>
>> So, my question is, is this worth doing?
>
> IMO ... yes. It's a good feature to have, and the overheads get more
> insignificant as processor speeds increase.
>
>
> _______________________________________________
> Discuss-gnustep mailing list
> [email protected]
> http://lists.gnu.org/mailman/listinfo/discuss-gnustep
test.tar.gz
Description: GNU Zip compressed data
