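Here are results under 64-bit Linux using gcc-4.3 (which turns on the various SSE flags by default). Note that -O3 is significantly better than -O2 for the "simple" calls: in the float benchmark they take about a third less time for problem sizes up to 100000 and about 6% less beyond that, which makes them as fast as or faster than the intrinsic and inline versions at most sizes: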

nimrod:~$ cat /proc/cpuinfo | grep "model name" | head -1
model name : Intel(R) Xeon(R) CPU E5450 @ 3.00GHz

nimrod:~$ gcc-4.3 --version
gcc-4.3 (Debian 4.3.0-1) 4.3.1 20080309 (prerelease)

nimrod:~$ gcc-4.3 -O2 vec_bench.c -o vec_bench
nimrod:~$ ./vec_bench
Testing methods...
All OK

Problem size          Simple                 Intrin                 Inline
         100    0.0001ms (100.0%)     0.0001ms ( 70.8%)     0.0001ms ( 74.3%)
        1000    0.0008ms (100.0%)     0.0006ms ( 70.3%)     0.0007ms ( 80.3%)
       10000    0.0085ms (100.0%)     0.0061ms ( 72.0%)     0.0067ms ( 78.8%)
      100000    0.0882ms (100.0%)     0.0627ms ( 71.1%)     0.0677ms ( 76.7%)
     1000000    3.6748ms (100.0%)     3.3312ms ( 90.7%)     3.3139ms ( 90.2%)
    10000000   37.1154ms (100.0%)    35.9762ms ( 96.9%)    36.1126ms ( 97.3%)

nimrod:~$ gcc-4.3 -O3 vec_bench.c -o vec_bench
nimrod:~$ ./vec_bench
Testing methods...
All OK

Problem size          Simple                 Intrin                 Inline
         100    0.0001ms (100.0%)     0.0001ms (111.1%)     0.0001ms (116.7%)
        1000    0.0005ms (100.0%)     0.0006ms (111.3%)     0.0007ms (126.8%)
       10000    0.0056ms (100.0%)     0.0061ms (108.6%)     0.0067ms (118.9%)
      100000    0.0581ms (100.0%)     0.0626ms (107.8%)     0.0677ms (116.5%)
     1000000    3.4549ms (100.0%)     3.3339ms ( 96.5%)     3.3255ms ( 96.3%)
    10000000   34.8186ms (100.0%)    35.9767ms (103.3%)    36.1099ms (103.7%)

nimrod:~$ ./vec_bench_dbl
Testing methods...
All OK

Problem size          Simple                 Intrin
         100    0.0001ms (100.0%)     0.0001ms (132.5%)
        1000    0.0009ms (100.0%)     0.0012ms (134.5%)
       10000    0.0119ms (100.0%)     0.0124ms (104.1%)
      100000    0.1226ms (100.0%)     0.1276ms (104.1%)
     1000000    7.0047ms (100.0%)     6.6654ms ( 95.2%)
    10000000   70.0060ms (100.0%)    71.9692ms (102.8%)

nimrod:~$ gcc-4.3 -O3 vec_bench_dbl.c -o vec_bench_dbl
nimrod:~$ ./vec_bench_dbl
Testing methods...
All OK

Problem size          Simple                 Intrin
         100    0.0001ms (100.0%)     0.0002ms (289.8%)
        1000    0.0007ms (100.0%)     0.0012ms (172.7%)
       10000    0.0114ms (100.0%)     0.0124ms (109.4%)
      100000    0.1159ms (100.0%)     0.1278ms (110.3%)
     1000000    6.9252ms (100.0%)     6.6585ms ( 96.1%)
    10000000   69.1913ms (100.0%)    71.9664ms (104.0%)

On Sat, Mar 22, 2008 at 06:41:31PM -0600, Charles R Harris wrote:
> On Sat, Mar 22, 2008 at 6:34 PM, Charles R Harris <[EMAIL PROTECTED]>
> wrote:
>
> I've attached a double version. Compile with
> gcc -msse2 -mfpmath=sse -O2 vec_bench_dbl.c -o vec_bench_dbl
>
> Chuck
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

--
Scott M. Ransom            Address:  NRAO
Phone:  (434) 296-0320               520 Edgemont Rd.
email:  [EMAIL PROTECTED]             Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989
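
P.S. For anyone reading without the attachments, here is a rough sketch of the kind of code the "Simple" and "Intrin" columns compare. The benchmark source is not reproduced in this message, so the operation (an element-wise add of two float arrays) and the names vec_add_simple and vec_add_intrin are assumptions for illustration, not the contents of vec_bench.c. GCC 4.3 enables -ftree-vectorize at -O3, which is presumably why the plain loop catches up with the hand-written SSE version at that level.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics; x86 / x86-64 only */

/* Hypothetical "Simple" variant: a plain scalar loop.
 * At -O3, gcc may auto-vectorize this on its own. */
void vec_add_simple(const float *a, const float *b, float *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Hypothetical "Intrin" variant: processes four floats per iteration
 * with SSE intrinsics. Unaligned loads/stores are used so the caller
 * does not need 16-byte-aligned buffers. */
void vec_add_intrin(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    /* Scalar tail for the remaining n % 4 elements. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Both functions should produce identical output for the same inputs, so timing one against the other at -O2 and at -O3 gives a comparison of the same shape as the tables above.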