Thomas Grill wrote:
> Hi,
> here's my results:
>
> Intel Core 2 Duo, 2.16GHz, 667MHz bus, 4MB Cache
> running under OSX 10.5.2
>
> please note that the auto-vectorizer of gcc-4.3 is doing really well....
>
> gr~~~
>
> ---------------------
>
> gcc version 4.0.1 (Apple Inc. build 5465)
>
> xbook-2:temp thomas$ gcc -msse -O2 vec_bench.c -o vec_bench
> xbook-2:temp thomas$ ./vec_bench
> Testing methods...
> All OK
>
> Problem size  Simple              Intrin              Inline
>          100   0.0002ms (100.0%)   0.0001ms ( 83.2%)   0.0001ms ( 85.1%)
>         1000   0.0014ms (100.0%)   0.0014ms ( 99.5%)   0.0014ms ( 97.6%)
>        10000   0.0180ms (100.0%)   0.0137ms ( 76.1%)   0.0103ms ( 56.9%)
>       100000   0.1307ms (100.0%)   0.1153ms ( 88.2%)   0.0952ms ( 72.8%)
>      1000000   4.0309ms (100.0%)   4.1641ms (103.3%)   4.0129ms ( 99.6%)
>     10000000  43.2557ms (100.0%)  43.5919ms (100.8%)  42.6391ms ( 98.6%)
>
>
> gcc version 4.3.0 20080125 (experimental) (GCC)
>
> xbook-2:temp thomas$ gcc-4.3 -msse -O2 vec_bench.c -o vec_bench
> xbook-2:temp thomas$ ./vec_bench
> Testing methods...
> All OK
>
> Problem size  Simple              Intrin              Inline
>          100   0.0002ms (100.0%)   0.0001ms ( 77.4%)   0.0001ms ( 72.0%)
>         1000   0.0017ms (100.0%)   0.0014ms ( 84.4%)   0.0014ms ( 79.4%)
>        10000   0.0173ms (100.0%)   0.0148ms ( 85.4%)   0.0104ms ( 59.9%)
>       100000   0.1276ms (100.0%)   0.1243ms ( 97.4%)   0.0952ms ( 74.6%)
>      1000000   4.0466ms (100.0%)   4.1168ms (101.7%)   4.0348ms ( 99.7%)
>     10000000  43.1842ms (100.0%)  43.2989ms (100.3%)  44.2171ms (102.4%)
>
> xbook-2:temp thomas$ gcc-4.3 -msse -O2 -ftree-vectorize vec_bench.c -o vec_bench
> xbook-2:temp thomas$ ./vec_bench
> Testing methods...
> All OK
>
> Problem size  Simple              Intrin              Inline
>          100   0.0001ms (100.0%)   0.0001ms (126.6%)   0.0001ms (120.3%)
>         1000   0.0011ms (100.0%)   0.0014ms (136.3%)   0.0014ms (127.9%)
>        10000   0.0144ms (100.0%)   0.0153ms (106.3%)   0.0103ms ( 72.0%)
>       100000   0.1027ms (100.0%)   0.1243ms (121.0%)   0.0953ms ( 92.8%)
>      1000000   3.9691ms (100.0%)   4.1197ms (103.8%)   4.0252ms (101.4%)
>     10000000  42.1922ms (100.0%)  43.6721ms (103.5%)  43.4035ms (102.9%)

gcc version 4.3.0 20080307 (Red Hat 4.3.0-2) (GCC)

gcc -msse -O2 -ftree-vectorize vec_bench.c -o vec_bench
mock-chroot> ./vec_bench
Testing methods...
All OK
Problem size  Simple              Intrin              Inline
         100   0.0001ms (100.0%)   0.0001ms (141.6%)   0.0001ms (108.0%)
        1000   0.0008ms (100.0%)   0.0011ms (149.9%)   0.0008ms (100.4%)
       10000   0.0135ms (100.0%)   0.0197ms (145.8%)   0.0133ms ( 98.8%)
      100000   0.6415ms (100.0%)   0.4918ms ( 76.7%)   0.5052ms ( 78.8%)
     1000000   7.5364ms (100.0%)   7.9987ms (106.1%)   7.4832ms ( 99.3%)
    10000000  76.3927ms (100.0%)  76.8933ms (100.7%)  75.1002ms ( 98.3%)

model name : AMD Athlon(tm) 64 Processor 3200+
stepping   : 10
cpu MHz    : 2000.068
cache size : 1024 KB

Now same, but with

gcc --version
gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)

Testing methods...
All OK

Problem size  Simple              Intrin              Inline
         100   0.0002ms (100.0%)   0.0001ms ( 77.2%)   0.0001ms ( 58.7%)
        1000   0.0015ms (100.0%)   0.0011ms ( 73.5%)   0.0008ms ( 52.6%)
       10000   0.0214ms (100.0%)   0.0195ms ( 90.9%)   0.0363ms (169.3%)
      100000   0.6620ms (100.0%)   0.5614ms ( 84.8%)   0.5527ms ( 83.5%)
     1000000   7.5975ms (100.0%)   7.3826ms ( 97.2%)   7.3380ms ( 96.6%)
    10000000  75.8361ms (100.0%)  84.0476ms (110.8%)  77.2884ms (101.9%)

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion