Thank you, Tony. I just compiled with -m32 and, yes, it made a difference: from 8m40s to 2m51s on that box. Is there any way to apply this globally, so that every call to qsort() uses the 32-bit code?
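On the open question in the quoted message (whether the time goes into qsort() itself or into srt_f()), one way to narrow it down might be to count how often the comparator is called. Below is a minimal sketch, not a definitive diagnosis. The names sort_keys, cmp_calls, counting_cmp and sort_and_count are hypothetical and do not appear in the original program; the comparator is assumed to order indices the same way srt_f() does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helpers, not part of the quoted sorttest.c. */
static const double *sort_keys;   /* keys that the index array is ordered by */
static unsigned long cmp_calls;   /* comparator invocations in the last sort */

/* Orders two indices by their key, like srt_f(), and counts each call. */
static int counting_cmp(const void *a, const void *b)
{
    double ka = sort_keys[*(const int *)a];
    double kb = sort_keys[*(const int *)b];

    ++cmp_calls;
    return (ka > kb) - (ka < kb);   /* -1, 0 or 1 */
}

/* Sorts idx[0..n-1] by keys[idx[i]]; returns the number of comparisons made. */
unsigned long sort_and_count(int *idx, size_t n, const double *keys)
{
    assert(idx != NULL && keys != NULL);

    sort_keys = keys;
    cmp_calls = 0;
    qsort(idx, n, sizeof idx[0], counting_cmp);
    return cmp_calls;
}
```

If the count is about the same in the 32-bit and 64-bit builds, timing that many comparator calls in a plain loop would give a rough idea of how much of the sort time belongs to the comparator rather than to qsort() itself.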

On Feb 19, 2008 11:25 AM, Tony Schreiner <[EMAIL PROTECTED]> wrote:
> In my case, it seems to be a combination of RHEL/CentOS and 64-bit
> Pentium 4 that is much slower.
>
> Even compiling the program with -m32 on the 64-bit CentOS improves
> matters greatly.
>
> I'm comparing
>
> 32-bit 1.26 GHz Pentium III, CentOS 5
> 64-bit 2.8 GHz Pentium 4, CentOS 4
> 64-bit 2.8 GHz Pentium 4, CentOS 5
>
> 32-bit 3.0 GHz Pentium 4, Fedora 8
> 64-bit 2.6 GHz Opteron, RHEL 4
>
> 1.
> 32-bit fedora 8> time sorttest 20
> real 2m46.315s (166.315)
> user 2m44.485s
> sys 0m1.805s
>
> 2.
> 2.8 GHz Pentium 4, CentOS 5, glibc-2.5-18.el5_1.1.x86_64 (64-bit)
> /usr/bin/time --portability sorttest 20
> real 490.02
> user 488.89
> sys 1.03
>
> 3.
> 2.8 GHz Pentium 4, CentOS 4, glibc-2.3.4-2.39.x86_64 (64-bit)
> /usr/bin/time --portability sorttest 20
> real 494.07
> user 492.67
> sys 1.21
>
> 4.
> 1.26 GHz Pentium III, CentOS 5, glibc-2.5-18.el5_1.1.i686 (32-bit)
> /usr/bin/time --portability sorttest 20
> real 259.24
> user 257.40
> sys 1.75
>
> 5.
> 2.6 GHz Opteron 285, RHEL 4, glibc-2.3.4-2.25.x86_64 (64-bit)
> /usr/bin/time sorttest 20
> real 2m6.252s (126.25)
> user 2m5.316s
> sys 0m0.676s
>
> And finally, repeat 2. but compile with -m32:
> 2.8 GHz Pentium 4, CentOS 5, glibc-2.5-18.el5_1.1.i686 (32-bit)
> /usr/bin/time --portability sorttest 20
> real 141.29
> user 140.27
> sys 0.93
>
>
> The 1.26 GHz P3 is almost twice as fast as the 2.8 GHz P4 for this
> case, and compiling the program with -m32 on a 64-bit system also
> improves matters.
>
> Moreover, I put clock statements into the program as below: most
> of the time difference occurs during the qsort() call, though I'm not
> clear whether it's the qsort itself or the time spent in srt_f(). I
> suspect the latter.
>
> /* compile: cc sorttest.c -O3 -lm -o sorttest */
> /* Run: ./sorttest 20 */
> #include <stdio.h>
> #include <stdlib.h>
> #include <math.h>
> #include <time.h>
>
> #define N 10000000 /* vector of 10-M */
>
> int scr[N], i, cnt, n, posx[N], srt_f(const void *, const void *);
>
> double a, aaa, posy[N];
>
> /* Compare two indices into posy[] by the values they refer to. */
> int srt_f(const void *a, const void *b)
> {
>     aaa = posy[*((int *)a)] - posy[*((int *)b)];
>     if ( aaa < 0. ) return(-1);
>     return( aaa > 0. );
> }
>
> int main(int argc, char *argv[])
> {
>     clock_t ts, te;
>     cnt = ( argc == 1 ) ? 1 : atoi(argv[1]);
>     for ( n = 0; n < cnt; ++n )
>     {
>         ts = clock();
>         printf("%d:\n",n);
>         for ( i = 0; i < N; ++i )
>         {
>             posx[i] = i;
>             a += .001;
>             posy[i] = sin(a);
>         }
>         te = clock();
>         printf(" generate: %4.4f\n",(double)(te-ts)/(double)CLOCKS_PER_SEC);
>         ts = te;
>         qsort((void *) posx, i, sizeof(i), srt_f);
>         te = clock();
>         printf(" sort: %4.4f\n",(double)(te-ts)/(double)CLOCKS_PER_SEC);
>         ts = te;
>         for ( i = 0; i < N; ++i )
>             scr[posx[i]] = (int)((1000LL*i)/N); /* 64-bit product: 1000*i overflows int */
>         te = clock();
>         printf(" save: %4.4f\n",(double)(te-ts)/(double)CLOCKS_PER_SEC);
>     }
>     return 0;
> }
>
>
> Tony Schreiner
>
> _______________________________________________
> rhelv5-list mailing list
> rhelv5-list@redhat.com
> https://www.redhat.com/mailman/listinfo/rhelv5-list