[computer-go] Re: OpenMP / Quad Core experiments

Hideki Kato Tue, 01 Jan 2008 19:06:13 -0800

Happy New Year, all.

terry mcintyre: <[EMAIL PROTECTED]>:
>I have been tinkering with OpenMP and my new HP Quad
>Intel 6600. Wrote a small program to compute the
>Taylor series of e and pi, just for exploration, and
>I've found some interesting data points.
>
>I am using gcc 4.2 and 4.3.1 - the latter being the
>head of the SVN repository. Kubuntu 7.10, both 32 and
>64 bit versions. One of my test programs is attached.
>
>Oddly, the OpenMP version is no faster than the
>single-threaded version - but it does keep the cores
>busier. It is possible that I am doing something
>wrong, as I am new to OpenMP.
>
>I was so puzzled by the results that I tried the same
>program on my AMD Athlon X2. The older AMD Athlon duo,
>with a 1 GHz clock, 64-bit Fedora Core 7, is 20%
>faster than the 1.6GHz quad 6600. I've also run the
>--monte-carlo version of GnuGo 4.7.11 on both
>machines, with similar results.
>
>The compilation line is:
>gcc -Wall -fopenmp -O3 -march=native -lgomp taylor3.c
>-o taylor3
>
>( the code is an adaptation of code from the OpenMP
>tutorial at http://kallipolis.com/openmp/ - which
>leads to another interesting discovery. The original
>code yields incorrect results for pi; the two parallel
>branches use the same index variable i, 
>and one stomps on the other. Is this a feature of the
>gcc version of OpenMP? I'll be testing Intel's icc
>soon. )
>
>I'll be doing more testing this weekend, but I'd like
>to know if anyone has compared the Intel 6600 to other
>processors. So far, it sure looks like a tired old nag
>on her last ride to the glue factory; I'm wishing that
>I had waited for the Penryn version.
>
>One more puzzle: this processor is rated at 2.4GHz,
>but cpuinfo tells a different story:


That's because SpeedStep is active: it lowers the clock when the CPU is lightly loaded.  You can disable it in the BIOS settings.
http://en.wikipedia.org/wiki/SpeedStep

-Hideki

>[EMAIL PROTECTED]:/proc$ cat cpuinfo
>processor       : 0
>vendor_id       : GenuineIntel
>cpu family      : 6
>model           : 15
>model name      : Intel(R) Core(TM)2 Quad CPU    Q6600 @ 2.40GHz
>stepping        : 11
>cpu MHz         : 1596.000
>cache size      : 4096 KB
>physical id     : 0
>siblings        : 4
>core id         : 0
>cpu cores       : 4
>fpu             : yes
>fpu_exception   : yes
>cpuid level     : 10
>wp              : yes
>flags           : fpu vme de pse tsc msr pae mce cx8
>apic sep mtrr pge mca cmov pat pse36 clflush dts acpi
>mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc
>pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
>bogomips        : 4804.08
>clflush size    : 64
>cache_alignment : 64
>address sizes   : 36 bits physical, 48 bits virtual
>power management:
>
>
>Terry McIntyre <[EMAIL PROTECTED]>
>
>Wherever is found what is called a paternal government, there is found state
>education. It has been discovered that the best way to insure implicit
>obedience is to commence tyranny in the nursery.
>
>Benjamin Disraeli, Speech in the House of Commons [June 15, 1874]
>
>/*
> * taylor.c
> *
> * calculate e and pi by their taylor expansions and multiply them
> * together.
> *
> * moved local variables inside parallel blocks ( performance tweak? )
> */
>
>#include <omp.h>
>#include <stdio.h>
>#include <time.h>
>
>#define num_steps 20000000
>
>int main(int argc, char *argv[])
>{
>  double start, stop; /* times of beginning and end of procedure */
>  double efinal, pifinal, product;
>
>  /* start the timer */
>  start = clock();
>
>  /* calculate e and pi in parallel */
>#pragma omp parallel sections shared(efinal,pifinal)
>  {
>#pragma omp section 
>    { /* calculate e using Taylor approximation */
>      register double e, factorial;
>      register int j;
>
>      e = 1;
>      factorial = 1; 
>      for (j = 1; j<num_steps; j++) {
>       factorial *= j;
>       e += 1.0/factorial;
>      }
>      efinal=e;
>    } /* e section */
>
>#pragma omp section
>    { /* calculate pi expansion */
>      register int i;
>      register double pi;
>
>      pi = 0;
>      for (i = 0; i < num_steps*10; i++) {
>       /* we want 1/1 - 1/3 + 1/5 - 1/7 etc.
>          therefore we count by fours (0, 4, 8, 12...) and take
>             1/(0+1) =  1/1
>          - 1/(0+3) = -1/3
>             1/(4+1) =  1/5
>          - 1/(4+3) = -1/7 and so on */
>       pi += 1.0/(i*4.0 + 1.0);
>       pi -= 1.0/(i*4.0 + 3.0);
>      }
>      pi = pi * 4.0;
>      pifinal=pi;
>    } /* pi section */
>    
>  } /* omp sections */
>  /* threads rejoin here */
>
>  product = efinal * pifinal;
>
>  stop = clock();
>
>  printf("e %f pi %f products =  %f reached in %.3f seconds\n",
>         efinal, pifinal, product,
>         (double)(stop-start)/CLOCKS_PER_SEC);
>
>  return 0;
>}
>---- inline file
>_______________________________________________
>computer-go mailing list
>[email protected]
>http://www.computer-go.org/mailman/listinfo/computer-go/
--
[EMAIL PROTECTED] (Kato)