then comes L1/L2/L3 cache occupancy at the moment of running the "ref unit", and so on.
Performance depends on the workload, the code optimization level, and even on room
temperature (in the case of a notebook with SpeedStep enabled), and no benchmark
can change this.
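
As a rough illustration of the flop-counting point in the quoted message below, here is a minimal sketch in Python. The relative costs (divide about 3x an add, cosine at least 10x) are taken from that message as ballpark figures, not measurements of any real CPU; the two work units and the names flop_count and weighted_cost are hypothetical.

```python
# Relative costs in "add-equivalents". The 3x figure for divide and the 10x
# lower bound for cosine come from the quoted message; they are illustrative
# assumptions, not measurements of any particular processor.
RELATIVE_COST = {"add": 1.0, "div": 3.0, "cos": 10.0}


def flop_count(mix):
    """Count every floating point operation as one flop."""
    return sum(mix.values())


def weighted_cost(mix, cost=RELATIVE_COST):
    """Estimate run time in add-equivalents for a given operation mix."""
    return sum(n * cost[op] for op, n in mix.items())


# Two hypothetical work units with the same nominal flop count.
add_heavy = {"add": 900, "div": 90, "cos": 10}
cos_heavy = {"add": 100, "div": 100, "cos": 800}

for name, mix in (("add_heavy", add_heavy), ("cos_heavy", cos_heavy)):
    print(name, flop_count(mix), weighted_cost(mix))
```

Both work units count as 1000 flops, yet under these assumed costs the second one takes roughly 6.6 times as long. On a processor with a different cost table the ratio would be different again, which is why a single flops figure says little about a particular workload.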

  ----- Original Message ----- 
  From: Nicolás Alvarez 
  To: [email protected] 
  Sent: Tuesday, September 22, 2009 12:11 AM
  Subject: Re: [boinc_dev] [boinc_alpha] Card Gflops in BOINC 6.10


  On Monday 21 Sep 2009 15:04:08 Lynn W. Taylor wrote:
  > It's hard to find a proper table (they used to be published), but this
  > may help illustrate the problem:
  >
  > <http://www.obliquity.com/computer/speedtest.html>
  >
  > If the example program on the page is accurate (and I suspect it is) a
  > floating point add is roughly three times faster than a floating point
  > divide, and more than ten times faster than a floating point cosine.
  >
  > Yet, we count all floating point instructions as one flop.
  >
  > A different processor might be slow to add, or do cosine relatively fast.

  And what about pipelines? Adding 10 numbers and *then* calculating 10 cosines 
  may be slower than interleaving the adds and cosines, because in the latter 
  case there is a higher chance the CPU can do both operations at the same 
  time.

  -- 
  Nicolas
  _______________________________________________
  boinc_dev mailing list
  [email protected]
  http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
  To unsubscribe, visit the above URL and
  (near bottom of page) enter your email address.

