On Tue, 04 Apr 2000, David Benson wrote:
> how much data are you copying?  is it being
> copied once or multiple times?

Use the source, Luke :-)

Anyway, I am using input and output buffers
of 1024 elements each, and always the same buffers, which
means we are timing CACHED float/double --> float/double assignments.
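
Roughly, the timing loop looks like the sketch below. This is not the
actual benchmark source, just a minimal illustration of the idea for the
float --> double case; the names (NELEM, in_f, out_d,
time_float_to_double) are made up for this example, and the volatile
pointers are only an assumption about how to keep a compiler from
optimising the copies away.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define NELEM 1024  /* buffer length in elements (hypothetical name) */

/* The same two buffers are reused on every pass, so they stay cached. */
static float  in_f[NELEM];
static double out_d[NELEM];

/* Convert in_f into out_d `passes` times and return the elapsed CPU
 * time in seconds, as measured by clock(). */
double time_float_to_double(long passes)
{
    /* volatile access forces every load and store to really happen */
    const volatile float *src = in_f;
    volatile double *dst = out_d;
    clock_t start;
    long p;
    size_t i;

    assert(passes > 0);

    start = clock();
    for (p = 0; p < passes; p++)
        for (i = 0; i < NELEM; i++)
            dst[i] = src[i];   /* float --> double assignment */

    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

The elements/sec figure is then (passes * NELEM) divided by the returned
time; the other three type combinations would only differ in the buffer
types.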

> 
> if it is just being copied once, you're probably seeing
> a lot of memory-cache misses.  in real situations, the
> buffers tend to be small and reused to minimize latency,
> which has the right cache behavior.

Exactly, that is why I decided to use pretty small buffers (1024 x
sizeof(double) = 8 Kbytes, which fits nicely in the 2nd level cache).

> 
> [am i correct in assuming the double-double/float-float copies
> were normalized to the same number of bytes?]

Normalized to the same number of elements, not bytes.

that means:
float->float     4 bytes in + 4 bytes out =  8 bytes
float->double    4 bytes in + 8 bytes out = 12 bytes
double->float    8 bytes in + 4 bytes out = 12 bytes
double->double   8 bytes in + 8 bytes out = 16 bytes

But when doing DSP stuff we are interested in the number of operations per second
(FLOPS), not in the number of bytes moved.
(In the double->double case we achieve the biggest bytes/sec performance.)
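
If someone prefers bytes/sec, the elements/sec numbers can be converted
with the per-element byte counts listed above. A small sketch, assuming
the usual 4-byte float and 8-byte double; the helper names
(bytes_per_element, bytes_per_second) are hypothetical and not part of
the benchmark:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes moved per element: one read of the source type plus one write
 * of the destination type. */
size_t bytes_per_element(size_t src_size, size_t dst_size)
{
    return src_size + dst_size;
}

/* Turn an elements/sec rate into a bytes/sec rate for one conversion. */
double bytes_per_second(double elements_per_second,
                        size_t src_size, size_t dst_size)
{
    assert(elements_per_second >= 0.0);
    return elements_per_second
         * (double)bytes_per_element(src_size, dst_size);
}
```

For example, bytes_per_second(rate, sizeof(double), sizeof(double))
multiplies the rate by 16, while the float->float case multiplies it
by 8.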

Benno.
