Hi All,

It was suggested off-list that I gather some simple benchmarks
comparing the different ways of passing a struct with a contiguous
array pointer inside -- through a copy, through a pointer, and through
explicit unpacking of the struct.

So far, I find almost no difference between the three modes, which
is encouraging.  If these benchmarks hold up, it means we can choose
the 'best' way to pass Py_buffer structs around without worrying
about a speed hit.

Comments welcome.

reps is the number of repetitions per measurement, and length is the
length of the array.  Each loop was run 3 times, and all three times
are reported below.
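
For readers without the attachment, here is a rough sketch of what one
measurement could look like.  It is not the code from benchmarks.tgz:
the names sum_array and time_reps are made up for illustration, and
the use of clock() is an assumption about how the timing was done.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical kernel: sum a contiguous float array. */
static float sum_array(const float *dta, unsigned int length)
{
    float total = 0.0f;
    unsigned int i;
    for (i = 0; i < length; i++)
        total += dta[i];
    return total;
}

/* Call the kernel `reps` times on an array of `length` elements and
   return the elapsed CPU time in seconds (assumed timing method).
   The accumulated result is written to *sink so the compiler cannot
   drop the loop at -O2. */
double time_reps(const float *dta, unsigned int length,
                 unsigned long reps, float *sink)
{
    volatile float acc = 0.0f;
    unsigned long r;
    clock_t start, stop;

    assert(sink != NULL);
    start = clock();
    for (r = 0; r < reps; r++)
        acc += sum_array(dta, length);
    stop = clock();

    *sink = acc;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_reps three times for a given (reps, length) pair would
give three numbers per row, in the same shape as the results below.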

Using -O2 optimization, gcc 4.2.4.

copy passes a struct copy (i.e. func(struct_arg)), ptr passes a
pointer to the struct (func(&struct_arg)), and nostruct unpacks the
struct at the call site: func(float *dta, unsigned int length).
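
To make the three modes concrete, here is a minimal sketch.  The
struct buf_t and the sum_* functions are hypothetical stand-ins (buf_t
is not the real Py_buffer layout, and these are not the functions in
the attachment); only the three calling conventions are the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Py_buffer-like struct: a pointer to
   contiguous data plus its length. */
typedef struct {
    float *dta;
    unsigned int length;
} buf_t;

/* "copy": the struct itself is passed by value. */
float sum_copy(buf_t b)
{
    float total = 0.0f;
    unsigned int i;
    for (i = 0; i < b.length; i++)
        total += b.dta[i];
    return total;
}

/* "ptr": the address of the struct is passed. */
float sum_ptr(const buf_t *b)
{
    float total = 0.0f;
    unsigned int i;
    assert(b != NULL);
    for (i = 0; i < b->length; i++)
        total += b->dta[i];
    return total;
}

/* "nostruct": the struct is unpacked by the caller. */
float sum_nostruct(const float *dta, unsigned int length)
{
    float total = 0.0f;
    unsigned int i;
    for (i = 0; i < length; i++)
        total += dta[i];
    return total;
}
```

For a buf_t b, the three calls are sum_copy(b), sum_ptr(&b) and
sum_nostruct(b.dta, b.length).  All three walk the same contiguous
array, so only the argument passing differs between them.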

# On a Core 2 Duo, 3.16 GHz, 6 MB L2 Cache.
# reps: 10000 length: 10000
# copy: 0.280, 0.280, 0.290
# nostruct: 0.280, 0.290, 0.290
# ptr: 0.280, 0.280, 0.290

# reps: 100000 length: 1000
# copy: 0.280, 0.280, 0.290
# nostruct: 0.280, 0.290, 0.290
# ptr: 0.280, 0.290, 0.290

# reps: 1000000 length: 100
# copy: 0.280, 0.280, 0.300
# nostruct: 0.290, 0.290, 0.290
# ptr: 0.290, 0.290, 0.290

# reps: 10000000 length: 10
# copy: 0.140, 0.150, 0.150
# nostruct: 0.140, 0.140, 0.140
# ptr: 0.140, 0.140, 0.150

Kurt

Attachment: benchmarks.tgz
Description: GNU Zip compressed data

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
