Hi All,

It was suggested off-list that I gather some simple benchmarks comparing the different ways of passing a struct that contains a pointer to a contiguous array: by copy, by pointer, and by explicitly unpacking the struct into separate arguments.
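For concreteness, here is a minimal sketch of the three calling conventions being compared. It is not the attached benchmark code: the struct name float_buf and the sum_* functions are hypothetical, and summing the array is only an assumed stand-in for whatever work the inner loop does.

```c
/* Hypothetical stand-in for a buffer struct: a pointer to a contiguous
 * float array plus its length. Not the real Py_buffer layout. */
typedef struct {
    float *dta;
    unsigned int length;
} float_buf;

/* "copy": the struct is passed by value, so the callee gets its own
 * copy of the pointer and the length (the array itself is not copied). */
float sum_copy(float_buf buf)
{
    unsigned int i;
    float total = 0.0f;
    for (i = 0; i < buf.length; i++)
        total += buf.dta[i];
    return total;
}

/* "ptr": only the address of the struct is passed; fields are read
 * through the pointer. */
float sum_ptr(const float_buf *buf)
{
    unsigned int i;
    float total = 0.0f;
    for (i = 0; i < buf->length; i++)
        total += buf->dta[i];
    return total;
}

/* "nostruct": the caller unpacks the struct and passes the fields
 * as separate arguments. */
float sum_nostruct(const float *dta, unsigned int length)
{
    unsigned int i;
    float total = 0.0f;
    for (i = 0; i < length; i++)
        total += dta[i];
    return total;
}
```

Given a float_buf named buf, the three call sites would then be sum_copy(buf), sum_ptr(&buf) and sum_nostruct(buf.dta, buf.length).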
So far, I find almost no difference whatsoever between the 3 modes, which is encouraging. If these benchmarks hold up, it means we can choose the 'best' way to pass Py_buffer structs around without worry of a speed hit. Comments welcome.

reps are the number of repetitions per measurement, length is the length of the array. Each loop was run 3 times and the times are reported below. Using -O2 optimization, gcc 4.2.4.

copy passes a struct copy (i.e. func(struct_arg)), ptr passes a reference (func(&struct_arg)) and nostruct is func(float *dta, unsigned int length).

# On a Core 2 Duo, 3.16 GHz, 6 MB L2 Cache.

# reps: 10000 length: 10000
# copy:     0.280, 0.280, 0.290
# nostruct: 0.280, 0.290, 0.290
# ptr:      0.280, 0.280, 0.290

# reps: 100000 length: 1000
# copy:     0.280, 0.280, 0.290
# nostruct: 0.280, 0.290, 0.290
# ptr:      0.280, 0.290, 0.290

# reps: 1000000 length: 100
# copy:     0.280, 0.280, 0.300
# nostruct: 0.290, 0.290, 0.290
# ptr:      0.290, 0.290, 0.290

# reps: 10000000 length: 10
# copy:     0.140, 0.150, 0.150
# nostruct: 0.140, 0.140, 0.140
# ptr:      0.140, 0.140, 0.150

Kurt
benchmarks.tgz
Description: GNU Zip compressed data
