Jean-Francois Moulin wrote:
> Hi,
> 
> I am working on a script which reads rather large amounts of data in a
> binary format and then processes it through different test functions.
> I optimized the beast as much as I possibly could: using tuples instead
> of lists, then moving to cython and declaring the types, optimizing the
> calls to numpy fn by use of the buffer notation...
> 
> All in all I gain a factor 10 in speed. Not bad but still not really enough...
> 
> What I still see as factors slowing me down could be (see my code in attach):
> - the use of the file.read() function from python to get a string which
>   I then process (is an fread call from c faster... how to implement it?)

The real problem is that you read 4 bytes at a time. If you buffer up 
longer stretches, it doesn't matter so much which call you use. For example:

obj = file.read(400)
cdef char* buf = obj
# hold on to obj, but process buf[0]..buf[399]
buf = NULL
obj = None # do not do this until you no longer use buf

Though if you have a socket rather than a file I suppose you're worse off.
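
For comparison, here is a pure-Python sketch of the same buffering idea: read a large chunk in one call and walk over the 4-byte records inside it, rather than calling read(4) once per record. The names iter_records and CHUNK_RECORDS, the record layout (one 4-byte native-endian int) and the in-memory demo file are assumptions for illustration; they are not taken from the attached script.

```python
import io
import struct

RECORD = struct.Struct("=i")   # assumed layout: one 4-byte native-endian int
CHUNK_RECORDS = 100            # 100 records = 400 bytes per read() call

def iter_records(f, chunk_records=CHUNK_RECORDS):
    """Yield ints from a binary file object, reading many records per call."""
    chunk_size = RECORD.size * chunk_records
    while True:
        chunk = f.read(chunk_size)
        # Ignore a trailing partial record, if any.
        usable = len(chunk) - len(chunk) % RECORD.size
        if usable == 0:
            break
        for (value,) in RECORD.iter_unpack(chunk[:usable]):
            yield value
        if len(chunk) < chunk_size:
            break  # short read: end of file reached

# Small demo on an in-memory "file" holding five records.
demo = io.BytesIO(struct.pack("=5i", 1, 2, 3, 4, 5))
values = list(iter_records(demo, chunk_records=2))
```

The chunk size is the knob to tune: larger chunks mean fewer read() calls, at the cost of more memory held at once.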

You can use C file handling directly (the safest thing is to open and 
close the file/socket with C calls as well). Just look up Cython 
examples on interfacing with C code and Google for C and file handling.

> - the use of the struct.unpack

As long as you stick to native-endian (and a C int is 4 bytes on your 
platform), you should be able to just cast to an int in your case:

cdef char* buf = data
cdef int* buf_as_int = <int*>buf
cdef int value = buf_as_int[0]  # Cython has no unary *; index the pointer instead

If you need to access more than one int, you can use a struct instead.
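
The same reinterpretation can be tried from plain Python before moving it into Cython. The sketch below uses memoryview.cast to view a bytes object as native C ints without copying; the helper name view_as_ints and the sample values are made up for illustration, and it assumes the buffer length is a multiple of the int size.

```python
import struct

def view_as_ints(data):
    """Reinterpret a bytes-like object as native C ints without copying."""
    # cast("i") uses the platform's native int size and byte order,
    # which mirrors the <int*> cast above.
    return memoryview(data).cast("i")

# Build a small native-endian buffer holding three ints.
raw = struct.pack("@3i", 10, -20, 30)
ints = view_as_ints(raw)
first = ints[0]
all_values = ints.tolist()
```

Like the pointer cast, this only gives the right numbers when the data was written in the machine's own byte order.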

> - the bit masking technique I use... (is it good or bad)

It is very fast -- if it has the effect you want, there's not 
going to be any faster way.

Consider writing it like this though:

bit30 = data & (1 << 30) != 0

But it is just about readability. (It will be compiled to the same thing.)
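
A quick way to check the mask is to run it in plain Python first. In Python (whose grammar Cython follows), & binds more tightly than !=, so the expression reads as (data & (1 << 30)) != 0; in C the precedence is the other way round, so the parentheses would be required there. The helper name bit_is_set and the sample word are assumptions for illustration.

```python
def bit_is_set(word, n):
    """Return True if bit n (counting from 0) is set in word."""
    # Python precedence: & is evaluated before !=.
    return word & (1 << n) != 0

# Sample word with bits 30, 2 and 0 set.
flags = (1 << 30) | 0b101
bit30 = bit_is_set(flags, 30)
bit1 = bit_is_set(flags, 1)
```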

-- 
Dag Sverre
_______________________________________________
Cython-dev mailing list
http://codespeak.net/mailman/listinfo/cython-dev
