Sounds like an interesting problem, and it also sounds like something  
that Cython could help a lot with. However, it'd be much easier to  
help you out if you posted a 10-20 line snippet of code that is your  
bottleneck and that you are trying to make faster.

- Robert

On Sep 21, 2009, at 12:11 PM, Jean-Francois Moulin wrote:

> Yes, a single value of the array is incremented at each iteration of a
> loop that runs a great many times (I'll describe it below).
> My aim was, as you point out, to have the main content of this loop
> optimized, and I also tried to make the array ops as fast as possible
> (why not, after all).
> Later on, the full capabilities of numpy are used to analyse the array as
> a series of 2D images (filtering, extrema search, plotting, slicing ...).
>
>
> My idea in going to lists was to have a type which Cython directly
> understands (or am I wrong there?); the array.array question goes in the
> same direction too, btw.
>
> But ok... I got the point. No use to move away from numpy.
>
> Now, since you asked, my problem is the following. I am working on a data
> analysis script to deal with time-of-flight (TOF) detection of neutrons
> with a 2D detector. Basically I need to analyse a huge series of events
> consisting of timing signals. Each time a neutron hits my detector I
> receive 5 signals: one tells me the arrival time, and the 4 others
> together tell me where the neutron hit the detector.
> If you get bored here, skip the next paragraph!
>
> 4 signals for a 2D location means redundancy, which is more than useful
> to disentangle events which happen very close in time (and for which
> the 5 signals might be observed in a mixed order...).
> So in a loop I detect a clock signal, understand that a neutron just
> arrived, and then look for all signals on the other four channels which
> might be compatible with this clock signal (i.e. signals observed within
> a fixed time delay). If there are only four, bingo, I can locate the event;
> if I find more signals, trouble begins and I must start to look into
> timing details... Sometimes the electronics also misses events and I
> should try to reconstruct them as much as possible on the basis of the
> remaining info (thanks again to redundancy).
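A minimal sketch of the coincidence search described above, not JF's actual code. It assumes each position channel is a sorted Python list of timestamps, that "compatible" means falling inside [t_clock, t_clock + max_delay], and that a clean event has exactly one candidate per channel. The names `signals_in_window`, `candidates`, `is_clean_event` and `max_delay` are hypothetical.

```python
from bisect import bisect_left, bisect_right

def signals_in_window(times, t_clock, max_delay):
    """Return the entries of sorted `times` within [t_clock, t_clock + max_delay]."""
    lo = bisect_left(times, t_clock)
    hi = bisect_right(times, t_clock + max_delay)
    return times[lo:hi]

def candidates(channels, t_clock, max_delay):
    """Collect the compatible signals on each position channel."""
    return [signals_in_window(ch, t_clock, max_delay) for ch in channels]

def is_clean_event(cands):
    """Assumed criterion: exactly one compatible signal on every channel."""
    return all(len(c) == 1 for c in cands)

# Four position channels; two neutrons arrive at t ~ 0 and t ~ 19.
channels = [[1, 5, 20], [2, 21], [3, 22], [4, 23]]
ambiguous = candidates(channels, t_clock=0, max_delay=10)
clean = candidates(channels, t_clock=19, max_delay=10)
```

Because the lookup uses binary search, each clock signal costs O(log n) per channel instead of a scan over the whole file. The ambiguous case (extra signals in the window) is where the timing details would have to be examined.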
>
> So basically it is a lookup into a large file and, in the end, an
> increment of the proper pixels in a series of 2D maps (1 map for each TOF
> slice which I consider according to my time resolution). Typically between
> 20 and 200 maps of ca. 350*350 pixels. Number of events to consider:
> somewhere between 100,000 and some millions. Raw data files are between
> 50 MB (a joke or a bad sample) and 5 GB (an overkill measurement).
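If the events have already been located, the final accumulation step can be sketched in plain numpy as below. This is an illustration under assumptions, not the poster's method: events are given as arrays of arrival time and integer pixel coordinates, TOF slices are half-open intervals defined by a sorted `tof_edges` array, and `fill_maps` is a hypothetical name.

```python
import numpy as np

def fill_maps(maps, tof, x, y, tof_edges):
    """Add one count per event to maps[slice, x, y]; out-of-range events are skipped."""
    tof, x, y = np.asarray(tof), np.asarray(x), np.asarray(y)
    # Slice k covers tof_edges[k] <= t < tof_edges[k + 1].
    k = np.searchsorted(tof_edges, tof, side="right") - 1
    n_slices, nx, ny = maps.shape
    ok = (k >= 0) & (k < n_slices) & (x >= 0) & (x < nx) & (y >= 0) & (y < ny)
    # np.add.at accumulates correctly when the same pixel occurs several times;
    # plain fancy-index "+=" would count repeated indices only once.
    np.add.at(maps, (k[ok], x[ok], y[ok]), 1)
    return maps

maps = np.zeros((2, 3, 3), dtype=np.int64)   # 2 TOF slices of 3*3 pixels
tof_edges = np.array([0.0, 10.0, 20.0])
fill_maps(maps, tof=[1.0, 1.0, 15.0, 25.0], x=[0, 0, 2, 1], y=[1, 1, 2, 1],
          tof_edges=tof_edges)
```

The last event (t = 25) falls outside the slices and is dropped. For a few million events this processes the whole batch in one call, which may be enough without any compiled code.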
>
> One of the things I left to Python is the histogramming itself, which I
> do using bisect (I read this is already efficient C code, and I am
> inclined to believe it when I compare the speed to the first naive DIY
> attempts I made just for fun ;0).
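For reference, one plausible way to histogram with bisect, one event at a time. The exact usage in the original script is not shown, so the half-open bin convention and the names `tof_slice` and `edges` are assumptions.

```python
from bisect import bisect_right

def tof_slice(edges, t):
    """Index k with edges[k] <= t < edges[k + 1], or None if t is out of range."""
    k = bisect_right(edges, t) - 1
    return k if 0 <= k < len(edges) - 1 else None

edges = [0.0, 10.0, 20.0]            # two TOF slices
counts = [0] * (len(edges) - 1)
for t in [0.0, 9.9, 10.0, 20.0, -1.0]:
    k = tof_slice(edges, t)
    if k is not None:
        counts[k] += 1
```

The bisect call itself runs in C, but the surrounding per-event loop and the list increment are still executed by the interpreter.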
>
> For the rest, Cython is helping a good deal with list/tuple lookups and
> the simple arithmetic involved in the pixel position calculation. I
> just wanted to push it as far as possible. I also happen to use scipy on
> an everyday basis and I was curious to see it together with Cython
> (maybe for other tasks).
>
> So far...
>
> Thanks a lot for your comments
>
> JF
>
>
> Christopher Barker wrote:
>> Jean-Francois Moulin wrote:
>>> What I have is a big 3d L array for which I need to increment a single
>>> element by one at each call
>>
>> Something like this?
>>
>>
>> def MyFun(arr, i, j, k):
>>      arr[i,j,k] += 1
>>
>>
>> If so, then you might as well keep it in Python.
>>
>> That little operation must be inside a loop of some sort. The goal
>> is to move the whole loop into Cython. Then you can pass in the array,
>> and Cython can convert your code into nice, fast C-type indexing.
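To make the point concrete, here is the shape of function that is worth compiling: the loop lives inside it, and the array is passed in once. It is written as plain Python so it runs as is; `increment_all` and its arguments are hypothetical names. In a Cython module one would additionally declare the loop index as a C integer and the array arguments as typed numpy buffers, which is what lets Cython emit direct C indexing.

```python
import numpy as np

def increment_all(arr, ii, jj, kk):
    """Increment arr[i, j, k] once for every (i, j, k) triple, looping inside the function."""
    for n in range(len(ii)):
        arr[ii[n], jj[n], kk[n]] += 1
    return arr

arr = np.zeros((2, 2, 2), dtype=np.int64)
# One call handles all events, instead of one Python call per increment.
increment_all(arr, ii=[0, 0, 1], jj=[1, 1, 0], kk=[1, 1, 0])
```

Calling a compiled single-increment function from a Python loop would still pay the Python call overhead on every event, which is why the one-line `MyFun` gains nothing from Cython.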
>>
>>
>>> So, for now I
>>> can separate more sophisticated operations and perform them later once
>>> the array is finalized... I can thus build my array as a huge list of
>>> lists
>>
>> Why do you need to build it as lists? Usually you only need to do this
>> if you don't know how big it's going to be when you start.
>>
>>> Or is
>>> it worth (possible?) to use the array.array object of python.
>>
>> I don't think that buys you anything over numpy arrays, and you lose a lot!
>>
>> Perhaps you could post a bit more description of your problem, and/or a
>> stripped-down pure-Python version for us to comment on. I suspect that you
>> could improve your performance a lot just by using numpy optimally (which
>> means a post to the numpy list may be in order).
>>
>> NOTE: when faced with issues like this, you'll get the best results from
>> a list if you post your problem, rather than a solution you are trying
>> -- if you can pose your problem succinctly and clearly, the odds are
>> good someone may have a better solution than the one you've come up with!
>>
>>
>> -Chris
>>
>>
>>
>>
>
> _______________________________________________
> Cython-dev mailing list
> [email protected]
> http://codespeak.net/mailman/listinfo/cython-dev
