Sounds like an interesting problem, and it also sounds like something that Cython could help a lot with. However, it'd be much easier to help you out if you posted a 10-20 line snippet of code that is your bottleneck and that you are trying to make faster.
- Robert

On Sep 21, 2009, at 12:11 PM, Jean-Francois Moulin wrote:

> Yes, a single value of the array is incremented at each iteration of a
> loop that runs many, many times (I'll describe it below).
> My aim was, as you point out, to have the main content of this loop
> optimized, and I also tried to have the fastest array ops possible (why
> not, after all).
> Later on, the full numpy capabilities are used to analyse the array as
> a series of 2D images (filtering, extrema search, plotting, slicing...).
>
> My idea to go to lists was to have a type which Cython directly
> understands (or am I wrong there?); the array.array question goes in
> the same direction too, btw.
>
> But ok... I got the point. No use moving away from numpy.
>
> Now, since you asked, my problem is the following: I am working on a
> data analysis script to deal with time-of-flight (TOF) detection of
> neutrons with a 2D detector. Basically I need to analyse a huge series
> of events consisting of timing signals. Each time a neutron hits my
> detector I receive 5 signals, one of which tells me the arrival time;
> the 4 others, together, where the neutron hit the detector.
> If you get bored here, skip the next paragraph!
>
> 4 signals for a 2D location means redundancy, which is more than useful
> to disentangle events which happen very close in time (and for which
> the 5 signals might be observed in a mixed order...).
> So in a loop I detect a clock signal, understand that a neutron just
> arrived, and now look for all signals on the other four channels which
> might be compatible with this clock signal (i.e. signals observed
> within a fixed time delay). If there are only four, bingo, I can locate
> the event; if I find more signals, trouble begins and I must start to
> look into timing details... Sometimes the electronics also misses
> events and I should try to reconstruct them as much as possible on the
> basis of the remaining info (thanks again to redundancy).
>
> So basically: a lookup into a large file and, in the end, incrementing
> the proper pixels in a series of 2D maps (one map for each TOF slice,
> chosen according to my time resolution). Typically between 20 and 200
> maps of ca. 350*350 pixels. Number of events to consider somewhere
> between 100,000 and some millions. Raw data files between 50 MB (a joke
> or a bad sample) and 5 GB (an overkill measurement).
>
> One of the things I left to Python is the histogramming itself, which I
> realise using bisect (I read this is already efficient C code, and I am
> inclined to believe it when I compare the speed to the first naive DIY
> attempts I made just for fun ;0)
>
> For the rest, Cython is helping a good deal with list/tuple lookups and
> the simple arithmetic involved in the pixel position calculation. I
> just wanted to push it as far as possible. I also happen to use scipy
> on an everyday basis and I was curious to see it together with Cython
> (maybe for other tasks).
>
> So far...
>
> Thanks a lot for your comments
>
> JF
>
>
> Christopher Barker wrote:
>> Jean-Francois Moulin wrote:
>>> What I have is a big 3d array for which I need to increment a single
>>> element by one at each call
>>
>> Something like this?
>>
>>     def MyFun(arr, i, j, k):
>>         arr[i,j,k] += 1
>>
>> If so, then you might as well keep it in Python.
>>
>> That little operation must be inside a loop of some sort. The goal
>> is to move the whole loop into Cython.
>> Then you can pass in the array, and Cython can convert your code into
>> nice, fast C-type indexing.
>>
>>> So, for now I can separate more sophisticated operations and perform
>>> them later once the array is finalized... I can thus build my array
>>> as a huge list of lists
>>
>> Why do you need to build it as lists? Usually you only need to do this
>> if you don't know how big it's going to be when you start.
>>
>>> Or is it worth (possible?) to use the array.array object of python.
>>
>> I don't think that buys you anything over numpy arrays, and you lose a
>> lot!
>>
>> Perhaps a bit more description of your problem, and/or a stripped-down
>> pure-Python version for us to comment on. I suspect that you could
>> improve your performance a lot just by using numpy optimally (which
>> means a post to the numpy list may be in order).
>>
>> NOTE: when faced with issues like this, you'll get the best results
>> from a list if you post your problem, rather than a solution you are
>> trying -- if you can pose your problem succinctly and clearly, the
>> odds are good someone may have a better solution than the one you've
>> come up with!
>>
>> -Chris
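
To make Chris's suggestion concrete, here is a minimal sketch of the kind
of inner loop Cython handles well, using the numpy buffer syntax. It is
untested, and the function name, the int64 dtype and the precomputed
(t, x, y) index arrays are only assumptions about the setup, not the
actual code from this thread:

    # accumulate.pyx -- illustrative sketch only, assumed data layout
    import numpy as np
    cimport numpy as np
    cimport cython

    @cython.boundscheck(False)
    @cython.wraparound(False)
    def accumulate_events(np.ndarray[np.int64_t, ndim=3] counts,
                          np.ndarray[np.int64_t, ndim=1] t,
                          np.ndarray[np.int64_t, ndim=1] x,
                          np.ndarray[np.int64_t, ndim=1] y):
        # counts has shape (n_tof_slices, ny, nx); t, x, y hold the
        # per-event slice and pixel indices, assumed already validated
        # (bounds checking is disabled here).
        cdef Py_ssize_t i, n = t.shape[0]
        for i in range(n):
            counts[t[i], x[i], y[i]] += 1

Compiled with numpy's include directory on the build path, the loop body
becomes plain C array arithmetic, so the per-event cost is essentially the
index lookups rather than Python attribute and item access.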

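On the bisect side, JF is right that CPython's bisect is implemented in C.
A rough pure-Python sketch of the kind of binning described above, with
all names and the slice-edge layout assumed purely for illustration:

    # tof_binning.py -- rough sketch, assumed data layout
    from bisect import bisect_right

    def bin_event(maps, tof_edges, tof, x, y):
        # maps      : numpy array, shape (n_slices, ny, nx) -- one 2D map
        #             per TOF slice
        # tof_edges : sorted sequence of slice boundaries, n_slices + 1 long
        # tof, x, y : arrival time and detector pixel of one event
        k = bisect_right(tof_edges, tof) - 1   # index of the TOF slice
        if 0 <= k < maps.shape[0]:             # drop events outside the range
            maps[k, y, x] += 1

Since this is plain Python it also compiles unchanged under Cython, and
typing k and the loop that calls this function is where a further speed-up
would come from, along the lines of the sketch above.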