Fernando Perez wrote:
> On 7/18/06, Tim Hochberg <[EMAIL PROTECTED]> wrote:
>> Eric Emsellem wrote:
>> > thanks for the tips. (indeed your "add.reduce" is correct: I just
>> > wrote this down too quickly; in the script I have a "sum" included).
>> >
>> > And yes, you are right about the memory issue, so I may just keep
>> > the loop in and try to make it work on a fast PC... (or use
>> > parallel processes)
>> >
>> > (is "sum" different than "add.reduce"?)
>> >
>> > thanks again to both Bill Baxter and Perry Greenfield for their
>> > fast (and helpful!) answers.
>> >
>> I just wanted to add that there are faster, but considerably more
>> complicated, ways to attack this class of problems. The one I've
>> looked at in the past was the fast multipole method, and I believe
>> there are others. I'm not sure whether these can be implemented
>> efficiently in numpy, but you may want to look into this kind of more
>> sophisticated/complicated approach if brute-forcing the calculation
>> doesn't work.
>
> Indeed, FMM is the best-known method that can turn this O(n^2) problem
> into an O(n*log(n)) one. As Tim indicates, there is no easy way out of
> this one. Incidentally, there is a talk about FMM on the scipy'06
> schedule, in case you are going to attend.
>
> An alternative approach to FMM (conceptually similar in some ways) is
> described here:
>
> http://amath.colorado.edu/faculty/fperez/talks/0604_sanum_mwadap.pdf
>
> Unfortunately, this isn't exactly a half-hour optimization effort, as
> the required machinery is pretty heavy-duty. And yes, this has all
> been done in Python and runs on current numpy/scipy (though it has
> Fortran, C and C++ sprinkled in as needed).

I'm interested in hearing about the situations in which people end up dropping into C/C++ or Fortran. It would be interesting to see whether general solutions could be found for making some of them fast enough at the Python level. Here are some situations I've run into where numpy has trouble computing results efficiently.
1. The value at a given point is computed in one of two (or more) ways, depending on some condition. The subcomputations are expensive, so one does not want to compute both and then choose a result using 'where'. I think I have a decent way to attack this at the Python level using argsort: basically, sort all of the operands, find the boundaries between the regions where different computations are done, run the expensive subcomputations on the appropriate blocks, and then unsort the data (see the first sketch below). It would be interesting to see whether this could be automated in a clean way, perhaps within numexpr.

2. Generalized reduce. We want to reduce an array using an arbitrary function f(x, y) that is not a standard ufunc. In the particular case I've run into, x and y would be vectors or perhaps record arrays, and the function would couple the various records when producing the output (see the second sketch below). I don't see how this case can be handled efficiently at the Python level, short of using something like Weave to produce a custom ufunc and calling reduce on it. Nor do I see how something like numexpr could handle it, unfortunately, since it gains its speed by operating on blocks of data.

I'm sure there are other cases that I've run into, but that's all I'm coming up with right now.

-tim
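Below is a minimal sketch of the argsort idea from item 1. The condition (a sign test) and the two stand-in branch functions (np.exp and a squared sine) are hypothetical placeholders for the genuinely expensive subcomputations:

```python
import numpy as np

def piecewise_apply(x, cond, f_true, f_false):
    """Evaluate f_true where cond is True and f_false elsewhere,
    calling each function only on the operands it applies to."""
    order = np.argsort(cond)                    # all False entries, then all True
    x_sorted = x[order]
    split = np.searchsorted(cond[order], True)  # boundary between the two regions
    out = np.empty_like(x_sorted)
    out[:split] = f_false(x_sorted[:split])     # expensive branch: False block only
    out[split:] = f_true(x_sorted[split:])      # expensive branch: True block only
    result = np.empty_like(out)
    result[order] = out                         # "unsort" back to the input order
    return result

# Hypothetical usage: each branch sees only the points it applies to,
# unlike np.where(cond, f_true(x), f_false(x)), which evaluates both
# functions over the full array.
x = np.linspace(-5.0, 5.0, 101)
y = piecewise_apply(x, x > 0, np.exp, lambda v: np.sin(v) ** 2)
```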
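And a bare-bones illustration of the generalized reduce from item 2, written as the slow pure-Python loop that a compiled custom ufunc's reduce would replace. The coupling function here (a complex-style product over two-vectors) is made up purely for illustration:

```python
import numpy as np
from functools import reduce

def couple(x, y):
    """Hypothetical reduction function: not a standard ufunc, since it
    mixes the components of its two operands to produce the output."""
    return np.array([x[0] * y[0] - x[1] * y[1],
                     x[0] * y[1] + x[1] * y[0]])

data = np.random.rand(1000, 2)   # 1000 two-vectors to reduce over

# One interpreter-level call to `couple` per row; this per-element
# Python overhead is what add.reduce-style compiled loops avoid, and
# what block-at-a-time tools like numexpr can't express.
result = reduce(couple, data)
```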