Hi everyone,

A while ago, I posted a few messages about writing some fast Fourier transform (FFT) code for Boost. Now we have a lot of discussion about numerical integration. If we're going to start writing numerical code for Boost (and I hope we do a lot of it), we should probably come up with some overall guidelines for its interfaces & performance. Here are some of my thoughts on those issues; maybe this can start a useful discussion.
Much like STL algorithms, numerical functions can take an array of data, process it, and return either a single number or another array (integration). Numerical code can also take a function object and return an array (derivatives). It gets more complicated: a Fourier transform (FT), for example, takes an array of either real or complex data and in general returns a complex result.

With the above examples in mind, there are a lot of issues to think about. Should algorithms that take an array of data simply take some container as an argument and return a container (where appropriate), or should they take input iterators and an output iterator, and return an output iterator (a la STL)? An STL-style algorithm would work just as well for C arrays as for any standard container (there's a sketch of what this might look like for integration in the P.S. at the end). If instead the algorithm takes a std::valarray<T> (not a standard container), then the programmer who writes the algorithm can take full advantage of the natural math operations defined for that class. After all, why should we even allow a user to integrate a std::list of data? That's not the intended use of the list class. However, it is THE one and only intended use of the valarray class, and an iterator interface doesn't let numerical algorithms operate on C++'s only numeric array class, because valarrays don't have iterators. Then again, maybe someone will want to use a boost::array, or a plain C array, or his own array class just as often; such a user would be forced to use valarray when it doesn't necessarily fit his needs.

Say that we want to stick to the STL convention and go with an iterator interface, and let's try applying it to an FT function. We might want the following kinds of FT functions:

  - one that takes real data & returns complex
  - one that takes complex data & returns complex
  - one that works entirely in place

Here are the difficulties that arise. Life is easy when the FFT takes complex & returns complex:

  template<class InIt, class OutIt>
  OutIt fft(InIt first, InIt last, OutIt x);

If it takes real data & returns complex, how would that interface even work? How do we guarantee that InIt points to real data & OutIt points to complex? Maybe some overloading could be done, but there's no partial specialization for function templates, only for class templates. Now let's think about an in-place FFT:

  template<class It>
  void fft(It first, It last);

The in-place transform must take complex data & return complex data, because the single iterator type determines both the arguments & the result. There's no possibility of transforming real data in place, because in general the result is complex; to transform real data this way, a user would first have to loop through the data & convert it to complex. It gets worse: because of a symmetry property of the FFT, transforming real input of size n only requires a complex FFT of size n/2. That's a big performance gain that a user would sorely miss. Clearly, performance considerations do have a big impact on the interface.

To add another FFT example: the first part of the FFT algorithm requires re-ordering the data, and the new order depends only on the number of data points to be transformed. If we could make FFTs of different sizes separate functions, then inside each FFT function one could store the new order of the indices in a static array object. If the size is a template parameter, that happens with little effort, and we can make a big performance gain by eliminating redundant calculations of the indices: if a user is doing 1,000 FFTs of size 1024, the order of the indices only has to be computed once, so each FFT evaluation does only a portion of the algorithm.
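Here's a minimal sketch of that idea. None of this is from any existing library; it shows only the re-ordering step, leaves the butterfly passes out, and assumes the usual radix-2 bit-reversal ordering (so N must be a power of two):

  #include <algorithm>  // std::swap
  #include <complex>
  #include <cstddef>

  // Sketch: radix-2 FFT with the size fixed at compile time.
  // The index table is static, so it is computed only once per
  // distinct N, no matter how many transforms are performed.
  template<std::size_t N, class T>
  void fft(std::complex<T> (&data)[N])
  {
      static std::size_t index[N];
      static bool computed = false;
      if (!computed) {
          // bit-reversal permutation, computed on the first call only
          std::size_t j = 0;
          for (std::size_t i = 0; i < N; ++i) {
              index[i] = j;
              std::size_t bit = N >> 1;
              while (j & bit) { j ^= bit; bit >>= 1; }
              j |= bit;
          }
          computed = true;
      }
      // re-order in place using the precomputed table
      for (std::size_t i = 0; i < N; ++i)
          if (i < index[i])
              std::swap(data[i], data[index[i]]);
      // ... the butterfly passes would go here ...
  }

Each instantiation of fft<N, T> gets its own static table, and only the first call of a given size pays for computing it.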
Requiring the user to supply the size at compile time is a bit of a restriction, though, and many difficulties arise when we try to build other algorithms on top of our complex FFT (real FFT, power spectrum, correlation, convolution). To ease the restriction & those difficulties, we could make the FFT a class & require the user to pass the size of the data into the FFT constructor; then the ordering of the indices happens only once, in the constructor. (A rough sketch is in the P.S. below.)

Just to add one final thought, comparing numerical algorithms to STL algorithms again: do we want to make performance guarantees (algorithmic complexity), just like the STL? That would be very nice. Keep in mind that a function's overall performance depends greatly on the containers used, not just on the complexity of the algorithm. Thus, the "valarray vs. iterator vs. anything else" interface issue will have a big impact on performance.

I think these are issues that need to be weighed, and a policy should be set forth before Boost starts to accept numerical algorithms. That would make things much easier for developers of numerical algorithms for Boost.

Jason Schmidt
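P.S. Two sketches to make some of the above concrete. Neither is from any existing library, and both leave out the actual numerical work. First, the constructor idea:

  #include <algorithm>  // std::swap
  #include <complex>
  #include <cstddef>
  #include <vector>

  // Sketch: the size-dependent setup (the index re-ordering) is done
  // once, in the constructor; operator() then performs a transform.
  template<class T>
  class fft_engine {
  public:
      explicit fft_engine(std::size_t n) : index_(n)
      {
          // same bit-reversal computation as in the earlier sketch,
          // but with the size known only at run time
          std::size_t j = 0;
          for (std::size_t i = 0; i < n; ++i) {
              index_[i] = j;
              std::size_t bit = n >> 1;
              while (j & bit) { j ^= bit; bit >>= 1; }
              j |= bit;
          }
      }

      void operator()(std::complex<T>* data) const
      {
          // re-order using the table computed in the constructor
          for (std::size_t i = 0; i < index_.size(); ++i)
              if (i < index_[i])
                  std::swap(data[i], data[index_[i]]);
          // ... the butterfly passes would go here ...
      }

  private:
      std::vector<std::size_t> index_;
  };

  // usage: fft_engine<double> plan(1024);
  //        // ...then call plan(data) 1,000 times; the index
  //        // calculation is never repeated.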
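And second, what an STL-style interface might look like for something simpler, say trapezoidal integration of uniformly spaced samples. Note that it happily accepts a std::list -- which is exactly the question raised above:

  // Sketch: an STL-style numerical algorithm.  Trapezoidal rule
  // over uniformly spaced samples; dx is the sample spacing.
  template<class InIt, class T>
  T integrate(InIt first, InIt last, T dx)
  {
      if (first == last) return T(0);
      T prev = *first++;   // remember the previous sample's value
      T sum = T(0);
      for (; first != last; ++first) {
          sum += (prev + *first) * dx / T(2);
          prev = *first;
      }
      return sum;
  }

  // usage: double area = integrate(v.begin(), v.end(), 0.1);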