On Thursday, 14 September 2006 23:43, George Woltman wrote:
> >Tony Reix tried to reply but was blocked. His reply follows:
> >
> >
> >Lessons:
> >a) Chunks of memory data should be allocated by the threads
> >that use them most of the time, and not as one big array.
> >b) The single-thread bottleneck must be reduced or ... parallelized.
> >
> >I know very little about programming an FFT. But I guess they have
> >been built the easy way: you have one big array allocated at the
> >beginning. Parallelizing this FFT with threads means that each thread
> >has to work on a part of the array. Here is the problem.
> >I do not know if it is possible, but one should try to
> >allocate different chunks of memory, each allocated by the
> >thread which uses it. When some work must be done on all the
> >chunks of data seen as one big array, one could consider
> >copying the data into another big array or parallelizing this task
> >(easier said than done, I know; I know nearly nothing about FFTs!).
> >

Tony, the only solution is to copy chunks of the main array into local
subarrays, one per thread, in the FFT passes where the separation is
small, and then copy them back to the main array. This is not possible
in the last IFFT pass, the IDWT, the carry step and the first FFT pass:
at some point a thread has to process data that was processed by
another thread.

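
The copy-out / copy-back scheme described above could look roughly like
the sketch below. It is only an illustration of the data movement, not
the code of any real FFT: `local_pass`, `worker` and `threaded_pass` are
hypothetical names, and the pairwise butterfly is a stand-in for a pass
whose operands all lie inside one thread's chunk. Python threads will not
show the memory-locality benefit; the point is only who touches which
memory.

```python
import threading


def local_pass(buf):
    """Hypothetical stand-in for one FFT pass with small separation:
    in-place butterflies on adjacent pairs of the buffer."""
    for i in range(0, len(buf), 2):
        a, b = buf[i], buf[i + 1]
        buf[i], buf[i + 1] = a + b, a - b


def worker(main, start, stop):
    local = main[start:stop]   # copy this thread's chunk out of the main array
    local_pass(local)          # all work touches only the thread-local copy
    main[start:stop] = local   # copy the results back (same length, disjoint range)


def threaded_pass(main, n_threads):
    # Assumes len(main) is divisible by n_threads and each chunk has even length.
    size = len(main) // n_threads
    threads = [
        threading.Thread(target=worker, args=(main, t * size, (t + 1) * size))
        for t in range(n_threads)
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()


# Compare against the same pass run directly on one big array.
data = [float(i) for i in range(64)]
expected = list(data)
local_pass(expected)
threaded_pass(data, 4)
print(data == expected)
```

This only works because every butterfly stays inside one chunk. In a pass
where the operands are far apart, a thread would need data held in another
thread's local copy, which is the limitation mentioned above.
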
Assume 64 chunks of memory, 4 threads and radix 4. Each digit below is
the number of the thread that accesses that part of memory. In the
first pass the memory access pattern is

00112233001122330011223300112233

and in the second pass it is

00000000111111112222222233333333

In this second pass, copying from the main array is possible. I'm not
sure what the costs and benefits are, but it is a nice idea.

Guillermo

-- 
Guillermo Ballester Valor
[EMAIL PROTECTED]
Ogijares, Granada SPAIN
Public GPG KEY http://www.oxixares.com/~gbv/pubgpg.html
_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
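
The two access patterns given in the message can be reproduced with a
small sketch. It assumes one printed digit per slot (32 slots, matching
the strings as quoted; how the 64 chunks map onto 32 digits is not
stated) and that in the first pass each thread touches 2 adjacent slots
before the next thread takes over. `owner_pass1`, `owner_pass2` and
`regions_per_thread` are hypothetical helper names.

```python
from itertools import groupby

N_SLOTS = 32    # assumption: one digit per slot, as in the quoted strings
N_THREADS = 4
RUN = 2         # assumption: adjacent slots per thread in the first pass


def owner_pass1(i):
    # Threads are interleaved: runs of RUN slots, cycling through the threads.
    return (i // RUN) % N_THREADS


def owner_pass2(i):
    # Each thread owns one contiguous block of N_SLOTS / N_THREADS slots.
    return i // (N_SLOTS // N_THREADS)


def pattern(owner):
    return "".join(str(owner(i)) for i in range(N_SLOTS))


def regions_per_thread(pat):
    """Count how many separate contiguous regions each thread touches."""
    counts = {}
    for digit, _ in groupby(pat):
        counts[digit] = counts.get(digit, 0) + 1
    return counts


pass1 = pattern(owner_pass1)
pass2 = pattern(owner_pass2)
print(pass1, regions_per_thread(pass1))
print(pass2, regions_per_thread(pass2))
```

Under these assumptions each thread touches four separate regions in the
first pass but a single contiguous region in the second, which is why
one copy-out and one copy-back per thread would suffice there.
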
