On Wed, Jan 12, 2011 at 2:44 AM, Moeller <[email protected]> wrote:

> On 11.01.2011 23:13, Andrew Hofmaier wrote:
> > I've begun to look into accelerating GNURadio applications with Nvidia
> > CUDA GPUs and have scanned through the archives of the discussion list.
> > I had two questions on the topic:
> >
> > 1.  Is the CUDA-GNURadio port done by Martin DvH circa 2008 still
> > available and runnable?  All links I've seen are broken.
>
> Is CUDA really suitable? There is a certain overhead in data communications.
> CUDA is only useful if it can compute complex things without communicating.
> But a data streaming application needs lots of I/O.
> The CPU with SSE is also very fast in things like FFT.
> I made some experiments with CUDA, but they were not very successful,
> far below the peak FLOPS you get in benchmarks.
> But I'm not an experienced programmer ...
>
> > 2.  Much of the results I've seen, both here and elsewhere, suggest that
> > CUDA is not typically applicable to general GNURadio applications.  It
> > has worked in specific cases, but only where the data throughput
> > requirements are very high and the algorithms are extremely
>
> Yes, I had the same experience. I tried to let CUDA do the one-dimensional
> FFT. It was slower than on the CPU and had a large communication overhead.
> Maybe it would be better with larger FFT sizes, or with 2D FFT, or with
> better programming ...
> In contrast, the sample programs were very fast, but also very specialized,
> like fractal computation, image processing or particle physics.
>
> > these cards for GNURadio applications?  Some of the major relevant
> > improvements are the ability to concurrently schedule multiple kernels
> > and asynchronously perform memory transfers.
>
> I think the important thing is that the kernels have to do a lot of
> computation compared to the data transfer. A 1D FFT is not very
> compute-intensive relative to the data movement. What kind of algorithm do
> you want to port to CUDA?
>
>
> _______________________________________________
> Discuss-gnuradio mailing list
> [email protected]
> http://lists.gnu.org/mailman/listinfo/discuss-gnuradio
>


I've done some work with both CUDA and GNURadio, and I think there's
definitely potential in using them together, but only for certain
applications, and only if the software is architected intelligently.

GPUs are incredibly powerful, with 1+ TFLOPS of compute and 100+ GB/s of
memory bandwidth within the GPU. I've used GPUs to perform real-time signal
processing on 300+ MHz of continuously streaming data, without dropping a
sample. But the PCIe bus bandwidth of ~5 GB/s between host and GPU can
sometimes be a real bottleneck, so you have to design accordingly.
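
As a rough back-of-the-envelope check on those numbers, here is a small Python sketch of how much of a ~5 GB/s bus a continuous stream would occupy. The 300 Msamples/s rate and the sample formats are assumptions for illustration only (the figures above give a bandwidth, not a sample format), and the function names are made up for this example.

```python
# Assumed bus throughput, taken from the ~5 GB/s figure above.
BUS_BYTES_PER_S = 5e9


def stream_rate_bytes_per_s(sample_rate_hz, bytes_per_sample):
    """Bytes per second produced by a continuous sample stream."""
    return sample_rate_hz * bytes_per_sample


def bus_utilization(sample_rate_hz, bytes_per_sample,
                    bus_bytes_per_s=BUS_BYTES_PER_S):
    """Fraction of the bus consumed by a one-way stream to the GPU."""
    return stream_rate_bytes_per_s(sample_rate_hz, bytes_per_sample) / bus_bytes_per_s


if __name__ == "__main__":
    # Hypothetical sample formats: (label, bytes per complex sample).
    formats = [("complex int8", 2), ("complex int16", 4), ("complex float32", 8)]
    for label, nbytes in formats:
        util = bus_utilization(300e6, nbytes)
        print(f"{label}: {util:.0%} of the bus, one way")
```

Under these assumptions even a single one-way stream takes a noticeable share of the bus, and sending full-rate results back to the host would double it.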

You DON'T want to try to make individual drop-in CUDA replacements for
multiple GNURadio processing blocks in a chain. It doesn't make any sense to
send data to the GPU, perform an operation (e.g. filtering), bring the result
back to the host, send some more data to the GPU, perform a 2nd operation,
bring the data back, etc. The PCIe transfers will eat you alive. The key is
to send large chunks (10s or 100s of MBs) of data to the GPU, and do as much
computation as possible while it's there. Large batched FFTs, wideband
frequency searches, channelizing, it's all gravy. It's great if you can stream
wideband data to the GPU, have it do some computationally intensive stuff,
perform a rate reduction, then stream the lower-bandwidth data back to the
host to do further (annoyingly serial) operations. You could even (if you
wanted to) implement an entire transmitter or receiver within the GPU, with
the CPU solely shuttling data to or from the ADC/DAC.
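
To put rough numbers on the difference, here is a minimal Python sketch comparing the bus traffic of the two designs. It is only a cost model, not GNURadio or CUDA code: it assumes every block's output is the same size as its input in the per-block case, a ~5 GB/s bus, and ignores per-transfer latency. The chunk size, block count, decimation factor and function names are all made up for illustration.

```python
# Assumed bus throughput (~5 GB/s, as quoted above).
BUS_BYTES_PER_S = 5e9


def per_block_bus_bytes(chunk_bytes, n_blocks):
    """Drop-in design: every block copies the chunk to the GPU and back."""
    return 2 * chunk_bytes * n_blocks


def fused_bus_bytes(chunk_bytes, decimation):
    """Fused design: one copy in, one rate-reduced copy out."""
    return chunk_bytes + chunk_bytes // decimation


def transfer_seconds(n_bytes, bus_bytes_per_s=BUS_BYTES_PER_S):
    """Time spent on the bus, ignoring latency and overlap with compute."""
    return n_bytes / bus_bytes_per_s


if __name__ == "__main__":
    chunk = 100 * 10**6   # hypothetical 100 MB chunk
    blocks = 4            # hypothetical chain of 4 processing blocks
    decim = 10            # hypothetical 10x rate reduction on the GPU

    naive = per_block_bus_bytes(chunk, blocks)
    fused = fused_bus_bytes(chunk, decim)
    print(f"per-block: {naive / 1e6:.0f} MB, {transfer_seconds(naive) * 1e3:.0f} ms on the bus")
    print(f"fused:     {fused / 1e6:.0f} MB, {transfer_seconds(fused) * 1e3:.0f} ms on the bus")
```

In this model the per-block cost grows linearly with the length of the chain, while the fused cost stays fixed no matter how much work is done on the GPU between the two copies.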

In summary: yes, please do get excited about CUDA/OpenCL -- it's great
technology. When the USRP 9.0 comes out with a gigasample ADC/DAC, GPUs will
be there, ready to do the heavy lifting :)

-Steven