Kryz, your mail server seems to be down, and when I checked whether you had a running WWW site, I got this message:
"Chwilowo nic tu nie ma" [There is nothing here for now]

So, back to the list ... This time corrected as per the second mail:

On Tue, 2008-08-26 at 11:12 +0100, Krzysztof Foltman wrote:
> Jens M Andreasen wrote:
> > I am doing some preliminary testing of CUDA for audio. Version 2
> > (final) has been out for a couple of days, and this is also what I
> > am using.
>
> Does it require the proprietary drivers and/or Nvidia kernel module?

Yes, and not only that. The proprietary drivers distributed with, say, Mandrake, Ubuntu et al. won't work either. Uninstall those, change your X setup to vesa (to stop recursive nvidia-installer madness) and then get your CUDA driver and compiler from:

http://www.nvidia.com/object/cuda_get.html

> What kind of things is the gfx card processor potentially capable of
> doing? Anything like multipoint interpolation for audio resampling
> purposes? Multiple delay lines in parallel? Biquads? Multichannel
> recording to VRAM?

Multichannel recording by itself would be a waste of perfectly good floating-point clock cycles, but anything you can map onto a wide vector (64 to 196 elements) is up for grabs: a 196-voice multitimbral synthesizer, perhaps, or 64 channel strips with basic filters and a compressor/noise gate for remixing. The five muladds needed for a single biquad filter, times the number of bands you need to equalize, fit the optimal programming model quite well.

The linear 2D interpolator is also available, and even cached. Perhaps not the world's most useful toy for audio resampling, but it could find its way into some variation of wavetable synthesis. It can be set up to wrap around at the edges, which I find kind of interesting.

Random access to main (device) memory is - generally speaking - a bitch and a no-go if you cannot wrap your head around ways to load and use very wide vectors. There are some 8096 fp registers to load into, though, so all is not lost.
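To make the "five muladds" concrete, here is a plain-C sketch of a single biquad section in Direct Form II transposed. The struct layout and names are mine, and this is only the per-sample CPU reference of the math, not CUDA code; on the GPU, each element of the wide vector would run one such filter in lockstep.

```c
#include <assert.h>
#include <math.h>

/* One biquad section, Direct Form II transposed: exactly five
 * multiply/adds per sample. Coefficients are assumed already
 * normalized so that a0 == 1. */
typedef struct {
    float b0, b1, b2, a1, a2; /* filter coefficients */
    float z1, z2;             /* state (delay) registers */
} biquad;

static float biquad_tick(biquad *f, float x)
{
    float y = f->b0 * x + f->z1;              /* muladd 1    */
    f->z1 = f->b1 * x - f->a1 * y + f->z2;    /* muladds 2,3 */
    f->z2 = f->b2 * x - f->a2 * y;            /* muladds 4,5 */
    return y;
}
```

With b0 = b1 = 0.5 and the remaining coefficients zero, it reduces to a two-tap moving average, which makes an easy sanity check.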
Communication, permutation and exchange of data between vector elements, OTOH, is fairly straightforward and cheap by means of a smallish shared memory on chip. The more you can make your algorithm(s) look like infinitely brain-dead parallel iterations of multiply/add, the better they will make use of the hardware. The way I see it, the overall feel of your strategy should be something like the "Marching Hammers" animation (from Pink Floyd: The Wall).

> Is it possible to confine all the audio stream transfer between gfx
> and audio cards to kernel layer and only implement control in user
> space? (to potentially reduce xruns, won't help for control latency
> but at least it's some improvement)

You mean something like DMA? Yes, I would have thought so, but this is apparently not always the case - especially not on this very card that I have here. :-/

The CUDA program running on the device will have priority over X, though. So no blinking lights (nor printfs) before your calculation is done. For real-time work, I reckon this a GoodFeature (tm)! Potentially this can also hang the system if you happen to implement an infinite loop (so don't do that ...).

> Would it be possible to use a high level language like FAUST to
> generate CUDA code? (by adding CUDA-specific backend)

The problem would be to give Faust a good understanding of the memory model, and of how to keep individual vector elements from collectively falling over each other. But I must admit that I am not too familiar with what Faust actually does. Would it be of any help to you to have a library of common higher-level functionality, like FFT and BLAS?

---8<---------------------------------------------

"CUBLAS is an implementation of BLAS (Basic Linear Algebra Subprograms) on top of the NVIDIA(r) CUDA(tm) (compute unified device architecture) driver. It allows access to the computational resources of NVIDIA GPUs.

The library is self-contained at the API level, that is, no direct interaction with the CUDA driver is necessary."

------8<................................

But observe that:

"Currently, only a subset of the CUBLAS core functions is implemented."

/j

> Krzysztof
> _______________________________________________
> Linux-audio-dev mailing list
> [email protected]
> http://lists.linuxaudio.org/mailman/listinfo/linux-audio-dev
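PS: For a sense of what those BLAS core functions are, they are tiny kernels of exactly the multiply/add shape discussed above. SAXPY, for instance, is defined as y := alpha*x + y over single-precision vectors. This is just a plain-C statement of that textbook definition, not the CUBLAS API itself:

```c
#include <stddef.h>

/* Reference definition of SAXPY (y := alpha*x + y), the kind of
 * BLAS level-1 routine CUBLAS runs on the GPU. One muladd per
 * element, embarrassingly parallel across the vector. */
static void saxpy(size_t n, float alpha, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = alpha * x[i] + y[i];
}
```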
