Still struggling with the stereo layout. I stated earlier that this worked well (for an in place routine):
memcpy(outR, inR, vectorsize * sizeof(t_float));
inR = outR;

The output sounded all right. But upon closer inspection, it turned out
that all the stereo sound came from the right input and the left input
did nothing. It seems that the two signal buffers A and B for a two-in
two-out object are assigned like so:

inL: buffer A
inR: buffer B
outL: buffer B
outR: buffer A

So it's safe to process inL as a block but write it to outR, and to
process inR while writing to outL. Then the output blocks are OK, but
they come from the wrong outlets. I swap these within the object. That
takes fewer operations than copying left and right inputs to outputs,
but it is still a waste.

Katja

On Sun, Jan 13, 2013 at 3:49 AM, Hans-Christoph Steiner <[email protected]> wrote:
>
> Yeah, that makes sense. With all the auto-vectorization and SIMD support in
> recent versions of gcc, it seems a better approach is to tailor the C code to
> work well with SIMD-aware compilers.
>
> .hc
>
> On 01/12/2013 04:45 PM, katja wrote:
>> It's interesting, but rather compiler-and-processor-specific. Such
>> code is maintenance-intensive. At the moment, ARM processors are
>> screaming loudest for optimization. Best thing for a community project
>> is probably plain C code which reckons with parallel processing,
>> because that won't go away for the next few decades. Functions like
>> copy_perform8(), times_perform8() etc. can profit from SIMD
>> instructions without a need for compiler intrinsics and asm code.
>> Well-structured data storage and access can make a 50 % or more
>> performance gain, in my experience.
>>
>> Another important thing: avoid float precision conversions. Throughout
>> Pd there are many untyped float defines and literal constants which
>> default to double, and I have introduced more when making libs
>> double-ready. Not good. I'll come back to this in another thread.
>>
>> Katja
>>
>>
>> On Sat, Jan 12, 2013 at 8:14 PM, Hans-Christoph Steiner <[email protected]>
>> wrote:
>>>
>>> If you are interested, there is still the hand-coded SIMD stuff from
>>> pd-devel:
>>> https://pure-data.svn.sourceforge.net/svnroot/pure-data/branches/pd-devel/v0-39
>>>
>>> .hc
>>>
>>> On 01/12/2013 09:34 AM, katja wrote:
>>>> Function copy_perform8() is also eligible for SIMD processing. I used
>>>> memcpy() because it is straightforward to use, while Pd's functions
>>>> pointed to the wrong locations for this case. On the reverb's total
>>>> load there is no significant performance difference.
>>>>
>>>> Katja
>>>>
>>>>
>>>> On Sat, Jan 12, 2013 at 1:00 AM, Hans-Christoph Steiner <[email protected]>
>>>> wrote:
>>>>>
>>>>> I recently learned that libc's memcpy actually uses things like SSE2 or
>>>>> SSSE3, so it can be quite fast on CPUs from the past 10 years,
>>>>> especially those of the last 5 years.
>>>>>
>>>>> It would be worth profiling to see if that's noticeable.
>>>>>
>>>>> .hc
>>>>>
>>>>> On 01/11/2013 05:12 PM, katja wrote:
>>>>>> Ok so I did the ugly thing with the right channel input and output
>>>>>> pointers:
>>>>>>
>>>>>> memcpy(outR, inR, vectorsize * sizeof(t_float));
>>>>>> inR = outR;
>>>>>>
>>>>>> Works like a charm, thanks again.
>>>>>>
>>>>>> Katja
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jan 11, 2013 at 10:05 PM, Miller Puckette <[email protected]> wrote:
>>>>>>> copy_perform assumes the data is 4-byte aligned so might save a test
>>>>>>> or two compared to memcpy() - but I really don't know. I never
>>>>>>> benchmarked the two against each other :)
>>>>>>>
>>>>>>> M
>>>>>>>
>>>>>>> On Fri, Jan 11, 2013 at 09:36:41PM +0100, katja wrote:
>>>>>>>> Hi Miller,
>>>>>>>>
>>>>>>>> Thanks for the solution. The routines are in place so copying the
>>>>>>>> right channel input to output should do it. Is there any reason to
>>>>>>>> prefer copy_perform() over memcpy()? I'm trying to make the most
>>>>>>>> efficient reverb for RPi & Co.
>>>>>>>>
>>>>>>>> Katja
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jan 11, 2013 at 7:57 PM, Miller Puckette <[email protected]> wrote:
>>>>>>>>> Hi Katja -
>>>>>>>>>
>>>>>>>>> There's one example of this in sigfft_dspx() - a complex FFT that
>>>>>>>>> 'natively' works on 2 signals in-place but has to deal with various
>>>>>>>>> cases in which buffers get re-used. It's ugly but the basic idea is
>>>>>>>>> first to get the inputs copied to the outputs (unless they're already
>>>>>>>>> there in the correct order, in which case nothing needs to be done)
>>>>>>>>> and then run the in-place algorithm.
>>>>>>>>>
>>>>>>>>> If the algo only works out-of-place (i.e. you need 4 distinct
>>>>>>>>> buffers, 2 in and 2 out) the only way out is to (at least
>>>>>>>>> conditionally) allocate temporary copies of the inputs before
>>>>>>>>> writing to any outputs.
>>>>>>>>>
>>>>>>>>> I may be able to add an optional way tilde objects can request that
>>>>>>>>> output buffers be distinct from input ones sometime in the future -
>>>>>>>>> but this is a couple of steps away for me right now :)
>>>>>>>>>
>>>>>>>>> M
>>>>>>>>>
>>>>>>>>> On Fri, Jan 11, 2013 at 03:32:09PM +0100, katja wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I'm working on a Pd class with stereo channels (reverb), and the
>>>>>>>>>> routine happens to be most efficient when iterating over the samples
>>>>>>>>>> per channel, instead of left and right together in the perform loop.
>>>>>>>>>> However, when doing two while loops in one object, one for left and
>>>>>>>>>> one for right, the right channel samples get overwritten because of
>>>>>>>>>> sample-wise in-place computation. Is this an inescapable truth? I
>>>>>>>>>> mean, I could write a left channel class and a right channel class
>>>>>>>>>> (actually did that to verify that it works), but it's inconvenient to
>>>>>>>>>> use. What could be an efficient way to get them in one object?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Katja
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> [email protected] mailing list
>>>>>>>>>> UNSUBSCRIBE and account-management ->
>>>>>>>>>> http://lists.puredata.info/listinfo/pd-list
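
For reference, the idea discussed in this thread (get the inputs into the output buffers, whatever buffer sharing the host chose, then run the in-place routine per channel) might be sketched outside Pd roughly as below. This is only a minimal sketch under stated assumptions, not the code in sigfft_dspx(): sample_t stands in for Pd's t_sample, the function names are hypothetical, and the gain is merely a placeholder for a real in-place reverb routine. It also assumes the two output buffers are distinct and that any two buffers either coincide exactly or do not overlap at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef float sample_t; /* stand-in for Pd's t_sample (assumption) */

/* Hypothetical helper: make out_l hold the left input and out_r the
 * right input, for any exact aliasing between inputs and outputs. */
static void stereo_inputs_to_outputs(const sample_t *in_l, const sample_t *in_r,
                                     sample_t *out_l, sample_t *out_r, size_t n)
{
    size_t bytes = n * sizeof(sample_t);
    assert(out_l != out_r); /* outputs must be two distinct buffers */

    if (out_l == in_r && out_r == in_l) {
        /* Fully crossed layout (inL=A, inR=B, outL=B, outR=A):
         * swap the contents of the two buffers. */
        for (size_t i = 0; i < n; i++) {
            sample_t tmp = out_l[i];
            out_l[i] = out_r[i];
            out_r[i] = tmp;
        }
    } else if (out_l == in_r) {
        /* Left output shares the right input: save the right input
         * first, then overwrite it with the left input. */
        memcpy(out_r, in_r, bytes);
        if (out_l != in_l)
            memcpy(out_l, in_l, bytes);
    } else {
        /* Left output does not clobber the right input: copy only
         * where input and output are not already the same buffer. */
        if (out_l != in_l)
            memcpy(out_l, in_l, bytes);
        if (out_r != in_r)
            memcpy(out_r, in_r, bytes);
    }
}

/* Placeholder for a per-channel in-place routine (e.g. one reverb channel). */
static void channel_gain_inplace(sample_t *buf, size_t n, sample_t gain)
{
    for (size_t i = 0; i < n; i++)
        buf[i] *= gain;
}

/* Hypothetical perform routine: fix up the buffers, then process each
 * channel as a whole block, one channel after the other. */
static void stereo_perform(const sample_t *in_l, const sample_t *in_r,
                           sample_t *out_l, sample_t *out_r, size_t n)
{
    stereo_inputs_to_outputs(in_l, in_r, out_l, out_r, n);
    channel_gain_inplace(out_l, n, 0.5f);
    channel_gain_inplace(out_r, n, 0.5f);
}
```

With the crossed layout reported at the top of the thread, the first branch runs: the buffer contents are swapped once, after which each channel can be processed in place as a block and comes out of the correct outlet. Whether the swap or the memcpy() calls cost anything noticeable would still need profiling on the target machine.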
