Still struggling with the stereo layout. I stated earlier that this
worked well (for an in place routine):
memcpy(outR, inR, vectorsize * sizeof(t_float));
inR = outR;
The output sounded all right. But upon closer inspection, it turned
out that all the stereo sound came from the right input and
Function copy_perform8() is also eligible for SIMD processing. I used
memcpy() because it is straightforward to use, while Pd's functions
pointed to the wrong locations for this case. On the reverb's total
load there is no significant performance difference.
Katja
On Sat, Jan 12, 2013 at 1:00
If you are interested, there is still the hand-coded SIMD stuff from pd-devel:
https://pure-data.svn.sourceforge.net/svnroot/pure-data/branches/pd-devel/v0-39
.hc
On 01/12/2013 09:34 AM, katja wrote:
Function copy_perform8() is also eligible for SIMD processing. I used
memcpy() because it is
It's interesting, but rather compiler-and-processor-specific. Such
code is maintenance-intensive. At the moment, ARM processors are
screaming loudest for optimization. Best thing for a community project
is probably plain C code which reckons with parallel processing,
because that won't go away for
Yeah, that makes sense. With all the auto-vectorization and SIMD support in
recent versions of gcc, it seems a better approach is to tailor the C code to
work well with SIMD-aware compilers.
.hc
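
As an illustration of plain C that a SIMD-aware compiler can work with, here is a minimal sketch. The function scale_and_mix() and its parameters are hypothetical and not taken from Pd or from the reverb discussed in this thread. It assumes C99 (for restrict) and that the three buffers never overlap. Whether the loop is actually vectorized depends on the compiler, the target, and the optimization flags (for example gcc at -O3).

```c
#include <assert.h>

/* Hypothetical DSP kernel, written so an auto-vectorizing compiler has
   an easy job: restrict-qualified pointers, unit stride, and no
   branches or function calls inside the loop. */
void scale_and_mix(float *restrict out, const float *restrict dry,
                   const float *restrict wet, float gain, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        out[i] = dry[i] + gain * wet[i];
}
```

Note that restrict is a promise to the compiler that the buffers are distinct. It would be wrong for an in-place routine where an output buffer is also an input buffer, so in that case the qualifier has to be dropped.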
On 01/12/2013 04:45 PM, katja wrote:
It's interesting, but rather compiler-and-processor-specific.
Hello,
I'm working on a Pd class with stereo channels (reverb), and the
routine happens to be most efficient when iterating over the samples
per channel, instead of left and right together in the perform loop.
However, when doing two while loops in one object, one for left and
one for right, the
Hi Katja -
There's one example of this in sigfft_dspx() - a complex FFT that 'natively'
works on 2 signals in-place but has to deal with various cases in which
buffers get re-used. It's ugly but the basic idea is first to get the
inputs copied to the outputs (unless they're already there in the
Hi Miller,
Thanks for the solution. The routines are in place so copying the
right channel input to output should do it. Is there any reason to
prefer copy_perform() over memcpy()? I'm trying to make the most
efficient reverb for RPi & Co.
Katja
On Fri, Jan 11, 2013 at 7:57 PM, Miller Puckette
copy_perform assumes the data is 4-byte aligned so might save a test
or two compared to memcpy() - but I really don't know. I never
benchmarked the two against each other :)
M
On Fri, Jan 11, 2013 at 09:36:41PM +0100, katja wrote:
Hi Miller,
Thanks for the solution. The routines are in
Ok so I did the ugly thing with the right channel input and output pointers:
memcpy(outR, inR, vectorsize * sizeof(t_float));
inR = outR;
Works like a charm, thanks again.
Katja
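
Since buffers can get re-used between inlets and outlets, a copy step that checks for shared buffers before writing may be safer than an unconditional memcpy(). The following is a minimal sketch of that idea, not the code from Pd: the function name stereo_inputs_to_outputs() is hypothetical, and it assumes that two buffers are either the very same buffer or do not overlap at all, and that the two outputs are distinct.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of Pd: get both inputs into their
   outputs so that an in-place stereo routine can then work on
   outL and outR only. */
void stereo_inputs_to_outputs(const float *inL, const float *inR,
                              float *outL, float *outR, int n)
{
    assert(n >= 0);
    size_t bytes = (size_t)n * sizeof(float);

    if (outL == inR && outR == inL) {
        /* Fully crossed buffers: swap sample by sample. */
        for (int i = 0; i < n; i++) {
            float tmp = outL[i];
            outL[i] = outR[i];
            outR[i] = tmp;
        }
    } else if (outL == inR) {
        /* Left output shares the right input: save the right input
           first, then overwrite it with the left input. */
        if (outR != inR) memcpy(outR, inR, bytes);
        if (outL != inL) memcpy(outL, inL, bytes);
    } else {
        /* Left output does not hold the right input, so the left
           channel can be written first. */
        if (outL != inL) memcpy(outL, inL, bytes);
        if (outR != inR) memcpy(outR, inR, bytes);
    }
}
```

After this call, the perform routine would read and write outL and outR only. The pointer comparisons also avoid calling memcpy() with identical source and destination.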
On Fri, Jan 11, 2013 at 10:05 PM, Miller Puckette m...@ucsd.edu wrote:
copy_perform assumes the data is 4-byte
I recently learned that libc's memcpy actually uses things like SSE2 or SSSE3
so it can be quite fast on CPUs from the past 10 years, especially of the last
5 years.
It would be worth profiling to see if that's noticeable.
.hc
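
A rough way to profile this could look like the sketch below. All function names here are hypothetical; loop_copy() merely stands in for a hand-written copy routine and is not Pd's copy_perform(). It uses clock() from the C standard library, which measures CPU time at coarse resolution, so many repetitions are needed. An optimizing compiler may also remove or merge repeated copies, so results should be treated with caution.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Plain loop copy, standing in for a hand-written copy routine. */
void loop_copy(float *dst, const float *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i];
}

/* The same copy done with libc's memcpy(). */
void memcpy_copy(float *dst, const float *src, int n)
{
    memcpy(dst, src, (size_t)n * sizeof(float));
}

/* Returns the CPU seconds spent on `reps` copies of an n-sample block. */
double time_copy(void (*copy)(float *, const float *, int),
                 float *dst, const float *src, int n, long reps)
{
    assert(n >= 0 && reps >= 0);
    clock_t start = clock();
    for (long r = 0; r < reps; r++)
        copy(dst, src, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy() once with loop_copy and once with memcpy_copy, on a block of the usual vector size and with a large repetition count, gives two numbers to compare on the target machine.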
On 01/11/2013 05:12 PM, katja wrote:
Ok so I did the ugly thing