If you are looking for an example of a SIMD IIR filter implementation, it is
actually possible under some restrictions, although the benefit only shows
up for large filter orders (which are rarely used in audio).
The main problem is that, AFAIK, there is no single instruction that sums
the floats in all 4 slots; you must do it by shuffling, which sometimes
cancels out the benefit.
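As a rough illustration of the shuffling, here is a minimal sketch of a
horizontal sum using only SSE1 intrinsics. The function name hsum_ps is
hypothetical. SSE3 does add a horizontal add (_mm_hadd_ps), but it still has
to be applied twice to reduce all four lanes.

```c
#include <xmmintrin.h>  /* SSE1 intrinsics */

/* Sum the four float lanes of v using two shuffles and two adds.
 * hsum_ps is a hypothetical helper name, not a library function. */
static float hsum_ps(__m128 v)
{
    /* swap neighbours inside each pair: [a b c d] -> [b a d c] */
    __m128 shuf = _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 3, 0, 1));
    /* sums = [a+b, a+b, c+d, c+d] */
    __m128 sums = _mm_add_ps(v, shuf);
    /* bring the upper pair down: lane 0 of shuf becomes c+d */
    shuf = _mm_movehl_ps(shuf, sums);
    /* lane 0 = (a+b) + (c+d) */
    sums = _mm_add_ss(sums, shuf);
    return _mm_cvtss_f32(sums);
}
```

That is four extra instructions per output sample just for the reduction,
which is the overhead that can eat up the gain for low filter orders.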
You may want to look at these two papers about IIR filtering with SIMD to
get some ideas:
http://saluc.engr.uconn.edu/refs/processors/intel/mmx_sse/iir_fir.pdf (the
classic Intel application note AP598)
http://www.cosy.sbg.ac.at/~rkutil/publication/Kutil08b.pdf
Have fun!
M.
-Original Message-
From: music-dsp-boun...@music.columbia.edu
[mailto:music-dsp-boun...@music.columbia.edu] On behalf of Peter S
Sent: Wednesday, 15 April 2015 15:00
To: A discussion list for music-related DSP
Subject: Re: [music-dsp] recursive SIMD?
On 11/04/2015, Eric Christiansen eric8939...@gmail.com wrote:
I haven't done much with SIMD in the past, so my experience is pretty
low, but my understanding is that each data piece must be defined
prior to the operation, correct? Meaning that you can't use result of
the operation of one piece of data as the source data for the next
operation, right?
This came up in thinking about how to optimize an anti-aliasing routine.
If, for example, the process is oversampling by 4 and running each
through a low pass filter and then averaging the results, I was
wondering if there's some way of using some SIMD process to speed this
up, specifically the part sending each sample through the filter.
Since each piece has to go through sequentially, I would need to use
the result of the first filter tick as the input for the second filter
tick.
But that's not possible, right?
Technically you can do it, by keeping the previous result in some temporary
register, but since you cannot parallelize the recursion (unless you have
actual parallel filters), it rarely gives much speedup for IIR filters, if
any. So it's probably not worth the hassle for recursive filters.
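For the "actual parallel filters" case, a minimal sketch could look like
this: four independent one-pole filters, y[n] = a*x[n] + b*y[n-1], one per
SSE lane. The type and function names (onepole4, onepole4_init,
onepole4_tick) are hypothetical, and the one-pole form is only an assumption
chosen to keep the example short.

```c
#include <xmmintrin.h>  /* SSE1 intrinsics */

/* Hypothetical state for four independent one-pole filters,
 * one filter per SIMD lane. */
typedef struct {
    __m128 a;   /* input gain, per lane */
    __m128 b;   /* feedback gain, per lane */
    __m128 y1;  /* previous output y[n-1], per lane */
} onepole4;

/* Give all four filters the same coefficients and clear their state. */
static void onepole4_init(onepole4 *f, float a, float b)
{
    f->a  = _mm_set1_ps(a);
    f->b  = _mm_set1_ps(b);
    f->y1 = _mm_setzero_ps();
}

/* x carries one new input sample for each of the four filters.
 * Each lane only feeds back into itself, so the recursion stays
 * sequential in time but runs four filters per instruction. */
static __m128 onepole4_tick(onepole4 *f, __m128 x)
{
    f->y1 = _mm_add_ps(_mm_mul_ps(f->a, x), _mm_mul_ps(f->b, f->y1));
    return f->y1;
}
```

This only helps when there really are four separate signals (channels,
voices, bands); it does not speed up a single filter on a single stream.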
Quote from a post from 2007:
- Original Message -
From: Eric Brombaugh ebrombaugh at earthlink.net
To: A discussion list for music-related DSP music-dsp at
music.columbia.edu
Sent: Friday, October 05, 2007 11:38 PM
Subject: Re: [music-dsp] Cascaded biquad filter structures
You can vectorize a cascade of biquads if you're willing to accept
some transport delay - just insert pipelines between the stages to
hold the previous results.
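One possible reading of that idea, sketched for four stages in SSE: lane k
holds biquad stage k, and each stage takes as input what the stage before it
produced on the previous tick. The names (cascade4, cascade4_clear,
cascade4_tick) and the transposed direct form II layout are assumptions made
for illustration, not the code from the original post.

```c
#include <xmmintrin.h>  /* SSE1 intrinsics */

/* Hypothetical pipelined cascade of four biquads, lane k = stage k. */
typedef struct {
    __m128 b0, b1, b2, a1, a2;  /* coefficients, one stage per lane */
    __m128 z1, z2;              /* transposed direct form II state */
    __m128 out;                 /* pipeline registers: last stage outputs */
} cascade4;

/* Reset filter state and pipeline registers (coefficients untouched). */
static void cascade4_clear(cascade4 *c)
{
    c->z1 = c->z2 = c->out = _mm_setzero_ps();
}

/* Process one input sample. The result is the output of the last stage,
 * delayed by 3 samples relative to a non-pipelined cascade. */
static float cascade4_tick(cascade4 *c, float x)
{
    /* in = [x, out0, out1, out2]: shift the previous outputs up a lane */
    __m128 in = _mm_shuffle_ps(c->out, c->out, _MM_SHUFFLE(2, 1, 0, 0));
    in = _mm_move_ss(in, _mm_set_ss(x));

    /* all four stages computed at once; no lane depends on another */
    __m128 y = _mm_add_ps(_mm_mul_ps(c->b0, in), c->z1);
    c->z1 = _mm_add_ps(_mm_sub_ps(_mm_mul_ps(c->b1, in),
                                  _mm_mul_ps(c->a1, y)), c->z2);
    c->z2 = _mm_sub_ps(_mm_mul_ps(c->b2, in), _mm_mul_ps(c->a2, y));
    c->out = y;

    /* extract lane 3, the output of the final stage */
    return _mm_cvtss_f32(_mm_shuffle_ps(y, y, _MM_SHUFFLE(3, 3, 3, 3)));
}
```

With pass-through coefficients (b0 = 1, everything else 0) the input simply
comes out three ticks later, which is the transport delay mentioned above.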
A few years back I coded up an FIR and an IIR biquad in SSE. I got
about 2x speed improvement in the FIR version over plain optimized GCC
with floats, but the IIR implementation was about 70% slower in SSE
than plain optimized GCC.
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews,
dsp links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp