[julia-users] Cross-correlation: rfft() VS fft() VS xcorr() performances

CrocoDuck O'Ducks Thu, 07 Apr 2016 13:37:36 -0700

Hey there cool people!

Seems like this topic will be pretty similar to this one 
<https://groups.google.com/forum/#!topic/julia-users/JsI2i80f-uU>.


So, I am trying to write my version of alignsignals 
<http://uk.mathworks.com/help/signal/ref/alignsignals.html?requestedDomain=www.mathworks.com>.
 
In few words, I calculate the cross-correlation, find the peak, calculate 
its shift from the origin of time, use it to align the signals. I got three 
version of a function to achieve this, using xcorr, fft and rfft 
respectively. When using two signals one minute long, sampled at 48 kHz, 
they benchmark <https://github.com/johnmyleswhite/Benchmarks.jl> as follows:

xcorr() version:

================ Benchmark Results ========================
     Time per evaluation: 832.16 ms [663.40 ms, 1.00 s]
Proportion of time in GC: 15.94% [14.32%, 17.57%]
        Memory allocated: 460.01 mb
   Number of allocations: 239 allocations
       Number of samples: 10
   Number of evaluations: 10
 Time spent benchmarking: 9.73 s

rfft() version:

================ Benchmark Results ========================
     Time per evaluation: 1.01 s [654.63 ms, 1.36 s]
Proportion of time in GC: 12.11% [8.30%, 15.91%]
        Memory allocated: 306.21 mb
   Number of allocations: 344 allocations
       Number of samples: 6
   Number of evaluations: 6
 Time spent benchmarking: 7.58 s

fft() version:

================ Benchmark Results ========================
     Time per evaluation: 715.45 ms [536.86 ms, 894.04 ms]
Proportion of time in GC: 20.37% [16.49%, 24.24%]
        Memory allocated: 569.87 mb
   Number of allocations: 276 allocations
       Number of samples: 13
   Number of evaluations: 13
 Time spent benchmarking: 10.56 s

It seems that the fft() version is the fastest while the rfft() version is 
the one that allocates less memory. According to my understanding, rfft() 
should also be faster with respect fft(). Unfortunately, the FFTW plan in 
the post I linked above does not seem to work. Are you able to suggest how 
to plan rfft() to make it faster? I will have to align many data 
collections sampled at 96 kHz minium, usually at least 30 s long. I plan to 
do it in a loopy structure, so it would be of great advantage to have 
quickest execution and lowest memory usage.

Here the code defining the functions:

function alignsignals{T <: Real}(x::Array{T, 1}, u::Array{T, 1})

# Delay as lag of cross correlation peak from origin of time.
lₓ = length(x)
lᵤ = length(u)
lₛ = lₓ + lᵤ - 1

sₓᵤ::Array{T, 1} = xcorr(x, u)

if mod(lₛ, 2) == 0
    ct_idx = fld(lₛ, 2)
else
    ct_idx = fld(lₛ, 2) + 1
end

_::Array{T, 1}, pk_idx::Array{Int, 1} = findmax(sₓᵤ, 1)

δ::Int = ct_idx - pk_idx[1]

# Align:
y = zeros(T, lᵤ)

if δ > 0
    y[1:(lᵤ - δ)] = u[(δ + 1):lᵤ]
else
    y[(-δ + 1):lᵤ] = u[1:(lᵤ - -δ)]
end

return y, δ

end

function rfftalignsignals{T <: Real}(x::Array{T, 1}, u::Array{T, 1}; ncores
::Int = 1)

lₓ = length(x)
lᵤ = length(u)

FFTW.set_num_threads(ncores)

x_nfft = nextprod(primes(10), lₓ)
u_nfft = nextprod(primes(10), lᵤ)
nfft = max(x_nfft, u_nfft)

X = rfft([x; zeros(nfft - lₓ)])
U = rfft([u; zeros(nfft - lᵤ)])

Sₓᵤ = conj(X) .* U

sₓᵤ = flipdim(fftshift(irfft(Sₓᵤ, 2 * length(Sₓᵤ) - 1)), 1)

lₛ = length(sₓᵤ)

if mod(lₛ, 2) == 0
    ct_idx = fld(lₛ, 2)
else
    ct_idx = fld(lₛ, 2) + 1
end

_::Array{T, 1}, pk_idx::Array{Int, 1} = findmax(sₓᵤ, 1)

δ::Int = ct_idx - pk_idx[1]

# Align:
y = zeros(T, lᵤ)

if δ > 0
    y[1:(lᵤ - δ)] = u[(δ + 1):lᵤ]
else
    y[(-δ + 1):lᵤ] = u[1:(lᵤ - -δ)]
end

return y, δ

end

function fftalignsignals{T <: Real}(x::Array{T, 1}, u::Array{T, 1}; ncores::
Int = 1)

lₓ = length(x)
lᵤ = length(u)

FFTW.set_num_threads(ncores)

x_nfft = nextprod(primes(10), lₓ)
u_nfft = nextprod(primes(10), lᵤ)
nfft = max(x_nfft, u_nfft)

X = fft([x; zeros(x_nfft - lₓ)])
U = fft([u; zeros(u_nfft - lᵤ)])

Sₓᵤ = conj(X) .* U

sₓᵤ = abs(flipdim(fftshift(ifft(Sₓᵤ)), 1))

lₛ = length(sₓᵤ)

if mod(lₛ, 2) == 0
    ct_idx = fld(lₛ, 2)
else
    ct_idx = fld(lₛ, 2) + 1
end

_::Array{T, 1}, pk_idx::Array{Int, 1} = findmax(sₓᵤ, 1)

δ::Int = ct_idx - pk_idx[1]

# Align:
y = zeros(T, lᵤ)

if δ > 0
    y[1:(lᵤ - δ)] = u[(δ + 1):lᵤ]
else
    y[(-δ + 1):lᵤ] = u[1:(lᵤ - -δ)]
end

return y, δ

end

[julia-users] Cross-correlation: rfft() VS fft() VS xcorr() performances

Reply via email to