On Sat, Oct 31, 2015 at 8:23 AM,  <[email protected]> wrote:
> Hi,
> In Julia one continuously runs into recommendations to devectorize one's
> code and write explicit loops instead. This is appealing, since loops tend
> to be conceptually more straightforward in many cases. However, when trying
> to follow this advice, I have often observed the opposite: the vector-based
> code is far superior in terms of runtime.
>
> An example is attached:
>
>
>
> function emp_L_loop(x::Array{Float64,2}, f::Array{Float64,2},
>                     fct_metric_inp::Function, fct_metric_outp::Function)
>     # returns empirical Hoelder const L computed on the basis of sample in dat
>     ns = size(x,2)  # number of sample pts
>     if ns <= 1
>         return 0
>     end
>     L = 0.
>     vx = x[:,1]
>     vx2 = x[:,1]
>     vf = f[:,1]
>     vf2 = f[:,1]
>     for i = 1:ns
>         for j = 1:i-1
>             vx = x[:,i]
>             vx2 = x[:,j]
>             dx = fct_metric_inp(vx,vx2)
>             if dx > 0
>                 vf = f[:,i]
>                 vf2 = f[:,j]
>                 L = max(L, fct_metric_outp(vf,vf2) ./ dx)  # Lip const estimator
>             end
>         end
>     end
>     return L
> end
>

This is not "devectorizing" at all. Your "loop" still allocates a large
number of temporary arrays (one per column slice, on every iteration),
and that allocation is what slows things down.

If you only want to make a small change to your code above, hoist the
indexing on i out of the loop over j, and use a SubArray or an ArrayView
instead of creating a new array every time; non-copying views will
hopefully become the default indexing behavior sometime in 0.5.
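A minimal sketch of that change, under the assumption that the metric
functions accept views (the function name `emp_L_views` is made up here;
`sub` creates a non-copying SubArray in 0.4, spelled `view` in later
versions):

```julia
# Sketch: same pairwise loop as emp_L_loop, but the views for column i
# are hoisted out of the inner loop and no slice copies are allocated.
function emp_L_views(x::Array{Float64,2}, f::Array{Float64,2},
                     fct_metric_inp::Function, fct_metric_outp::Function)
    ns = size(x, 2)   # number of sample points
    ns <= 1 && return 0.0
    L = 0.0
    for i = 1:ns
        vx = sub(x, :, i)   # hoisted: does not depend on j
        vf = sub(f, :, i)
        for j = 1:i-1
            vx2 = sub(x, :, j)
            dx = fct_metric_inp(vx, vx2)
            if dx > 0
                vf2 = sub(f, :, j)
                L = max(L, fct_metric_outp(vf, vf2) / dx)
            end
        end
    end
    return L
end
```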

If you actually want to devectorize it, write out the loop over each element.
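For the max-norm metric used below, a fully devectorized version might
look like the following sketch (the name `emp_L_devec` is hypothetical,
and it is specialized to the max-norm rather than taking metric
functions as arguments):

```julia
# Sketch: fully devectorized estimator, specialized to the max-norm
# metric so the innermost loops run over scalar elements only and
# allocate nothing per pair.
function emp_L_devec(x::Array{Float64,2}, f::Array{Float64,2})
    nd, ns = size(x)
    ns <= 1 && return 0.0
    L = 0.0
    for i = 1:ns, j = 1:i-1
        # max-norm distance between columns i and j of x, element by element
        dx = 0.0
        for k = 1:nd
            dx = max(dx, abs(x[k,i] - x[k,j]))
        end
        if dx > 0
            df = 0.0
            for k = 1:size(f,1)
                df = max(df, abs(f[k,i] - f[k,j]))
            end
            L = max(L, df / dx)   # Lipschitz constant estimate
        end
    end
    return L
end
```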

>
>
> function emp_L_vectorized(x::Array{Float64,2}, f::Array{Float64,2},
>                           fct_metric_inp::Function, fct_metric_outp::Function)
>     ns = size(x,2)  # number of sample pts
>     if ns <= 1
>         return 0
>     end
>     L = 0
>     for j = 1:ns
>         if j+1 < ns
>             xcmp = x[:,j+1:end]
>             fcmp = f[:,j+1:end]
>             nxcmp = size(xcmp,2)
>             X = repmat(x[:,j],1,nxcmp)
>             F = repmat(f[:,j],1,nxcmp)
>             m_rowvec = fct_metric_inp(X,xcmp)
>             inds = m_rowvec .> 0
>             m_rowvec = m_rowvec[inds]
>             diffs_f = fct_metric_outp(F,fcmp)
>             diffs_f = diffs_f[inds]
>             #diffs_f = abs(sample_f_old-f[i]) - hpa.alpha
>             L = max(L, maximum(diffs_f./m_rowvec))
>         end
>     end
>     return L
> end
>
> #--------------------metrics-----------------:
> # assume they compute metrics between x[:,i] and y[:,i] for i = 1...
> # result is a row vector of distances
> vecmetric_maxnorm(x::Array{Float64,2},y::Array{Float64,2}) =
>     maximum(abs(x-y),1)  # treats matrices as collections of vectors
> vecmetric_maxnorm(x::Array{Float64,1},y::Array{Float64,1}) =
>     maximum(abs(x-y))
>
> # ========= main code:
>
> x = 1.*rand(1,4000)
> f = (x) -> sqrt(x)
>
> fx = f(x)
>
>
> @time emp_L_loop(x,fx,vecmetric_maxnorm,vecmetric_maxnorm)
> # result: elapsed time: 14.265720419 seconds (6526748828 bytes allocated, 64.87% gc time)
>
>
> @time emp_L_vectorized(x,fx,vecmetric_maxnorm,vecmetric_maxnorm)
> # elapsed time: 1.912960284 seconds (860891136 bytes allocated, 67.08% gc time)
>
>
> I have also observed that the vectorized metrics (attached) outperform
> the standard implementations on my machine.
> What am I doing wrong?
>
>
