>
> 40 days old master

The recent tuple changes (#10380) were merged after that and represent a
very substantial change, so comparing top-of-trunk to a 40-day-old version
isn't very useful. I would suggest trying a comparably recent (ideally
identical) build under the VM before drawing any conclusions.

Also, this could very well be a test case for a performance regression from
#10380.
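Since the inner loop is dominated by exp calls on pairwise squared distances, one way to factor the Julia build out of the comparison entirely is to time the same computation in another runtime on both machines. Below is a hypothetical NumPy port of the quoted test_func (a sketch of mine, not code from the gist; the function name is made up):

```python
import numpy as np

def kde_with_min_dist(data, points):
    # Hypothetical NumPy port of the quoted Julia test_func: Gaussian KDE
    # density at each query point, plus distance to the nearest *other* point.
    n, d = data.shape

    hbar = n ** (-1.0 / (d + 4.0))          # same bandwidth as the Julia code
    hbar2 = hbar ** 2
    constant = 1.0 / (n * hbar ** d * (2 * np.pi) ** (d / 2))

    # pairwise squared distances, shape (n_points, n)
    diff = points[:, None, :] - data[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)

    density = constant * np.exp(-0.5 * d2 / hbar2).sum(axis=1)

    # the Julia loop skips i == j when taking the minimum, which assumes
    # points and data are the same set; mask the diagonal accordingly
    d2 = d2.copy()
    np.fill_diagonal(d2, np.inf)
    di_min = np.sqrt(d2.min(axis=1))
    return density, di_min
```

Timing this on both OS X and inside the docker image (e.g. with time.perf_counter) would show whether the gap tracks the OS/libm or the particular Julia commit.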

On Wed, Apr 29, 2015 at 4:50 PM, Spencer Lyon <[email protected]>
wrote:

> I ran into strange performance issues in an algorithm I have been working
> on.
>
> I have a test case as well as some timing and profiler results at this
> gist: https://gist.github.com/spencerlyon2/d21d6368a2ccbf6f1e7b
>
>
> I summarize the issues here. Consider the following code (note that I am
> defining myexp because of this issue:
> https://github.com/JuliaLang/julia/issues/11048 -- it turns out that on OS
> X, calling Apple's libm gives a substantial speed-up, i.e. I'm doing
> everything I can to give OS X a chance to win here).
>
> the code:
>
> @osx? (
>          begin
>              myexp(x::Float64) = ccall((:exp, :libm), Float64, (Float64,), x)
>              # myexp(x::Float64) = exp(x)
>          end
>        : begin
>              myexp(x::Float64) = exp(x)
>          end
>        )
>
> function test_func(data::Matrix, points::Matrix)
>     # extract input dimensions
>     n, d = size(data)
>     n_points = size(points, 1)
>
>     # transpose data and points to access columns at a time
>     data = data'
>     points = points'
>
>     # Define constants
>     hbar = n^(-1.0/(d+4.0))
>     hbar2 = hbar^2
>     constant = 1.0/(n*hbar^(d) * (2π)^(d/2))
>
>     # allocate space
>     density = Array(Float64, n_points)
>     Di_min = Array(Float64, n_points)
>
>     # apply formula (2)
>     for i=1:n_points  # loop over all points
>         dens_i = 0.0
>         min_di2 = Inf
>         for j=1:n  # loop over the n data points (n == n_points in this test)
>             d_i2_j = 0.0
>             for k=1:d  # loop over d
>                 @inbounds d_i2_j += ((points[k, i] - data[k, j])^2)
>             end
>             dens_i += myexp(-0.5*d_i2_j/hbar2)
>             if i != j && d_i2_j < min_di2
>                 min_di2 = d_i2_j
>             end
>         end
>         density[i] = constant * dens_i
>         Di_min[i] = sqrt(min_di2)
>     end
>
>     return density, Di_min
> end
>
>
>
> To test the performance of this code on linux and OS X, I started up a
> docker image with a recent (40-day-old master) julia from my OS X machine
> and compared the timing against running it on OS X directly (with a
> 1-day-old julia). I found that for `data, points = randn(9500, 2)` the
> linux version takes about 2.6 seconds to run `test_func`, whereas on OS X
> it takes about 9.3 seconds.
>
> I can't explain this large (almost 4x) performance hit that I get from
> running the code on the native OS vs the virtual machine.
>
> More details (profiler results, timing stats, self-contained runnable
> example) in the gist:
> https://gist.github.com/spencerlyon2/d21d6368a2ccbf6f1e7b
