Re: [julia-users] Re: compilation time depends on data size?

Ross Boylan Sat, 26 Jul 2014 12:01:25 -0700

On Sat, 2014-07-26 at 01:38 -0700, Ivar Nesje wrote:
> It is hard to test my assumptions when you don't post a complete
> runnable example, and profiling is a very useful tool in situations
> like this. It is also a little unclear what you mean by compile time,
> as Julia compiles the code while executing (JIT compiling).
Your assumptions all seem to have been correct.  Sorry I didn't include
the code; I wasn't sure if attachments were OK and I hadn't reduced it
to a minimal example.  Partially corrected version attached.


I should have said execute
include("/home/ross/UCSF/HIVRace/simpleMatch.jl"),
rather than "compile".

Profiling this indicates almost all the time going to the 
    affinity += beta[x[i], x[j]]
inside the loop you highlighted (next quoted section).  At least I think
it does; the profile says
no file; anonymous; line: 27
and that is line 27 of simpleMatch.jl.  I'm not sure why the file isn't
identified.  (Full profile at the bottom).

> 
> After a simple reading of the posted code the following line stood
> out .
> 
> 
> for i = 1:npop, j=1:npop 
>     affinity += beta[x[i], x[j]] 
> end 
> 
> This loop runs for 250 * 500 iterations and allocates a 500 * 500
Why isn't that 500*500 iterations?  BTW I lowered npop to 300 for the
profile.
>  array on each iteration (because for some reason Julia has decided
> that  x += y is not a special in place increment operation but just a
> short alias for x = x + y).  It seems to me that this could be written
> in a much more efficient way without creating temporary arrays.

How do you know a temporary is being created? 

Oh.. that code is completely wrong.  It should be
   affinity[i, j] += beta[x[i], x[j]]

So it was 500^3 operations.  With that fix, everything is back to
"instantaneous".  And the simulation results are looking much more
sensible.


> 
> Another thing that might cause performance grief is that you have 
> 
> 
> cut :: Vector{FloatingPoint}
> # Instead of 
> cut::Vector{Float64}
> 
Will do.
> FloatingPoint is a abstract type that lets you store different
> floating point types in the array (like Float16, Float32, Float64,
> Float128 and BigFloat). Unfortuenatly that means that for all
> operations on the array Julia needs to check what type the element has
> and find the right method for the operation. That is much slower than
> when it compile time knows the type to be Float64
> 
> 
> The last line:
> 
> 
> fill!(actors, Actor()) 
> 
> 
> 
> Seems like a bug, because the actors array will be filled with
> pointers to the same Actor object, and if you change one of them, all
> will be changed.

Thanks for helping me learn.  That was another big problem.

Here's the profile of the defective attached code:

WARNING: The profile data buffer is full; profiling probably terminated
before your program finished. To profile for longer runs, call
Profile.init
with a larger buffer and/or larger delay.
1     ...src/unix/linux-core.c; uv__hrtime; line: 333
13258 task.jl; anonymous; line: 96
 13258 REPL.jl; eval_user_input; line: 54
  13258 profile.jl; anonymous; line: 14
   13258 ./loading.jl; include_from_node1; line: 128
    13258 ./boot.jl; include; line: 245
     5     ./inference.jl; typeinf_ext; line: 1214  # I assume this is
the compiler
      1 ./inference.jl; typeinf; line: 1296
      1 ./inference.jl; typeinf; line: 1402
       1 ./inference.jl; abstract_interpret; line: 1088
        1 ./inference.jl; abstract_eval; line: 933
         1 ./inference.jl; abstract_eval_call; line: 881
      1 ./inference.jl; typeinf; line: 1509
       1 ./inference.jl; stupdate; line: 1167
      1 ./inference.jl; typeinf; line: 1525
       1 ./inference.jl; inlining_pass; line: 2522
        1 ./inference.jl; inlining_pass; line: 2559
         1 ./inference.jl; inlining_pass; line: 2623
          1 ./inference.jl; inlineable; line: 2354
           1 ./inference.jl; effect_free; line: 1976
            1 ./inference.jl; effect_free; line: 1976
             1 ./inference.jl; effect_free; line: 1927
              1 ./inference.jl; is_known_call_p; line: 2763
      1 ./inference.jl; typeinf; line: 1529
       1 ./inference.jl; tuple_elim_pass; line: 2937
        1 ./inference.jl; remove_redundant_temp_vars; line: 2790
         1 ./reduce.jl; mapreduce; line: 172
          1 ./reduce.jl; _mapreduce; line: 162
     13235 no file; anonymous; line: 27  # I assume this is in my
program file
      1    ...buv/src/unix/core.c; uv_hrtime; line: 86
      2450 array.jl; +; line: 770
       199  array.jl; .+; line: 756
       2250 array.jl; .+; line: 758
       1    array.jl; .+; line: 760
     2     random.jl; randn; line: 260
      1 random.jl; randn!; line: 257
       1 dSFMT.jl; randmtzig_randn; line: 598

Ross
> 
> Regards Ivar
> 
> 
> kl. 09:01:12 UTC+2 lørdag 26. juli 2014 skrev Ross Boylan følgende:
>         In julia 0.3, my module includes 
>         const npop = 500  #population size 
>         This takes a very noticeable (1 minute?) amount of time to
>         compile. 
>         
>         If I set const npop=50 instead, compilation is instant (i.e.,
>         no 
>         perceptible delay). 
>         
>         The module does create some Arrays that have npop as one of
>         their 
>         dimensions.  Some are 500 x 40. 
>         
>         Is this behavior expected? 
>         
>         I might believe different code is generated when npop is very
>         small, but 
>         for 50 is already well over what I'd call small. 
>         
>         Ross Boylan 
>         
>         P.S. the module level variables involving npop are (with
>         nstep=40, 
>         n1=250) 
>         
>         counts = zeros(Int, 3, nstep) 
>         
>         affinity = randn(npop, npop)  # will be modified later 
>         
>         x = [ i<=n1 ? 1 : 2 for i in 1:npop] 
>         
>         beta = [0.5 -0.5; -0.5 0.5] 
>         
>         for i = 1:npop, j=1:npop 
>             affinity += beta[x[i], x[j]] 
>             end 
>         
>         type Actor 
>             cut :: Vector{FloatingPoint}  # track cut point for each
>         time 
>             matches :: Matrix{Int}  # 2xtime successful matches with
>         each type 
>         end 
>         
>         Actor() = Actor(2ones(FloatingPoint, nstep), zeros(Int, 2,
>         nstep)) 
>         
>         actors = Array(Actor, npop) 
>         fill!(actors, Actor()) 
>         
>         I also specify the random seed, and so in principle everything
>         could be 
>         set up at compile time.   
>

module Sim

const npop = 300  #population size
const nstep = 40  # number of time steps
const n1 = floor(npop*0.5)  # 1:n1 have type 1, rest are type 2
const delta_cut = 0.1  # amount to move cut
const ncandidates = 5  # number of partners to try

srand(1234) # reproducible random numbers

# counts records successful matches for all actors at a given time
# the count is of the matches, not the view from individuals
# first index 1 = 1-1 match; 2 = 2-2 match; 3 = 1-2 or 2-1 match
counts = zeros(Int, 3, nstep)

# affinity[i, j] is how much i likes j as a partner
# we begin with the random component
affinity = randn(npop, npop)  # will be modified later

# covariates.  just one for type 1 or 2
x = [ i<=n1 ? 1 : 2 for i in 1:npop]

#beta[i, j] what type i thinks of type j
beta = [0.5 -0.5; -0.5 0.5]

for i = 1:npop, j=1:npop
    affinity += beta[x[i], x[j]]
    end
# affinity is now set and assumed constant across time

type Actor
    cut :: Vector{FloatingPoint}  # track cut point for each time
    matches :: Matrix{Int}  # 2xtime successful matches with each type
end

Actor() = Actor(ones(FloatingPoint, nstep), zeros(Int, 2, nstep))

actors = [Actor() for i in 1:npop]


# return tuple of boolean that is true if enough data, estimated success rate
# as of time now
function success_rate(a::Actor, now::Int)
    const max_window = 15
    const min_window = 4
    if now == 1
        return false, 0.0
        end
    cut = a.cut[now]
    lowest = max(1, now-max_window)
    first_same = now
    for first_same in now-1:-1:lowest
        if cut != a.cut[first_same]
            first_same += 1
            break
            end
        end
    if first_same+min_window-1 > now
        return false, 0.0
        end
    # we have enough for an average
    # might weight this by time at some point
    return true, mean(sum(a.matches[:, first_same:now], 1))
    end

# set cut for now+1
function adjust_cut!(a::Actor, now::Int)
    const target_rate = 1.0  # desired matches per tick
    enough, rate = success_rate(a, now)
    cutnow = a.cut[now]
    if ! enough
        a.cut[now+1] = cutnow
     else
         if rate > target_rate*1.15
             a.cut[now+1] = cutnow+delta_cut
         elseif rate < target_rate*.85
             a.cut[now+1] = cutnow-delta_cut
         else
             a.cut[now+1] = cutnow
             end
        end
    end

# return true if j is an acceptable match for i
# references global affinity and actors
function accept(i::Int, j::Int, now::Int)
    global affinity, actors
    return affinity[i, j] >= actors[i].cut[now]
    end

# record a successful match
function match!(i::Int, j::Int, now::Int)
    global actors, counts
    ti = x[i]
    tj = x[j]
    if ti == tj
        counts[ti, now] += 1
    else
        counts[3, now] += 1
    end
    actors[i].matches[tj, now] += 1
    actors[j].matches[ti, now] += 1
    end
        
function go()
    for time = 1:nstep
        for i = 1:npop
            js = unique(rand(1:npop,  ncandidates))
            for j in js
                if i != j && accept(i, j, time) && accept(j, i, time)
                    match!(i, j, time)
                    end
                end
            end
        # must adjust cuts after all matches done
        if time < nstep
            for i = 1:npop
                adjust_cut!(actors[i], time)
                end
            end
        end  # of time loop
    counts
    end

# analysis

# If we did a census and asked each person to list their
# partners these are the totals we would get per group
# indices are source group, dest group, time
function count_partners(counts::Matrix{Int})
    dims = size(counts)
    function counter(i::Int, j::Int, t::Int)
        if i == j
            2*counts[i, t]
        else
            counts[3, t]
        end
        end
        
    [counter(i, j, t) for i=1:2, j=1:2, t=1:nstep]
    end

# get self share from the ties NOT the partner counts
function share(counts::Matrix{Int})
    cs = count_partners(counts)
    function inner(i::Int, t::Int)
        cs[i, i, t]/sum(cs[i, :, t])
        end
    [inner(i, t) for i=1:2, t=1:nstep]
    end


end

Re: [julia-users] Re: compilation time depends on data size?

Reply via email to