The performance penalty from global variables comes from the fact that 
non-`const` global variables are type-unstable: their type can change at 
any time, so the compiler has to generate very defensive, and thus slow, 
code. The same is, for now, also true of anonymous functions and functions 
passed as arguments to other functions, so switching from one paradigm to 
the other will unfortunately not help much with performance. I haven't 
looked closely at your code, but the 
[FastAnonymous.jl](https://github.com/timholy/FastAnonymous.jl) package can 
probably help make the anonymous-function version faster.
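As a minimal sketch of the instability issue (the names `glob`, 
`uses_global`, and `uses_argument` are invented for illustration): reading a 
non-`const` global inside a function forces the compiler to handle any 
possible type, while the same value passed as an argument gets specialized 
code.

```
# `glob` is a non-const global: its type may change at any time,
# so the compiler cannot specialize code that reads it.
glob = 3

function uses_global()
    s = 0
    for i in 1:1000
        s += glob            # type of `glob` unknown at compile time
    end
    s
end

function uses_argument(g)
    s = 0
    for i in 1:1000
        s += g               # type of `g` known from the call site
    end
    s
end
```

You can see the difference yourself with `@code_warntype`, which flags the 
global access as type `Any` in the first function but not the second.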

If the data you want to pass along doesn't change, using `const` globals is 
fine (at least from a performance perspective; you might have other reasons 
to avoid them), i.e. you can do the following and it will also be 
faster:

```
const data = 3

function model(x)
    # use data somehow
end

# use the model function in your optimization
```

If you need the data to change, you can still use a `const` global 
variable, but bind it to a mutable (and type-stable!) object, for example:

```
type MyData
    data::Int
end

const data = MyData(3)

function model(x)
    data.data += 1
    # ...
end

# use the model function in your optimization
# it can now modify the global data on each invocation
```

This doesn't necessarily provide the cleanest interface, but it should be 
more performant than your current solution.
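Applied to your setup, the same idea could look something like the sketch 
below, which binds a `const` global to a concretely typed container and 
swaps its contents per subject. The names `SUBDATA`, `set_subdata!`, and 
`loglike_sketch` are invented for illustration, and the likelihood 
arithmetic is simplified:

```
# const binding, mutable contents: always a Vector{Float64}
const SUBDATA = Float64[]

function set_subdata!(xs)
    empty!(SUBDATA)            # reuse the same (type-stable) vector
    append!(SUBDATA, xs)
end

# Negative log-likelihood of Normal(mu, exp(logsigma)), up to a constant
function loglike_sketch(parms)
    mu, logsigma = parms[1], parms[2]
    s = exp(logsigma)
    ll = 0.0
    for x in SUBDATA
        ll += ((x - mu) / s)^2 / 2 + logsigma
    end
    ll
end
```

Each loop iteration would then call `set_subdata!(data1[idx,2])` before 
handing `loglike_sketch` to `optimize`; since `SUBDATA` always holds a 
`Vector{Float64}`, the compiler can generate fast code for the inner loop.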

// T
    

On Monday, September 28, 2015 at 6:53:22 PM UTC+2, Christopher Fisher wrote:
>
> Thanks for your willingness to investigate the matter more closely. I 
> cannot post the exact code I am using (besides its rather long). However, I 
> posted a toy example that follows the same basic operations. Essentially, 
> my code involves an outer function (SubLoop) that loops through a data set 
> with multiple subjects. The model is fit to each subject's data. The other 
> function (LogLike) computes the log likelihood and is called by optimize. 
> The first set of code corresponds to the closure method and the second set 
> of code corresponds to the global variable method. In both cases, the code 
> executed in about .85 seconds over several runs on my computer and has 
> about 1.9% gc time. I'm also aware that my code is probably not optimized 
> in other regards. So I would be receptive to any other advice you might 
> have. 
>
>
>  
>
> using Distributions,Optim
>
> function SubLoop1(data1)
>     function LogLike1(parms)
>         L = pdf(Normal(parms[1],exp(parms[2])),SubData)
>         LL = -sum(log(L))
>     end
>     #Number of Subjects
>     Nsub = size(unique(data1[:,1],1),1)
>     #Initialize per subject Data
>     SubData = []
>     for i = 1:Nsub
>         idx = data1[:,1] .== i
>         SubData = data1[idx,2]
>         parms0 = [1.0;1.0]
>         optimize(LogLike1,parms0,method=:nelder_mead)
>     end
> end
>
> N = 10^5
> #Column 1 subject index, column 2 value
> Data = zeros(N*2,2)
> for sub = 1:2
>     Data[(N*(sub-1)+1):(N*sub),:] = [sub*ones(N) rand(Normal(10,2),N)]
> end
> @time SubLoop1(Data)
>
> using Distributions, Optim
>
> function SubLoop2(data1)
>     global SubData
>     #Number of subjects
>     Nsub = size(unique(data1[:,1],1),1)
>     #Initialize per subject data
>     SubData = []
>     for i = 1:Nsub
>         idx = data1[:,1] .== i
>         SubData = data1[idx,2]
>         parms0 = [1.0;1.0]
>         optimize(LogLike2,parms0,method=:nelder_mead)
>     end
> end
>
> function LogLike2(parms)
>     L = pdf(Normal(parms[1],exp(parms[2])),SubData)
>     LL = -sum(log(L))
> end
>
> N = 10^5
> #Column 1 subject index, column 2 value
> Data = zeros(N*2,2)
> for sub = 1:2
>     Data[(N*(sub-1)+1):(N*sub),:] = [sub*ones(N) rand(Normal(10,2),N)]
> end
> @time SubLoop2(Data)
>
> On Monday, September 28, 2015 at 11:24:13 AM UTC-4, Kristoffer Carlsson 
> wrote:
>>
>> From only that comment alone it is hard to give any further advice. 
>>
>> What overhead are you seeing?
>>
>> Posting runnable code is the best way to get help.
>>
>>
