Yes, to use autodiff you need to make sure that all of the functions you call
can be applied to Array{T} for any T <: Number. The type annotations in your
code are currently overly restrictive when you define
clogit_ll(beta::Vector{Float64}) and friends. If you loosen things to
clogit_ll(beta::Vector), you might get autodiff to work.
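A minimal sketch of the difference (the function names here are hypothetical, not Holger's actual model):

```julia
# Restricting the argument to Vector{Float64} means no method exists
# for the dual-number vectors that the autodiff machinery passes in.
strict_ll(beta::Vector{Float64}) = sum(abs2, beta)

# The loosely typed version accepts any element type, including duals.
loose_ll(beta::Vector) = sum(abs2, beta)

loose_ll([1, 2, 3])      # works for Int elements too: returns 14
# strict_ll([1, 2, 3])   # would throw a MethodError
```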
— John
On May 20, 2014, at 8:42 AM, Holger Stichnoth <[email protected]> wrote:
> When I set autodiff = true in the Gist I posted above, I get the message
> "ERROR: no method clogit_ll(Array{Dual{Float64},1},)".
>
> Holger
>
>
> On Monday, 19 May 2014 14:51:16 UTC+1, John Myles White wrote:
> If you can, please do share an example of your code. Logit-style models are
> in general numerically unstable, so it would be good to see how exactly
> you’ve coded things up.
>
> One thing you may be able to do is use automatic differentiation via the
> autodiff = true keyword to optimize, but that assumes that your objective
> function is written in completely pure Julia code (which means, for example,
> that your code must not call any functions that are not written in Julia,
> such as some of those provided by Distributions.jl).
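To see why purity matters, here is a toy illustration of forward-mode AD with a hypothetical minimal dual type (not the Dual type Optim.jl actually uses): pure-Julia arithmetic propagates the derivative alongside the value, whereas a call into non-Julia code would have no method for the dual type and break the chain.

```julia
# A made-up two-field dual number: value plus derivative.
struct MiniDual
    val::Float64
    der::Float64
end

# Overload arithmetic so the derivative is carried through
# by the sum and product rules.
Base.:+(a::MiniDual, b::MiniDual) = MiniDual(a.val + b.val, a.der + b.der)
Base.:*(a::MiniDual, b::MiniDual) = MiniDual(a.val * b.val,
                                             a.der * b.val + a.val * b.der)

f(x) = x * x + x            # f'(x) = 2x + 1
d = f(MiniDual(3.0, 1.0))   # seed the input's derivative with 1
# d.val == 12.0 (f(3)), d.der == 7.0 (f'(3)) -- exact, no step size involved
```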
>
> — John
>
> On May 19, 2014, at 4:09 AM, Andreas Noack Jensen <[email protected]>
> wrote:
>
>> What is the output of versioninfo() and Pkg.installed("Optim")? Also, would
>> it be possible to make a gist with your code?
>>
>>
>> 2014-05-19 12:44 GMT+02:00 Holger Stichnoth <[email protected]>:
>> Hello,
>>
>> I installed Julia a couple of days ago and was impressed by how easy it was to
>> make the switch from Matlab and to parallelize my code
>> (something I had never done before in any language; I'm an economist with
>> only limited programming experience, mainly in Stata and Matlab).
>>
>> However, I ran into a problem when using Optim.jl for Maximum Likelihood
>> estimation of a conditional logit model. With the default Nelder-Mead
>> algorithm, optimize from the Optim.jl package gave me the same result that I
>> had obtained in Stata and Matlab.
>>
>> With gradient-based methods such as BFGS, however, the algorithm jumped from
>> the starting values to parameter values that are completely different. This
>> happened for all the starting values I tried, including the case in which I
>> took a vector that is close to the optimum found by the Nelder-Mead algorithm.
>>
>> The problem seems to be that the algorithm tried parameter values so large (in
>> absolute value) that they caused overflow in the objective function, where
>> these parameter values enter exponential functions. As a result, the
>> optimization based on the BFGS algorithm did not produce the expected optimum.
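The standard remedy for this kind of overflow in logit-style likelihoods is the log-sum-exp trick; a sketch, independent of the actual model code: subtracting the maximum utility before exponentiating leaves the log of the sum unchanged but keeps exp() in range.

```julia
# Numerically stable log(sum(exp(v))): shift by the maximum first.
function logsumexp(v::Vector{Float64})
    m = maximum(v)
    m + log(sum(exp.(v .- m)))
end

logsumexp([1000.0, 1000.0])       # ~ 1000.693 (= 1000 + log(2))
# log(sum(exp.([1000.0, 1000.0])))  # naive version overflows to Inf
```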
>>
>> While I could try to provide the analytical gradient in this simple case, I
>> was planning to use Julia for Maximum Likelihood or Simulated Maximum
>> Likelihood estimation in cases where the gradient is more difficult to
>> derive, so it would be good if I could also make the optimizer work with
>> numerical gradients.
>>
>> I suspect that my problems with optimize from Optim.jl could have something
>> to do with the gradient() function. In the example below, for instance, I do
>> not understand why the output of the gradient function includes values such
>> as 11470.7, given that the function values differ only minimally.
>>
>> Best wishes,
>> Holger
>>
>>
>> julia> Optim.gradient(clogit_ll,zeros(4))
>> [6.0554544523933395e-6,0.0,0.0,0.0]
>> 14923.564009972584
>> [-6.0554544523933395e-6,0.0,0.0,0.0]
>> 14923.565228435104
>> [0.0,6.0554544523933395e-6,0.0,0.0]
>> 14923.569064311248
>> [0.0,-6.0554544523933395e-6,0.0,0.0]
>> 14923.560174904109
>> [0.0,0.0,6.0554544523933395e-6,0.0]
>> 14923.63413848258
>> [0.0,0.0,-6.0554544523933395e-6,0.0]
>> 14923.495218282553
>> [0.0,0.0,0.0,6.0554544523933395e-6]
>> 14923.58699717058
>> [0.0,0.0,0.0,-6.0554544523933395e-6]
>> 14923.54224130672
>> 4-element Array{Float64,1}:
>>   -100.609
>>    734.0
>>  11470.7
>>   3695.5
>>
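For what it's worth, the printed gradient is internally consistent with the central-difference formula; e.g. for the third coordinate, using the two function values printed above:

```julia
# Central finite difference in the third coordinate, using the two
# function evaluations at +/- h that the output above shows.
h = 6.0554544523933395e-6
g3 = (14923.63413848258 - 14923.495218282553) / (2 * h)
# g3 is about 11470.7, matching the third reported gradient entry:
# the large value comes from dividing a small function change by a
# tiny step, not from a bug in gradient() itself.
```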
>> function clogit_ll(beta::Vector)
>>     # Print the parameters and the return value to
>>     # check how gradient() and optimize() work.
>>     println(beta)
>>     println(-sum(compute_ll(beta, T, 0)))
>>
>>     # compute_ll computes the individual likelihood contributions
>>     # in the sample. T is the number of periods in the panel. The 0
>>     # is not used in this simple example. In related functions, I
>>     # pass on different values here to estimate finite mixtures of
>>     # the conditional logit model.
>>     return -sum(compute_ll(beta, T, 0))
>> end
>>
>> --
>> Med venlig hilsen
>>
>> Andreas Noack Jensen
>