Hi Piet
Glad to see that you got the lbfgs to work. I had sent an email to the J
Forum a while back but I see that it might not have gotten through. It is
provided below. Like you, I used D: to estimate the partial derivatives. As
I state, I found the function quite slow and the NLLS much faster. I haven't
tried lbfgs on bigger problems but my suspicion is that the calls slow done
the function considerably.
Bob
----------------------
Quite a while back, I got this working with the help of Philip Viton. A
fully worked example is below. For comparison, I did the optimization in
Excel and using the NLLS Levenberg-Marquardt routine that I have developed
(that's in the scripts on the J website). I found the LBFGS very slow
compared to the NLLS. It would be great to have a fast maximum likelihood
optimizer that is fully with J.
Hope that these help.
Bob
load '~addons/math/lbfgs/lbfgs.ijs'
NB.===========================================================
NB.
NB. Von Bertalanfy fit using LBFGS optimizatin
NB. outputs stats matrix & parm vector
NB. solution with EXCEL's SOLVER is linf = 55.9778, k = 0.385595, t0 =
0.171371
NB.
NB.========================================================================
NB. INPUTS
years =: 1.0 2.0 3.3 4.3 5.3 6.3 7.3 8.3 9.3 10.3 11.3
length =: 15.4 26.93 42.23 44.59 47.63 49.67 50.87 52.3 54.77 56.43 55.88
initial =: 30.0 0.20 0.01
NB.=======================================================================
VON_B =: 3 : 0
'a b c' =. y
l_pred =. a * (1 -^ - (b * years - c))
rss =. e +/ .* e =. length - l_pred
n2 =: ($years)%2
NB. sd =: %: var =: rss%$e
NB. (($years)%2)*((^.(o.2))+(2*^.sd) +1)
NB. (($years)%2)*((^.(o.2))+(^.var) +1)
NB. (n2*(^.rss))-(n2*^.$years)-(n2*(^.o.2))+n2
NB. (n2*(^.var))+(n2*(^.o.2))+n2
n2*(^.rss)
)
NB.========================================================================
DERIVLOGL =: 3 : 0
epsd =. 1e_5
f0 =. OBJ_FN y
grad =. 0$0
for_i. i.#y do.
yy =. (epsd+i{y) i } y
f1 =. OBJ_FN yy
yyy =.((-epsd)+i{y) i } y
f2 =. OBJ_FN yyy
grad =. grad,(f1-f2)%+:epsd
end.
)
LBFGS =: 4 : 0
OBJ_FN =: (<x)`:6
n =. $parm =. y NB. length n; number of parameters
m =. 5 NB. length 1; number of corrections used in BFGS update.
Should be between 3 and 7
diagco =. 2-2 NB. length 1; Logical variable; set at 0; if 1.0,
need to provide diagonal matrix
diag =. n$2.2-2.2 NB. length n$0; provide double array if diagco 1
iprint =. 2 2-2 2 NB. length 2; only 0 0 implemented
eps =. 1.0e_5 NB. Accuracy of solution: grad < eps (max of 1,
fun_parm)
machine =. 1.0e_16 NB. estimate of machine presision
w =. (n*(2*m+1)+2*m) $ 2.2-2.2 NB. workspace size for LBFGS
iflag =. 2-2 NB. initially set to 0; codes which indicate
performance of routine
niter =. 0
while. niter <: 1000 do.
fun_parm =. OBJ_FN parm NB. Value of objective function at parm
grad =. ((0.00001*parm) + (0.00000001*parm = 0)) OBJ_FN D: 1 parm NB.
length n; gradient of objective function at parm
NB. grad =. DERIVLOGL parm
'n m parm fun_parm grad diagco diag iprint eps machine w iflag'=.r=.'lbfgs_'
call (n;m;parm;fun_parm;grad;diagco;diag;iprint;eps;machine;w;iflag)
if. iflag <: 0 do. break. end.
output =: niter, ; 3 2 4{r NB. iter, value, point, grad
niter =. >: niter
end.
)
'VON_B' LBFGS initial
--------------------------------------------
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Piet de Jong
Sent: April 11, 2010 11:08 PM
To: Programming forum
Subject: Re: [Jprogramming] lbfgs minimization routine
I worked more on this and the code below seems to do things
Instead of my own numerical derivatives I let J itself provide
the derivatives (this seems to work well on situations I have tried),
Anyway below "minl" is an adverb which when applied to
function F and given a starting value for the function, X, returns
the the minimum, minimizer and partial derivatives at the minimum, as
well as the number of iterations.
minl=: 1 : 0 NB. lbfgs minimization
f=.u
N=.#X=:2.2+(,y)-2.2 NB. ensures X is real
M=.5 NB. number of corrections
'DIAGCO IFLAG'=.2-2 NB. logical set to 0=false
DIAG=.N$2.2-2.2 NB. real set to 0
IPRINT=.2 2-2 2 NB. integer: only 0 0 permitted
'EPS XTOL'=.1.0e_5 1.0e_16 NB. real epsilon and tolerance
W=.((N*(>:+:M))++:M)$2.2-2.2 NB. real work vector
iter=.0
whilst. (IFLAG>0)*.(iter<:2000) do. iter=.>:iter
'F G'=. (f;f D. 1) X NB. function and derivative value
'N M X F G DIAGCO DIAG IPRINT EPS XTOL W IFLAG'=.'lbfgs_' call
(N;M;X;F;G;DIAGCO;DIAG;IPRINT;EPS;XTOL;W;IFLAG)
end.
F;X;G;iter
)
The "call" is as is in the lbfgs addon package.
On Thu, Apr 8, 2010 at 6:17 PM, Piet de Jong <[email protected]> wrote:
> Thanks for looking at this.
> What confused me was the "larger" dimension on X etc
> For example the code working off
> NDIM=. 2000
>
> Anyway after a lot of fiddling I seem to have it working at least with
> simple examples,
> including with the use of numerical rather than exact derivatives.
> Here is my "test" case
> using N=4 with the sum of squares function centered on an (arbitrary)
origin.
>
> NB. Df is exact derivative, while Dfh is numerical derivative using h
>
> mtr=: 3 : 0 NB. lbfgs test
> 'N M'=.4 5 NB. dimension of X, number of corrections
> f=.+/@:*:@:-&3 _2002 800 1 NB. example sum of squares function
>
> Df=.f D. 1 NB. exact derivative
> h=.1.0e_8 NB. h for numerical derivative
> H=.(h&*)@=...@i.@# NB. diag h
> Dfh=.-:@(%&h)@:((+(-&f)-)H) NB. numerical derivative
>
> 'DIAGCO IFLAG'=.2-2 NB. logical set to 0=false
> DIAG=.N$2.2-2.2 NB. real set to 0
> IPRINT=.2 2-2 2 NB. integer: only 0 0 permitted
> 'EPS XTOL'=.1.0e_5 1.0e_16 NB. real epsilon and tolerance
> W=.((N*(>:+:M))++:M)$2.2-2.2 NB. real work vector
>
> X=.N$400.2-2.2 NB. initialize X
> iter=.0 NB. initialize iter
> whilst. (IFLAG>0)*.(iter<:2000) do. iter=.>:iter
> 'F G'=. (f;Dfh) X NB. funtion and derivative value -- use Dfh
> for numerical derivative
> 'N M X F G DIAGCO DIAG IPRINT EPS XTOL W IFLAG'=.'lbfgs_' call
> (N;M;X;F;G;DIAGCO;DIAG;IPRINT;EPS;XTOL;W;IFLAG)
> end.
> iter;(f;])X
> )
>
>
>
>
>
> On Thu, Apr 8, 2010 at 7:08 AM, Raul Miller <[email protected]> wrote:
>> On Wed, Apr 7, 2010 at 4:38 PM, Piet de Jong <[email protected]> wrote:
>>> Thanks for the response. I downloaded the routine using the package
manager
>>> and J6.02.
>>>
>>> What I don't get is that the in the example lbfgs_test the routine is
>>> called with parameter
>>> values that do not correspond to what is stated in the provided
>>> FORTRAN description of the calling sequence.
>>
>> So... looking at math/lbfgs, I see:
>> path=: jpath '~addons\math\lbfgs\'
>> dll=:'"',path,'lbfgs.dll" '
>> ...
>> call=: 4 : 0
>> xx=. dll,x,' + i ',(+:#y)$' *'
>> r=. xx cd LASTIN=: , each y
>> ...
>>
>> and, I see:
>>
>> 'N M X F G DIAGCO DIAG IPRINT EPS XTOL W IFLAG'=.r=.'lbfgs_' call
>> (N;M;X;F;G;DIAGCO;DIAG;IPRINT;EPS;XTOL;W;IFLAG)
>>
>> So that means that the calling signature will be
>>
>> "...lbfgs.dll" lbfgs_ + i * * * * * * * * * * * *
>>
>> According to http://www.jsoftware.com/help/user/call_procedure.htm
>> this means that under windows this routine uses the __cdecl calling
>> convention, returns a 4 byte integer result and takes 12 pointers as
>> arguments.
>>
>> I modified lbfgs_test to capture the data being passed to "call" and
>> when I interrupted the program the values were all one dimension
>> arrays, with dimensions:
>> N: 1
>> M: 1
>> X: 2000
>> F: 1
>> G: 2000
>> DIAGCO: 1
>> DIAG: 2000
>> IPRINT: 2
>> EPS: 1
>> XTOL: 1
>> W: 60000
>> IFLAG: 1
>>
>> Now... I do not know how lbfgs works, and I do not know
>> if there is some windows magic that happens to pass the
>> size of these arrays in a __cdecl call, or if the pointers to
>> these values are treated as pointers regions of memory where
>> extra values off the end would be ignored.
>>
>> But it very much looks like this code is directly calling
>> lbfgs.dll and not doing much of any thing special to
>> the data on its way out.
>>
>>> In particular and as an example, the dimension on X does not appear to
>>> be N but something much bigger. Hence the example is hard, if not
>>> impossible to follow. It might be useful to have a much simpler
>>> example. I still can't get a simple example to work since I don't
>>> appear to understand the calling parameters.
>>
>> My current best guess is that (if N is supposed to be the
>> number of elements provided by X), values after the first 100 (the
>> value for N) are ignored by lbfgs.dll.
>>
>> Do you see anything which would contradict this point of view?
>>
>> Thanks,
>>
>> --
>> Raul
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>>
>
>
>
> --
> Piet de Jong
> --------------------------------------------------
> View my current research at
> http://ssrn.com/author=619154
> --------------------------------------------------
>
--
Piet de Jong
--------------------------------------------------
View my current research at
http://ssrn.com/author=619154
--------------------------------------------------
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm