On Mon, Aug 03, 2009 at 01:49:45PM +0200, Olaf Till wrote:
> On Mon, Aug 03, 2009 at 01:07:23PM +0200, Olaf Till wrote:
> > I don't know if there are theoretical objections against bounds in
> > Levenberg-Marquardt optimization, but I also ended up introducing
> > bounds into leasqr for my use; I would think this is general
> > purpose.
>
> Sorry, I remember now that there were some problems. Often, a bound is
> desirable because some parameters do not produce useful results (of
> the user function) outside a certain range (e.g. a division by zero
> might occur). In such a case, the bounds should already be respected
> during gradient determination, not only later in the 'step'. This
> would mean that if a parameter is already at its bound, only a
> _one_-sided gradient can be computed for this parameter, on the
> 'side' away from the bound. Also, with two-sided bounds, the
> possible parameter change for gradient estimation could become small
> (too small for some problems?). Since gradient estimation in leasqr
> is currently done by a different function (dfdp), which knows nothing
> about bounds, implementation is not so easy. One could pass the
> bounds argument to dfdp as well, but there might exist user
> replacements for dfdp (there is an argument to leasqr just to choose
> this function) which do not honour the 'bounds' argument ... so people
> cannot expect bounds to work without changing their dfdp replacement.
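To illustrate the gradient problem at a bound with a minimal made-up
example (not the actual dfdp code; 'model' and the numbers are invented):

p = 1;                       ## parameter sitting exactly at its lower bound
lb = 1;                      ## the lower bound
del = .001;                  ## increment for the numerical derivative
model = @ (p) sqrt (p - lb); ## not real-valued below the bound
## a central difference, (model (p + del) - model (p - del)) / (2 * del),
## would evaluate the model below the bound; only the one-sided difference
## on the side away from the bound is usable here:
dfdp_onesided = (model (p + del) - model (p)) / del;
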
Some new suggestions.

-- "bounds" in leasqr

Attached are new versions of leasqr.m and dfdp.m. They handle "bounds"
both in the steps and in gradient estimation. The "options" argument is
used for this and is now a structure of options. For backwards
compatibility, the old options matrix is still recognized. A lot of
cosmetic changes have been made so that leasqr.m is easier to edit under
Emacs, so almost every line has changed and it's not much use making
diffs :-(. I have tested it a bit. Since leasqr does not seem to have
an "assigned" active maintainer, I would commit it and change the
version number of the optim package, but Soeren, please decide on that.

Roessli, if there is nothing wrong with the new leasqr version, I
suggest you rewrite your code to make use of it, and make it depend on
the new version of the optim package. The computations according to
Bragg that you had put inside it can surely be done _after_ calling
leasqr?
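For illustration, a call using the new 'bounds' option could look like
this (x, y and the model function are hypothetical; argument values are
just examples):

## user model function, e.g. in expfun.m:
##   function y = expfun (x, p)
##     y = p(1) * exp (-p(2) * x);
##   endfunction
pin = [1; 1];                      ## initial parameter guesses
dp = .001 * ones (2, 1);           ## increments for numerical derivatives
opts.bounds = [-Inf, Inf; 0, 10];  ## p(1) unconstrained, p(2) kept in [0, 10]
[f, p, cvg, iter] = leasqr (x, y, pin, "expfun", .0001, 20, ...
                            ones (size (y)), dp, "dfdp", opts);
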
-- "bounds" in nmsmax
I am not familiar with the method in nmsmax, but at first glance I do
not see where you introduced "bounds" ...
-- keeping parameters "fixed"

The way you have done this in nmsmax.m could break existing code, since
the new argument is not the last. And it cannot be made the last because
of the already existing varargin argument. (BTW, why not use an
argument which is a logical vector, true for each parameter to be
fixed?) To be on the safe side, I would suggest solving the
"fixing" problem with a wrapper, which could be left in your package
and need not necessarily be put into the optim package. Something like
the following sketch (I may have overlooked some details):

function [outp1, outp2, ...] = optim_wrapper (fixed, varargin)
## according to index "fixed", remove fixed parameters and possibly
## other related vector-elements
...
## replace name of (or handle to) user-function with an anonymous
## function like that (calls the here defined subfunction)
@ (x, p) local_func (x, p, user_func, fixed, values_of_fixed_parameters)
...
## call optimization routine
[outp1, outp2, ...] = call_optimizer (varargin{:});
## reintroduce fixed parameters into output arguments if necessary
...
## define subfunction
function ret = local_func (x, p, user_func, fixed, values_fixed)
## reintroduce fixed parameters into "p" according to "fixed" and
## "values_fixed"
...
## call user function
ret = feval (user_func, x, p);
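A call to such a wrapper could then look roughly like this (hypothetical
objective function and argument list; the details depend on the wrapped
optimizer):

## hold parameters 2 and 4 at their starting values, optimize the others
fixed = logical ([0; 1; 0; 1]);
p0 = [1; 2; 3; 4];
[p, fval] = optim_wrapper (fixed, @my_objective, p0);
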
% Copyright (C) 1992-1994 Richard Shrager
% Copyright (C) 1992-1994 Arthur Jutan
% Copyright (C) 1992-1994 Ray Muzic
%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 2 of the License, or
% (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with this program; If not, see <http://www.gnu.org/licenses/>.
function prt=dfdp(x,f,p,dp,func,bounds)
% numerical partial derivatives (Jacobian) df/dp for use with leasqr
% --------INPUT VARIABLES---------
% x=vec or matrix of indep var(used as arg to func) x=[x0 x1 ....]
% f=func(x,p) vector initialised by user before each call to dfdp
% p= vec of current parameter values
% dp= fractional increment of p for numerical derivatives
% dp(j)>0 central differences calculated
% dp(j)<0 one sided differences calculated
% dp(j)=0 sets corresponding partials to zero; i.e. holds p(j) fixed
% func=string naming the function (.m) file
% e.g. to calc Jacobian for function expsum: prt=dfdp(x,f,p,dp,'expsum',bounds)
% bounds=two-column-matrix of lower and upper bounds for parameters
%----------OUTPUT VARIABLES-------
% prt= Jacobian Matrix prt(i,j)=df(i)/dp(j)
%================================
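% The difference formulas used below are (del_j changes only the j-th
% parameter; near a bound the interval is shifted or shrunk so that all
% evaluation points stay within the bounds):
%   one-sided (dp(j) < 0): prt(:,j) = (func(x, p+del_j) - f) / del(j)
%   central   (dp(j) > 0): prt(:,j) = (func(x, p+del_j) - func(x, p-del_j)) / (2*del(j))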
m=size(x,1); if (m==1), m=size(x,2); endif %# PAK: in case #cols > #rows
n=length(p); %dimensions
prt=zeros(m,n);del=zeros(n,1); % initialise Jacobian to Zero
del = dp .* p; %cal delx=fract(dp)*param value(p)
idx = p == 0;
del(idx) = dp(idx); %if param=0 delx=fraction
idx = dp > 0;
del(idx) = abs (del(idx)); % not for one-sided intervals: a changed
% direction of these intervals could change the
% behavior of optimization without bounds
min_del = min (abs (del), bounds(:, 2) - bounds(:, 1));
for j=1:n
ps = p;
if (dp(j)~=0)
if (dp(j) < 0)
ps(j) = p(j) + del(j);
if (ps(j) < bounds(j, 1) || ps(j) > bounds(j, 2))
t_del1 = max (bounds(j, 1) - p(j), - abs (del(j))); % non-positive
t_del2 = min (bounds(j, 2) - p(j), abs (del(j))); % non-negative
if (- t_del1 > t_del2)
del(j) = t_del1;
else
del(j) = t_del2;
endif
ps(j) = p(j) + del(j);
endif
prt(:, j) = (feval (func, x, ps) - f) / del(j);
else
if (p(j) - del(j) < bounds(j, 1))
tp = bounds(j, 1);
ps(j) = tp + min_del(j);
elseif (p(j) + del(j) > bounds(j, 2))
ps(j) = bounds(j, 2);
tp = ps(j) - min_del(j);
else
ps(j) = p(j) + del(j);
tp = p(j) - del(j);
min_del(j) = 2 * del(j);
endif
f1 = feval (func, x, ps);
ps(j) = tp;
prt(:, j) = (f1 - feval (func, x, ps)) / (min_del(j));
endif
endif
endfor
% Copyright (C) 1992-1994 Richard Shrager
% Copyright (C) 1992-1994 Arthur Jutan
% Copyright (C) 1992-1994 Ray Muzic
%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 2 of the License, or
% (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with this program; If not, see <http://www.gnu.org/licenses/>.
function [f, p, cvg, iter, corp, covp, covr, stdresid, Z, r2] = ...
leasqr (x, y, pin, F, stol, niter, wt, dp, dFdp, options)
%% function [f, p, cvg, iter, corp, covp, covr, stdresid, Z, r2] =
%% leasqr (x, y, pin, F, {stol, niter, wt, dp, dFdp, options})
%%
%% Levenberg-Marquardt nonlinear regression of f(x,p) to y(x).
%%
%% Version 3.beta
%% Optional parameters are in braces {}.
%% x = column vector or matrix of independent variables, 1 row per
%% observation: x = [x0 x1....xm].
%% y = column vector of observed values, same number of rows as x.
%% wt = column vector (dim=length(x)) of statistical weights. These
%% should be set to be proportional to (sqrt of var(y))^-1; (That is,
%% the covariance matrix of the data is assumed to be proportional to
%% diagonal with diagonal equal to (wt.^2)^-1. The constant of
%% proportionality will be estimated.); default = ones(length(y),1).
%% pin = column vec of initial parameters to be adjusted by leasqr.
%% dp = fractional increment of p for numerical partial derivatives;
%% default = .001*ones(size(pin))
%% dp(j) > 0 means central differences on j-th parameter p(j).
%% dp(j) < 0 means one-sided differences on j-th parameter p(j).
%% dp(j) = 0 holds p(j) fixed, i.e. leasqr won't change the initial guess: pin(j)
%% F = name of function in quotes; the function shall be of the form y=f(x,p),
%% with y, x, p of the form y, x, pin as described above.
%% dFdp = name of partial derivative function in quotes; default is "dfdp", a
%% slow but general partial derivatives function; the function shall be
%% of the form prt=dfdp(x,f,p,dp,F) (see dfdp.m).
%% stol = scalar tolerance on fractional improvement in scalar sum of
%% squares = sum((wt.*(y-f))^2); default stol = .0001;
%% niter = scalar maximum number of iterations; default = 20;
%%
%% options = structure, currently recognized fields are 'fract_prec',
%% 'max_fract_change', and 'bounds'. For backwards compatibility,
%% 'options' can also be a matrix whose first and second column
%% contains the values of 'fract_prec' and 'max_fract_change',
%% respectively.
%%
%% Field 'options.fract_prec': column vector (same length as 'pin') of
%% desired fractional precisions in parameter estimates. Iterations
%% are terminated if the change in the parameter vector (chg) on two
%% consecutive iterations is smaller in magnitude than the corresponding
%% fractional precision times the current parameter estimate
%% [i.e. all(abs(chg) < abs(options.fract_prec .* current parm est))
%% on two consecutive iterations], default = zeros().
%%
%% Field 'options.max_fract_change': column vector (same length as
%% 'pin') of maximum fractional step changes in parameter vector.
%% Fractional change in elements of parameter vector is constrained to
%% be at most 'options.max_fract_change' between successive iterations.
%% [ie. abs(chg(i))=abs(min([chg(i)
%% options.max_fract_change(i)*current param estimate])).], default =
%% Inf*ones().
%%
%% Field 'options.bounds': two-column-matrix, one row for each
%% parameter in 'pin'. Each row contains a minimal and maximal value
%% for each parameter. Default: [-Inf, Inf] in each row.
%%
%%
%% OUTPUT VARIABLES
%% f = column vector of values computed: f = F(x,p).
%% p = column vector trial or final parameters. i.e, the solution.
%% cvg = scalar: = 1 if convergence, = 0 otherwise.
%% iter = scalar number of iterations used.
%% corp = correlation matrix for parameters.
%% covp = covariance matrix of the parameters.
%% covr = diag(covariance matrix of the residuals).
%% stdresid = standardized residuals.
%% Z = matrix that defines confidence region (see comments in the source).
%% r2 = coefficient of multiple determination.
%%
%% Initial guesses of all zeros are not acceptable.
%% A modified version of Levenberg-Marquardt
%% Non-Linear Regression program previously submitted by R.Schrager.
%% This version corrects an error in that version and also provides
%% an easier to use version with automatic numerical calculation of
%% the Jacobian Matrix. In addition, this version calculates statistics
%% such as correlation, etc....
%%
%% Version 3 Notes
%% Errors in the original version submitted by Shrager (now called version 1)
%% and the improved version of Jutan (now called version 2) have been
%% corrected.
%% Additional features, statistical tests, and documentation have also been
%% included along with an example of usage. BEWARE: Some of the input and
%% output arguments were changed from the previous version.
%%
%% Ray Muzic <r...@ds2.uh.cwru.edu>
%% Arthur Jutan <ju...@charon.engga.uwo.ca>
%% Richard I. Shrager (301)-496-1122
%% Modified by A.Jutan (519)-679-2111
%% Modified by Ray Muzic 14-Jul-1992
%% 1) add maxstep feature for limiting changes in parameter estimates
%% at each step.
%% 2) remove forced columnization of x (x=x(:)) at beginning. x could be
%% a matrix with the ith row containing values of the
%% independent variables at the ith observation.
%% 3) add verbose option
%% 4) add optional return arguments covp, stdresid, chi2
%% 5) revise estimates of corp, stdev
%% Modified by Ray Muzic 11-Oct-1992
%% 1) revise estimate of Vy. remove chi2, add Z as return values
%% Modified by Ray Muzic 7-Jan-1994
%% 1) Replace ones(x) with a construct that is compatible with versions
%% newer and older than v 4.1.
%% 2) Added global declaration of verbose (needed for newer than v4.x)
%% 3) Replace return value var, the variance of the residuals with covr,
%% the covariance matrix of the residuals.
%% 4) Introduce options as 10th input argument. Include
%% convergence criteria and maxstep in it.
%% 5) Correct calculation of xtx which affects covariance estimate.
%% 6) Eliminate stdev (estimate of standard deviation of parameter
%% estimates) from the return values. The covp is a much more
%% meaningful expression of precision because it specifies a confidence
%% region in contrast to a confidence interval. If needed, however,
%% stdev may be calculated as stdev=sqrt(diag(covp)).
%% 7) Change the order of the return values to a more logical order.
%% 8) Change to the more efficient algorithm of Bard for selecting epsL.
%% 9) Tighten up memory usage by making use of sparse matrices (if
%% MATLAB version >= 4.0) in computation of covp, corp, stdresid.
%% Modified by Francesco Potortì
%% for use in Octave
%% Added bounds feature to this file and to dfdp.m, Olaf Till 4-Aug-2009
%%
%% References:
%% Bard, Nonlinear Parameter Estimation, Academic Press, 1974.
%% Draper and Smith, Applied Regression Analysis, John Wiley and Sons, 1981.
%%
%%set default args
%% argument processing
%%
%% if (sscanf(version,'%f') >= 4),
vernum = sscanf(version,"%f");
if vernum(1) >= 4,
global verbose
plotcmd="plot(x(:,1),y,''+'',x(:,1),f); figure(gcf)";
else
plotcmd="plot(x(:,1),y,''+'',x(:,1),f); shg";
endif
if (exist("OCTAVE_VERSION"))
global verbose
plotcmd='plot(x(:,1),y,"+;data;",x(:,1),f,";fit;");';
endif
if(exist("verbose")~=1), %If verbose undefined, print nothing
verbose=0; %This will not tell them the results
endif
if (nargin <= 8), dFdp='dfdp'; endif
if (nargin <= 7), dp=.001*(pin*0+1); endif %DT
if (nargin <= 6), wt=ones(length(y),1); endif % SMB modification
if (nargin <= 5), niter=20; endif
if (nargin == 4), stol=.0001; endif
%%
y=y(:); wt=wt(:); pin=pin(:); dp=dp(:); %change all vectors to columns
%% check data vectors- same length?
m=length(y); n=length(pin); [m1,m2]=size(x);
if m1~=m, error('input(x)/output(y) data must have same number of rows'), endif
%% processing of 'options'
pprec = zeros (n, 1);
maxstep = Inf * ones (n, 1);
bounds = cat (2, -Inf * ones (n, 1), Inf * ones (n, 1));
if (nargin > 9)
if (ismatrix (options)) % backwards compatibility
tp = options;
options = struct ("fract_prec", tp(:, 1));
if (columns (tp) > 1)
options.max_fract_change = tp(:, 2);
endif
endif
if (isfield (options, "fract_prec"))
pprec = options.fract_prec;
if (rows (pprec) != n || columns (pprec) != 1)
error ("fractional precisions: wrong dimensions");
endif
endif
if (isfield (options, "max_fract_change"))
maxstep = options.max_fract_change;
if (rows (maxstep) != n || columns (maxstep) != 1)
error ("maximum fractional step changes: wrong dimensions");
endif
endif
if (isfield (options, "bounds"))
bounds = options.bounds;
if (rows (bounds) != n || columns (bounds) != 2)
error ("bounds: wrong dimensions");
endif
idx = bounds(:, 1) > bounds(:, 2);
tp = bounds(idx, 2);
bounds(idx, 2) = bounds(idx, 1);
bounds(idx, 1) = tp;
if (any (idx = bounds(:, 1) == bounds(:, 2)))
warning ("lower and upper bounds identical for some parameters, setting
the respective elements of 'dp' to zero");
dp(idx) = 0;
endif
if (any (idx = pin < bounds(:, 1)))
warning ("some initial parameters set to lower bound");
pin(idx) = bounds(idx, 1);
endif
if (any (idx = pin > bounds(:, 2)))
warning ("some initial parameters set to upper bound");
pin(idx) = bounds(idx, 2);
endif
endif
endif
if (all (dp == 0))
error ("no free parameters");
endif
%% set up for iterations
p = pin;
f=feval(F,x,p); fbest=f; pbest=p;
r=wt.*(y-f);
ss=r'*r;
sbest=ss;
nrm=zeros(n,1);
chgprev=Inf*ones(n,1);
cvg=0;
epsLlast=1;
epstab=[.1, 1, 1e2, 1e4, 1e6];
%% do iterations
for iter=1:niter,
pprev=pbest;
prt=feval(dFdp,x,fbest,pprev,dp,F,bounds);
r=wt.*(y-fbest);
sprev=sbest;
sgoal=(1-stol)*sprev;
for j=1:n,
if dp(j)==0,
nrm(j)=0;
else
prt(:,j)=wt.*prt(:,j);
nrm(j)=prt(:,j)'*prt(:,j);
if nrm(j)>0,
nrm(j)=1/sqrt(nrm(j));
endif
endif
prt(:,j)=nrm(j)*prt(:,j);
endfor
%% the above loop could perhaps be replaced by:
%% prt=prt.*wt(:,ones(1,n));
%% nrm=dp./sqrt(diag(prt'*prt));
%% prt=prt.*nrm(:,ones(1,m))';
[prt,s,v]=svd(prt,0);
s=diag(s);
g=prt'*r;
for jjj=1:length(epstab),
epsL = max(epsLlast*epstab(jjj),1e-7);
se=sqrt((s.*s)+epsL);
gse=g./se;
chg=((v*gse).*nrm);
%% check the change constraints and apply as necessary
ochg=chg;
idx = ~isinf(maxstep);
limit = abs(maxstep(idx).*pprev(idx));
chg(idx) = min(max(chg(idx),-limit),limit);
if (verbose && any(ochg ~= chg)),
disp(["Change in parameter(s): ", ...
sprintf("%d ", find(ochg ~= chg)), "were constrained"]);
endif
aprec=abs(pprec.*pbest); %---
%% ss=scalar sum of squares=sum((wt.*(y-f))^2).
if (any(abs(chg) > 0.1*aprec)),%--- % only worth evaluating function if
p=chg+pprev; % there is some non-minuscule change
%% apply bounds; preserving the direction of the attempted step
%% might lead to fixing _all_ parameters at their current value,
%% so decided not to do that and to simply reset parameters to
%% bounds
lidx = p < bounds(:, 1);
p(lidx) = bounds(lidx, 1);
uidx = p > bounds(:, 2);
p(uidx) = bounds(uidx, 2);
if (verbose && any (idx = lidx | uidx))
printf ("bounds apply for parameters number %s %i\n", \
sprintf ("%i, ", find (idx)(1:end - 1)), \
find (idx)(end));
endif
%%
f=feval(F,x,p);
r=wt.*(y-f);
ss=r'*r;
if ss<sbest,
pbest=p;
fbest=f;
sbest=ss;
endif
if ss<=sgoal,
break;
endif
endif %---
endfor
epsLlast = epsL;
if (verbose),
eval(plotcmd);
endif
if ss<eps,
break;
endif
aprec=abs(pprec.*pbest);
%% [aprec, chg, chgprev]
if (all(abs(chg) < aprec) && all(abs(chgprev) < aprec)),
cvg=1;
if (verbose),
fprintf("Parameter changes converged to specified precision\n");
endif
break;
else
chgprev=chg;
endif
if ss>sgoal,
break;
endif
endfor
%% set return values
p=pbest;
f=fbest;
ss=sbest;
cvg=((sbest>sgoal) || (sbest<=eps) || cvg);
if cvg ~= 1 , disp(" CONVERGENCE NOT ACHIEVED! "), endif
%% CALC VARIANCE COV MATRIX AND CORRELATION MATRIX OF PARAMETERS
%% re-evaluate the Jacobian at optimal values
jac=feval(dFdp,x,f,p,dp,F,bounds);
msk = dp ~= 0;
n = sum(msk); % reduce n to equal number of estimated parameters
jac = jac(:, msk); % use only fitted parameters
%% following section is Ray Muzic's estimate for covariance and correlation
%% assuming covariance of data is a diagonal matrix proportional to
%% diag(1/wt.^2).
%% cov matrix of data est. from Bard Eq. 7-5-13, and Row 1 Table 5.1
if exist('sparse') % save memory
Q=sparse(1:m,1:m,1./wt.^2);
Qinv=sparse(1:m,1:m,wt.^2);
else
Q=diag((0*wt+1)./(wt.^2));
Qinv=diag(wt.*wt);
endif
resid=y-f; %un-weighted residuals
covr=resid'*Qinv*resid*Q/(m-n); %covariance of residuals
Vy=1/(1-n/m)*covr; % Eq. 7-13-22, Bard %covariance of the data
jtgjinv=inv(jac'*Qinv*jac); %argument of inv may be singular
covp=jtgjinv*jac'*Qinv*Vy*Qinv*jac*jtgjinv; % Eq. 7-5-13, Bard; cov of parm est
d=sqrt(abs(diag(covp)));
corp=covp./(d*d');
if exist('sparse')
covr=spdiags(covr,0);
stdresid=resid./sqrt(spdiags(Vy,0));
else
covr=diag(covr); % convert returned values to compact storage
stdresid=resid./sqrt(diag(Vy)); % compute then convert for compact storage
endif
Z=((m-n)*jac'*Qinv*jac)/(n*resid'*Qinv*resid);
%%% alt. est. of cov. mat. of parm.: (Delforge, Circulation, 82:1494-1504, 1990)
%%disp('Alternate estimate of cov. of param. est.')
%%acovp=resid'*Qinv*resid/(m-n)*jtgjinv
%%Calculate R^2 (Ref Draper & Smith p.46)
r=corrcoef([y(:),f(:)]);
r2=r(1,2).^2;
%% if someone has asked for it, let them have it
if (verbose),
eval(plotcmd);
disp(" Least Squares Estimates of Parameters")
disp(p')
disp(" Correlation matrix of parameters estimated")
disp(corp)
disp(" Covariance matrix of Residuals")
disp(covr)
disp(" Correlation Coefficient R^2")
disp(r2)
sprintf(" 95%% conf region: F(0.05)(%.0f,%.0f)>=
delta_pvec'*Z*delta_pvec",n,m-n)
Z
%% runs test according to Bard. p 201.
n1 = sum((f-y) < 0);
n2 = sum((f-y) > 0);
nrun=sum(abs(diff((f-y)<0)))+1;
if ((n1>10)&(n2>10)), % sufficient data for test?
zed=(nrun-(2*n1*n2/(n1+n2)+1)+0.5)/(2*n1*n2*(2*n1*n2-n1-n2)...
/((n1+n2)^2*(n1+n2-1)));
if (zed < 0),
prob = erfc(-zed/sqrt(2))/2*100;
disp([num2str(prob),"% chance of fewer than ",num2str(nrun)," runs."]);
else
prob = erfc(zed/sqrt(2))/2*100;
disp([num2str(prob),"% chance of greater than ",num2str(nrun),"
runs."]);
endif
endif
endif