Dear R help list,
This is a long post, so apologies in advance. I am estimating a model with the
mgcv package that has several covariates, entering as both linear and smooth
terms. For 1
or 2 of these smooth terms, I "know" that the truth is monotonic and downward
sloping. I am aware that a new package "scam" exists for this kind of thing,
but I am in the unfortunate situation that I am not allowed to install new
packages on the remote desktop I am working on.
Fortunately, mgcv allows imposing constraints to ensure monotonicity, but I am
having some difficulty understanding the code to do it.
I have found the example in ?pcls showing how to use finite differencing on a
generated data set.
My first question is this: If I am using a real data set, do I generate data
with a monotonic function to use for the constraints, or do I try to find
another, sufficiently monotonic covariate to use? Does the magnitude matter as
long as the sign of the first difference is correct? I am assuming that if I
generate data for this purpose, it should have the same dimension as the real
data, so that the matrices from finite differencing have matching dimensions?
A couple of (perhaps quite basic) specific questions about the example code:
## Preliminary unconstrained gam fit...
G <- gam(y~s(x)+s(z)+s(v,k=20),fit=FALSE)
So first create G which is going to be the input to pcls() to fit the
constrained model.
Then fit the unconstrained version:
b <- gam(G=G)
(I am skipping the part of the example that calculates the finite differences
stored in Xx and Xz, where Xx is always positive and Xz is not; z is the
covariate that we want to apply the constraint to.)
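To make the skipped step concrete, here is a minimal sketch of how I understand
the finite differencing, adapted to a smooth that should be decreasing in z.
The simulated data, the grid zz and the fixed values x = v = 0.5 are my own
assumptions for illustration, not part of the help file example.

```r
library(mgcv)

## Simulated stand-in data (an assumption, not my real data); the truth is
## decreasing in z.
set.seed(1)
n <- 200
dat <- data.frame(x = runif(n), z = runif(n), v = runif(n))
dat$y <- sin(2 * pi * dat$x) + 3 * exp(-3 * dat$z) + dat$v^2 +
  rnorm(n, sd = 0.3)

b <- gam(y ~ s(x) + s(z) + s(v, k = 20), data = dat)

## Evaluate the linear predictor matrix on a grid of z values and on the same
## grid shifted by eps, holding x and v fixed, so only the s(z) columns differ.
eps <- 1e-7
zz  <- seq(0, 1, length.out = 100)
pd0 <- data.frame(x = 0.5, z = zz,       v = 0.5)
pd1 <- data.frame(x = 0.5, z = zz + eps, v = 0.5)
X0  <- predict(b, newdata = pd0, type = "lpmatrix")
X1  <- predict(b, newdata = pd1, type = "lpmatrix")

## Xz %*% p approximates the derivative of s(z) at each grid point.
Xz <- (X1 - X0) / eps

## pcls() works with constraints of the form Ain %*% p > bin, so a
## DEcreasing smooth would use the negated difference matrix.
Ain <- -Xz
bin <- rep(0, nrow(Ain))

## Slopes of the unconstrained fit along the grid, for comparison.
slope <- drop(Xz %*% coef(b))
range(slope)
```

If this is right, the constraint rows come from the model's own basis evaluated
on a grid of covariate values, and each row has one column per model
coefficient.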
Now the constraint is being defined:
G$Ain <- rbind(Xx,Xz) ## inequality constraint matrix
G$bin <- rep(0,nrow(G$Ain))
G$sp <- b$sp
G$p <- coef(b)
G$off <- G$off-1 ## to match what pcls is expecting
## force initial parameters to meet constraint
G$p[11:18] <- G$p[2:9] <- 0
This part sets the coefficients of the smooth components, including the one
with always-positive finite differences, to zero. According to the help file,
the initial parameter values for pcls should satisfy the inequality constraints
strictly, i.e. not with equality.
I thought the inequality constraint here was that the coefficients must be
greater than or equal to zero, so I must be missing something?
I guess the dimension (8 parameters) is simply due to k=8 as the default, i.e.
if one sets k larger or smaller, the dimension here changes correspondingly?
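One way I could check this guess rather than assume it is to ask the fitted
model which coefficient indices belong to each smooth. The sketch below uses
simulated stand-in data (an assumption) with the same formula as the example,
and reads the first.para and last.para entries of each smooth object.

```r
library(mgcv)

## Simulated stand-in data (an assumption, for illustration only).
set.seed(2)
n <- 200
dat <- data.frame(x = runif(n), z = runif(n), v = runif(n))
dat$y <- dat$x + exp(-2 * dat$z) + rnorm(n, sd = 0.2)

b <- gam(y ~ s(x) + s(z) + s(v, k = 20), data = dat)

## One row per smooth: first and last coefficient index in coef(b).
idx <- t(sapply(b$smooth, function(s) {
  c(first = s$first.para, last = s$last.para)
}))
rownames(idx) <- sapply(b$smooth, function(s) s$label)
idx
```

For this formula the table shows s(x) at 2:10, s(z) at 11:19 and s(v) at 20:38,
so the index ranges 2:9 and 11:18 each leave the last coefficient of a smooth
untouched.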
p <- pcls(G) ## constrained fit
par(mfrow=c(2,3))
plot(b) ## original fit
b$coefficients <- p
plot(b) ## constrained fit
## note that standard errors in preceding plot are obtained from
## unconstrained fit
Once I have replaced the coefficients in the model b with the constrained
coefficients, I suppose I can use all the usual tools to get first differences
with the constrained model and get the slope estimates of the constrained
smooth term?
If I want to impose constraints on two terms, can I then do this just by adding
the constraints at once, i.e. adding more rows to G$Ain, as rbind(Xx,Xz) does
in the example?
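To show what I mean, here is a sketch of constraining two terms at once, with
s(x) increasing and s(z) decreasing. The simulated data, the helper
deriv_matrix() and the way I pick starting values are all my own assumptions.
In particular I am assuming that the last coefficient of each default smooth
multiplies its linear part, and the stopifnot() call is there to catch the case
where that assumption fails.

```r
library(mgcv)

## Simulated stand-in data (an assumption): increasing in x, decreasing in z.
set.seed(3)
n <- 300
dat <- data.frame(x = runif(n), z = runif(n), v = runif(n))
dat$y <- 2 * dat$x^2 + 3 * exp(-3 * dat$z) + sin(2 * pi * dat$v) +
  rnorm(n, sd = 0.3)

G <- gam(y ~ s(x) + s(z) + s(v, k = 20), data = dat, fit = FALSE)
b <- gam(G = G)  # unconstrained fit, used for smoothing parameters

## Hypothetical helper: finite-difference derivative matrix for one covariate,
## holding the other covariates at 0.5.
deriv_matrix <- function(model, var, grid, eps = 1e-7) {
  pd0 <- data.frame(x = rep(0.5, length(grid)), z = 0.5, v = 0.5)
  pd0[[var]] <- grid
  pd1 <- pd0
  pd1[[var]] <- grid + eps
  (predict(model, newdata = pd1, type = "lpmatrix") -
     predict(model, newdata = pd0, type = "lpmatrix")) / eps
}
grid <- seq(0, 1, length.out = 100)
Xx <- deriv_matrix(b, "x", grid)  # want  Xx %*% p > 0 (increasing)
Xz <- deriv_matrix(b, "z", grid)  # want -Xz %*% p > 0 (decreasing)

## Stack the two sets of constraints as rows of one matrix.
G$Ain <- rbind(Xx, -Xz)
G$bin <- rep(0, nrow(G$Ain))
G$C   <- matrix(0, 0, ncol(G$X))  # no equality constraints
G$sp  <- b$sp
G$off <- G$off - 1                # to match what pcls is expecting

## Starting values: zero each constrained smooth, then give its last
## coefficient whichever sign makes its block of constraints strictly hold.
p0 <- coef(b)
blocks <- list(1:100, 101:200)
for (j in 1:2) {
  sm <- b$smooth[[j]]
  p0[sm$first.para:sm$last.para] <- 0
  p0[sm$last.para] <- 1
  if (any(G$Ain[blocks[[j]], ] %*% p0 <= 0)) p0[sm$last.para] <- -1
}
stopifnot(all(G$Ain %*% p0 > G$bin))  # strictly feasible start
G$p <- p0

p <- pcls(G)  # constrained fit

## Slopes of the constrained smooths along the grid.
range(drop(Xx %*% p))
range(drop(Xz %*% p))
```

The constrained coefficients in p can then be put into b$coefficients for
plotting, as in the example, with the same caveat that the standard errors
still come from the unconstrained fit.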
Any help is much appreciated!
Kind regards,
Kathrine