[R] bsts posterior distributions

2019-01-11 Thread Greg Dropkin
hello

I couldn't find previous posts to answer this, but please point me to any.

I am trying to understand bsts, starting with no regressors

Here is code which appears to mimic bsts, producing graphs similar to the
model plot, but it gives a rather different posterior distribution for
the parameters. I may be misunderstanding SdPrior, or perhaps the
differences lie with mcmc.

Thanks for any help

Greg

###

library(bsts)
library(Boom)
library(mcmc)

#simulate some data

y<-rep(NA,50)

y[1]=1
y[2]=1
s=2
set.seed(5)
for (k in 1:48)
{
y[2+k]=y[1+k]+0.1*y[k]+s*rnorm(1)
}
plot(1:50,y[1:50],main=paste("seed =",5))

#bsts model

ss<-AddLocalLevel(list(),y)
mod1<-bsts(y,state.specification=ss,niter=1000)
plot(mod1)

#reproduce the plot using mod1$state.contributions

par(mfrow=c(1,2))
plot(1:50,y,col="blue",ylim=c(-12,8),main="quantiles of mod1$state.contributions")
for (i in 1:99)
{
qi<-qin<-rep(NA,50)
tj<-1:50
for (j in 1:50)
{
qi[j]<-quantile(mod1$state.contributions[501:1000,1,j],(i-0.5)/100)
qin[j]<-quantile(mod1$state.contributions[501:1000,1,j],(i+1-0.5)/100)
}
polygon(c(tj,rev(tj)),c(qi[1:50],rev(qin[1:50])),col=rgb(0,0,0,45*dnorm(i,mean=50,sd=18)),border=FALSE)
}
plot(mod1,ylim=c(-12,8),main="plot(mod1)")
par(mfrow=c(1,1))

#the state specification is based on sd(y)

sd(y)
ss
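
#checking my reading of the printed priors: both SdPriors appear to be
#keyed to sd(y) (my inference from the numbers, not from the bsts source)
0.01*sd(y) #~0.046 = prior.guess for the level sd
sd(y)      #~4.60 = upper.limit for the level sd, prior.guess for sigma.obs
1.2*sd(y)  #~5.52 = upper.limit for sigma.obs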

#the likelihood, using the Kalman filter, as a function of the error sds

kal<-function(par)
{
a<-par[1]
b<-par[2]

H=matrix(1,1,1)
F=matrix(1,1,1)
#1-dimensional state
N=50
dim(y)=c(1,N)

xe<-ye<-ze<-matrix(NA,1,N)
xe[,1]<-1
ye[,1]<-H%*%xe[,1]
ze[,1]<-y[,1]-ye[,1]

P<-K<-array(data=NA,dim=c(N,1,1))
#P[1,,] initial guess
P[1,,]<-1
K[1,,]<-1
xe[1,1]<-1

for (i in 1:(N-1))
{
P[i+1,,]<-F%*%P[i,,]%*%t(F)+a^2
K[i+1,,]<-P[i+1,,]%*%t(H)%*%solve(b^2+H%*%P[i+1,,]%*%t(H))
xe[1,i+1]<-F%*%xe[,i]+K[i+1,,]%*%(y[,i+1]-H%*%F%*%xe[,i])
P[i+1,,]<-(diag(1,1)-K[i+1,,]%*%H)%*%P[i+1,,]
}

-1/2*log(b^2)-1/2*sum(((y[1,]-xe[1,])/b)^2)
}
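
#for comparison -- a sketch of my own (not taken from bsts) of the
#standard prediction-error-decomposition log-likelihood for the local
#level model: kal() above scores the filtered residuals against the
#fixed variance b^2, whereas the textbook form uses the innovations and
#their variance, which may account for part of the difference
kal2<-function(par)
{
a<-par[1]
b<-par[2]
N<-length(y)
xf<-1 #initial state mean, same guess as in kal()
Pf<-1 #initial state variance
ll<-0
for (i in 2:N)
{
Pp<-Pf+a^2 #predicted state variance
S<-Pp+b^2 #innovation variance
e<-y[i]-xf #one-step prediction error
ll<-ll-(log(S)+e^2/S)/2
K<-Pp/S #Kalman gain
xf<-xf+K*e
Pf<-(1-K)*Pp
}
ll
}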

#independent priors on a and b
#bsts uses sd(y) to set both priors.
#for a (ss[[1]]$sigma.prior) it uses SdPrior with
#$prior.guess 0.04600655, $prior.df 0.01, $initial.value 0.04600655,
#$upper.limit 4.600655
#for b (mod1$prior) it uses SdPrior with
#$prior.guess 4.600655, $prior.df 0.01, $initial.value 4.600655,
#$upper.limit 5.520786

#try an inverse gamma prior (on a^2, then converted to a prior on a)

lpr1<-function(a)
{
v=0.01
ifelse((a<=0)|a>sd(y),-Inf,-(v/2+1)*log(a^2)-v*(sd(y)/100)^2/(2*a^2)+1/2*log(a^2))
}

lpr2<-function(b)
{
v=0.01
ifelse((b<=0)|b>1.2*sd(y),-Inf,-(v/2+1)*log(b^2)-v*(sd(y))^2/(2*b^2)+1/2*log(b^2))
}

lpost1<-function(par)
{
a<-par[1]
b<-par[2]
lpr1(a)+lpr2(b)+kal(par)
}

nlpost1<-function(par) -lpost1(par)

pop<-optim(c(2,1),nlpost1)
pop<-optim(pop$par,nlpost1,hessian=T)
pop

lpost1m<-function(par) lpost1(par+pop$par)
nlpost1m<-function(par) -lpost1m(par)
pop1m<-optim(c(0,0),nlpost1m)
pop1m<-optim(pop1m$par,nlpost1m,hessian=T)
pop1m

sc<-chol(pop1m$hess)
sdd<-diag(sc) #named sdd to avoid masking base sd()
nb=5000
#with batches, spacing, ~5 min run
outm<-metrop(lpost1m,1e-2*sdd,nb,blen=5,nspac=5)
samm<-outm$batch
dim(samm)
acf(samm)
samm<-thin(samm,5)
dim(samm)
acf(samm)
par(mfrow=c(2,1))
plot(samm[501:1000,1],type="l")
plot(samm[501:1000,2],type="l")
par(mfrow=c(1,1))

#plot the Kalman-filter state using samm from lpost1m

ax<-samm[501:1000,1]+pop$par[1]
bx<-samm[501:1000,2]+pop$par[2]

state<-matrix(NA,500,50)

for (j in 1:500)
{
a<-ax[j]
b<-bx[j]

H=matrix(1,1,1)
F=matrix(1,1,1)
N=50
dim(y)=c(1,N)

xe<-ye<-ze<-matrix(NA,1,N)
xe[,1]<-1
ye[,1]<-H%*%xe[,1]
ze[,1]<-y[,1]-ye[,1]

P<-K<-array(data=NA,dim=c(N,1,1))
P[1,,]<-0
for (i in 1:(N-1))
{
P[i+1,,]<-F%*%P[i,,]%*%t(F)+a^2
K[i+1,,]<-P[i+1,,]%*%t(H)%*%solve(b^2+H%*%P[i+1,,]%*%t(H))
P[i+1,,]<-(diag(1,1)-K[i+1,,]%*%H)%*%P[i+1,,]
xe[1,i+1]<-F%*%xe[,i]+K[i+1,,]%*%(y[,i+1]-H%*%F%*%xe[,i])
}

P[1,,]<-P[2,,]
state[j,]<-rnorm(N,xe[1,1:N],P[1:N,,]^0.5)
}

par(mfrow=c(1,2))
plot(1:N,y,col="blue",ylim=c(-12,8))
for (i in 1:99)
{
qi<-qin<-rep(NA,N)
tj<-1:N
for (k in 1:N)
{
qi[k]<-quantile(state[,k],(i-0.5)/100)
qin[k]<-quantile(state[,k],(i+1-0.5)/100)
}
polygon(c(tj,rev(tj)),c(qi[1:N],rev(qin[1:N])),col=rgb(0,0,0,45*dnorm(i,mean=50,sd=18)),border=FALSE)
}
plot(mod1,ylim=c(-12,8))
par(mfrow=c(1,1))

#very plausible, but

mean(ax)
mean(mod1$sigma.level[501:1000])

mean(bx)
mean(mod1$sigma.obs[501:1000])

par(mfrow=c(1,2))
qqplot(ax,mod1$sigma.level[501:1000])
lines(ax,ax,col="red")
qqplot(bx,mod1$sigma.obs[501:1000])
lines(bx,bx,col="red")
par(mfrow=c(1,1))

#ax not sampling sigma.level
#bx not quite sampling sigma.obs



Re: [R] mgcv: distribution of dev with Poisson data

2014-02-05 Thread Greg Dropkin
thanks Simon

also, it appears, at least with ML, that the default scale estimate with
quasipoisson (i.e. the one using dev) is the scale which minimises the ML
score of the fitted model. So it is the best-scoring model, but it
doesn't actually give the correct mean-variance relation. Is that right?

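Incidentally, your low-count point is easy to see directly, with no
model fitting at all -- a minimal sketch of my own, scoring iid Poisson
data against the known true mean:

set.seed(1)
mu<-0.15
n<-1000
yy<-rpois(n,mu)
#deviance against the known mean: far below n when mu is small
2*sum(ifelse(yy==0,0,yy*log(yy/mu))-(yy-mu))
#Pearson statistic: close to n regardless
sum((yy-mu)^2/mu)
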
thanks again

Greg

 Greg,

 The deviance being chi^2 distributed on the residual degrees of freedom
 works in some cases (mostly where the response itself can be reasonably
 approximated as Gaussian), but rather poorly in others (notably low
 count data). This is what you are seeing in your simulations - in the
 first case the Poisson mean is relatively high - so the normal
 approximation to the Poisson is not too bad and the deviance is approx.
 chi.sq on residual df. In the second case the Poisson mean is low, there
 are lots of zeroes, and the approximation breaks down. So yes, the
 Pearson estimator is probably a better bet in the latter case.

 best,
 Simon

 ps. Note that the chi squared approximation for the *difference* in
 deviance between two nested models does not suffer from this problem.


 On 04/02/14 09:25, Greg Dropkin wrote:
 [original post and code snipped -- see the full post below]




 --
 Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
 +44 (0)1225 386603   http://people.bath.ac.uk/sw283





Re: [R] mgcv: distribution of dev with Poisson data

2014-02-05 Thread Greg Dropkin
hi Simon

yes, I also got the right shape of the mean-variance relation but the
wrong estimate of the parameter.

thanks very much

Greg

 Hi Greg,

 Yes, this sounds right - with quasipoisson gam uses `extended
 quasi-likelihood' (see McCullagh and Nelder's GLM book) to allow
 estimation of the scale parameter along with the smoothing parameters
 via (RE)ML, and it could well be that this gives a biased scale estimate
 with low counts (although the shape of the mean variance relationship is
 unaffected by this).

 It might well be more sensible to default to a Pearson estimate of the
 scale parameter for 'gam' ('bam' and 'gamm' already do this), to avoid
 this issue with quasipoisson and low counts. Your email reminded me that
 someone had reported rather too high p-values for quasipoisson with
 this sort of data, and looking back at my list of open issues I see
 that `someone' was you as well! It's possible that moving to a Pearson
 estimator of the scale will solve that problem too.

 Thanks for this... very helpful.

 best,
 Simon


 On 05/02/14 12:56, Greg Dropkin wrote:
 [earlier messages in this thread quoted in full snipped -- see below]

Re: [R] mgcv: distribution of dev with Poisson data

2014-02-05 Thread Greg Dropkin
Dear Prof Ripley

yes, but if the estimate is biased it's good to know what the bias is.

The problem illustrated in the simulations has nothing to do with ML,
though, as the default fitting method in mgcv when scale is unknown is
GCV and that is what was used, by default, here.

The point about ML was that I'd looked separately at what happens when
scale varies in fitting a quasipoisson model, and found that ML was
minimised (in an example) at the scale estimate (i.e. the default estimate
using dev). A very brief look at the code made that seem plausible as it
does involve the derivatives with respect to scale.
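
That observation is easy to reproduce in the abstract -- a sketch of my
own, using the phi-dependent part of the extended quasi-likelihood that
Simon mentioned, with a hypothetical deviance D and sample size n:

D<-500
n<-1000
eql<-function(phi) -(D/phi+n*log(phi))/2
optimise(eql,c(0.1,2),maximum=TRUE)$maximum #maximised at phi = D/n = 0.5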

In my limited experience ML is actually better behaved than GCV or REML,
giving more conservative estimates and tighter CIs which behave better
under simulation.

re the Pearson estimates being preferable, that is also what Simon Wood's
book says on p172, another reason why I was puzzled that the default
estimate used dev instead.

thanks

Greg

 On 05/02/2014 12:56, Greg Dropkin wrote:
 thanks Simon
 also, it appears at least with ML that the default scale estimate with
 quasipoisson (i.e. using dev) is the scale which minimises the ML value
 of the fitted model. So it is the best model but doesn't actually give
 the correct mean-variance relation. Is that right?

 Well, estimates do not usually give the correct value ...

 But ML is biased in many cases, sometimes badly so here, and most people
 prefer the Pearson estimator of the dispersion. This is in McCullagh &
 Nelder's book, even the first edition, but without much evidence:
 Venables & Ripley (2002, §7.5) has plots to back up their hunch.

 [remainder of the quoted thread snipped]

[R] mgcv: distribution of dev with Poisson data

2014-02-04 Thread Greg Dropkin
mgcv: distribution of dev

hi

I can't tell if this is a simple error.
I'm puzzled by the distribution of dev when fitting a gam to Poisson
generated data.
I expected dev to be approximately chi-squared on residual d.f., i.e.
about 1000 in each case below.
In particular, the low values in the 3rd and 4th versions would suggest
scale < 1, yet the data is Poisson generated.
The problem isn't caused by insufficient k values in the smoother.
Does this mean that with sparse data the distribution of dev is no longer
approx chi-sq on residual df?
Does it mean the scale estimate when fitting quasipoisson should be the
Pearson version?

thanks

greg

library(mgcv)
set.seed(1)
x1<-runif(1000)
linp<-2+exp(-2*x1)*sin(8*x1)
sum(exp(linp))
dev1<-dev2<-sums<-rep(0,20)
for (j in 1:20)
{
y<-rpois(1000,exp(linp))
sums[j]<-sum(y)
m1<-gam(y~s(x1),family=poisson)
m2<-gam(y~s(x1,k=20),family=poisson)
dev1[j]=m1$dev
dev2[j]=m2$dev
}
mean(sums)
sd(sums)
mean(dev1)
sd(dev1)
mean(dev2)
sd(dev2)

#dev slightly higher than expected

linpa<--1+exp(-2*x1)*sin(8*x1)
sum(exp(linpa))
dev1a<-dev2a<-sumsa<-rep(0,20)
for (j in 1:20)
{
y<-rpois(1000,exp(linpa))
sumsa[j]<-sum(y)
m1<-gam(y~s(x1),family=poisson)
m2<-gam(y~s(x1,k=20),family=poisson)
dev1a[j]=m1$dev
dev2a[j]=m2$dev
}
mean(sumsa)
sd(sumsa)
mean(dev1a)
sd(dev1a)
mean(dev2a)
sd(dev2a)

#dev a bit lower than expected

linpb<--1.5+exp(-2*x1)*sin(8*x1)
sum(exp(linpb))
dev1b<-dev2b<-sumsb<-rep(0,20)
for (j in 1:20)
{
y<-rpois(1000,exp(linpb))
sumsb[j]<-sum(y)
m1<-gam(y~s(x1),family=poisson)
m2<-gam(y~s(x1,k=20),family=poisson)
dev1b[j]=m1$dev
dev2b[j]=m2$dev
}
mean(sumsb)
sd(sumsb)
mean(dev1b)
sd(dev1b)
mean(dev2b)
sd(dev2b)

#dev much lower than expected

m1q<-gam(y~s(x1),family=quasipoisson,scale=-1)
m1q$scale
m1q$dev/(1000-sum(m1q$edf))

#scale estimate < 1

sum((y-fitted(m1q))^2/fitted(m1q))/(1000-sum(m1q$edf))

#Pearson estimate of scale closer, but also < 1


linpc<--2+exp(-2*x1)*sin(8*x1)
sum(exp(linpc))
dev1c<-dev2c<-sumsc<-rep(0,20)
for (j in 1:20)
{
y<-rpois(1000,exp(linpc))
sumsc[j]<-sum(y)
m1<-gam(y~s(x1),family=poisson)
m2<-gam(y~s(x1,k=20),family=poisson)
dev1c[j]=m1$dev
dev2c[j]=m2$dev
}
mean(sumsc)
sd(sumsc)
mean(dev1c)
sd(dev1c)
mean(dev2c)
sd(dev2c)

#dev much lower than expected

m1q<-gam(y~s(x1),family=quasipoisson,scale=-1)
m1q$scale
m1q$sig2
m1q$dev/(1000-sum(m1q$edf))

#scale estimate < 1

sum((y-fitted(m1q))^2/fitted(m1q))/(1000-sum(m1q$edf))

#Pearson estimate of scale ok



Re: [R] gam and optim

2013-09-22 Thread Greg Dropkin
just to clarify how I see the error: it was the mis-definition of the
penalty term in the function dv. The following code corrects this error.
What is actually being minimised at this step is the penalised deviance,
conditional on the smoothing parameter. A second issue is that the optim
default (Nelder-Mead) would have to be recycled several times to
approximate the minimum obtained by gam (a sketch of that recycling
follows the code); in this example, however, BFGS gets it straight away.

sorry for all the confusion!

greg

set.seed(1)
library(mgcv)
x<-runif(100)
lp<-exp(-2*x)*sin(8*x)
y<-rpois(100,exp(lp))
m1<-gam(y~s(x),poisson)
W<-diag(fitted(m1))
X<-predict(m1,type="lpmatrix")
S<-m1$smooth[[1]]$S[[1]]
S<-rbind(0,cbind(0,S))
A<-X%*%solve(t(X)%*%W%*%X+m1$sp*S)%*%t(X)%*%W
sum(diag(A))
sum(m1$edf)
fit<-fitted(m1)
b<-m1$coef
range(exp(X%*%b)-fit)
z<-y/fit-1+X%*%b
range(A%*%z-X%*%b)

s<-m1$sp
dv<-function(t)
{
f<-exp(X%*%t)
-2*sum(y*log(f)-f-ifelse(y==0,0,y*log(y))+y)+s*t%*%S%*%t
}
dv(b)
m1$dev+m1$sp*b%*%S%*%b

t1<-optim(rep(0,10),dv,method="BFGS")
t1$p
b

dv(t1$p)
dv(b)

fit1<-exp(X%*%t1$p)

plot(x,y)
points(x,exp(lp),pch=16,col="green3")
points(x,fitted(m1),pch=16,cex=0.5,col="blue")
points(x,fit1,cex=1.5,col="red")
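
#a sketch of the Nelder-Mead "recycling" mentioned above: restarting
#optim from its own answer several times also approaches the gam minimum
t0<-rep(0,10)
for (k in 1:20) t0<-optim(t0,dv)$par
dv(t0) #approaches dv(b)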





 [earlier messages quoted in full snipped -- see below]







Re: [R] gam and optim

2013-09-20 Thread Greg Dropkin
please ignore this, I see the error.

greg

 [original post quoted in full snipped -- see below]





[R] gam and optim

2013-09-20 Thread Greg Dropkin
hi

probably a silly mistake, but I expected gam to minimise the penalised
deviance.

thanks

greg

set.seed(1)
library(mgcv)
x<-runif(100)
lp<-exp(-2*x)*sin(8*x)
y<-rpois(100,exp(lp))
plot(x,y)
m1<-gam(y~s(x),poisson)
points(x,exp(lp),pch=16,col="green3")
points(x,fitted(m1),pch=16,cex=0.5,col="blue")
W<-diag(fitted(m1))
X<-predict(m1,type="lpmatrix")
S<-m1$smooth[[1]]$S[[1]]
S<-rbind(0,cbind(0,S))
A<-X%*%solve(t(X)%*%W%*%X+m1$sp*S)%*%t(X)%*%W
sum(diag(A))
sum(m1$edf)
fit<-fitted(m1)
b<-m1$coef
range(exp(X%*%b)-fit)
z<-y/fit-1+X%*%b
range(A%*%z-X%*%b)

dv<-function(t)
{
f<-exp(X%*%t)
-2*sum(y*log(f)-f-ifelse(y==0,0,y*log(y))+y)+t%*%S%*%t
}
dv(b)
m1$dev+b%*%S%*%b

#so far, so good


t1<-optim(rep(0,10),dv)
t1$p
b

#different

dv(t1$p)
dv(b)

#different, and dv(t1$p) is lower!

fit1<-exp(X%*%t1$p)
points(x,fit1,pch=16,cex=0.5,col="red")

# different
# gam found b which does approximate the true curve, but does not
# minimise the penalised deviance, by a long shot.



[R] mgcv summary.gam

2013-06-21 Thread Greg Dropkin
hi

I'm trying to understand (a little) the code behind summary.gam, and have
the Biometrika article referred to in the help, but am stuck early on.

In the code, starting at line 167:

if (est.disp)
  rdf <- residual.df
else rdf <- -1
res <- testStat(p, Xt, V, df[i], type = p.type,
  res.df = rdf)

1) shouldn't there be some brackets in the if else statement?
2) what is testStat?
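
So far the closest I have got is inspecting the unexported object
directly (assuming testStat lives in mgcv's namespace, which the
summary.gam source suggests):

getAnywhere("testStat")
mgcv:::testStat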

thanks

Greg



Re: [R] REML - quasipoisson

2012-10-01 Thread Greg Dropkin
hi

the explanation seems to be that fix.family.ls is invoked to define a
saturated log likelihood for quasi families, which means that the F2q
term, which I had thought to be undefined, is actually defined in this
example as

F2q<--500*log(phiq)/2

The formula then becomes

F1q-F2q+F3q-F4q

which does coincide with m3$gcv

Note that when phiq=1, the F2q term vanishes.
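
To check this (a sketch -- fix.family.ls is documented on the
fix.family.link help page, though I have not traced the internals):

library(mgcv)
qpls<-fix.family.ls(quasipoisson())$ls
qpls #the saturated log-likelihood function added for quasipoisson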

greg




 [original post quoted in full snipped -- see the full post below]





[R] REML - quasipoisson

2012-09-25 Thread Greg Dropkin
hi

I'm puzzled as to the relation between the REML score computed by gam and
the formula (4) on p.4 here:
http://opus.bath.ac.uk/22707/1/Wood_JRSSB_2011_73_1_3.pdf

I'm ok with this for poisson, or for quasipoisson when phi=1.

However, when phi differs from 1, I'm stuck.

#simulate some data
library(mgcv)
set.seed(1)
x1<-runif(500)
x2<-rnorm(500)
linp<--0.5+x1+exp(-x2^2/2)*sin(4*x2)
y<-rpois(500,exp(linp))

##poisson
#phi=1
m1<-gam(y~s(x1)+s(x2),family=poisson,method="REML")
phi<-m1$scale

#formula
#1st term
S1<-m1$smooth[[1]]$S[[1]]*m1$sp[1]
S2<-m1$smooth[[2]]$S[[1]]*m1$sp[2]
S<-matrix(0,19,19)
for (i in 2:10)
{
for (j in 2:10)
{
S[i,j]=S1[i-1,j-1]
S[i+9,j+9]=S2[i-1,j-1]
}
}
beta<-m1$coef
#penalised deviance
Dp<-m1$dev+t(beta)%*%S%*%beta
F1<-Dp/(2*phi)

#2nd term
F2<-sum(ifelse(y==0,0,y*log(y)-y-log(factorial(y))))

#3rd term
X<-predict(m1,type="lpmatrix")
W<-diag(fitted(m1))
H<-t(X)%*%W%*%X
ldhs<-determinant(H+S,log=TRUE)$modulus[1]
eigS<-eigen(S,only.values=TRUE)$val
lds<-sum(log(eigS[1:16]))
F3<-(ldhs-lds)/2

#4th term
Mp=3
F4<-Mp/2*log(2*pi*phi)

F1-F2+F3-F4
m1$gcv
#reml score = formula

##quasipoisson with scale = 1
#fitting is identical to the poisson case
#F1, F3, and F4 unchanged but F2 is now undefined

m2<-gam(y~s(x1)+s(x2),family=quasipoisson,method="REML",scale=1)
F1+F3-F4
m2$gcv
#reml score = formula with F2 omitted

##quasipoisson with unknown scale
m3<-gam(y~s(x1)+s(x2),family=quasipoisson,method="REML",scale=-1)
phiq<-m3$scale

#1st term
S1q<-m3$smooth[[1]]$S[[1]]*m3$sp[1]
S2q<-m3$smooth[[2]]$S[[1]]*m3$sp[2]
Sq<-matrix(0,19,19)
for (i in 2:10)
{
for (j in 2:10)
{
Sq[i,j]=S1q[i-1,j-1]
Sq[i+9,j+9]=S2q[i-1,j-1]
}
}
betaq<-m3$coef
#penalised deviance
Dpq<-m3$dev+t(betaq)%*%Sq%*%betaq
F1q<-Dpq/(2*phiq)

#2nd term undefined

#3rd term
Xq<-predict(m3,type="lpmatrix")
Wq<-diag(fitted(m3))
Hq<-t(Xq)%*%Wq%*%Xq
ldhsq<-determinant(Hq+Sq,log=TRUE)$modulus[1]
eigSq<-eigen(Sq,only.values=TRUE)$val
ldsq<-sum(log(eigSq[1:16]))
F3q<-(ldhsq-ldsq)/2

#4th term
Mp=3
F4q<-Mp/2*log(2*pi*phiq)

F1q+F3q-F4q
m3$gcv

#quite different

#but if phiq is replaced by the Pearson estimate of the scale
P<-sum((y-fitted(m3))^2/fitted(m3))
phip<-P/(500-sum(m3$edf))
F1p<-Dpq/(2*phip)
F3p<-F3q
#third term independent of scale
F4p<-Mp/2*log(2*pi*phip)
F1p+F3p-F4p
m3$gcv

#closer but still wrong

#is there a value of phi which makes this work?
f<-function(t) (Dpq/(2*t) + F3q + Mp/2*log(2*pi*t) - m3$gcv)^2
optimise(f,interval=c(0.5,1.5))$min->phix

Dpq/(2*phix) + F3q + Mp/2*log(2*pi*phix)
m3$gcv
#but what is phix - not the Pearson estimate or the scale returned by gam?

thanks

Greg Dropkin



[R] mgcv: REML increase with new predictor

2012-02-24 Thread Greg Dropkin
hi Simon

this follows on from the example where gcv increased unexpectedly with
increasing basis dimension. I'm now looking at t2 tensor splines with
REML, and find that the REML score can increase when adding a new
predictor. Again, this seems odd.

thanks

Greg

library(mgcv)
set.seed(0)
#simulate some data
x1<-runif(2000)
x2<-rnorm(2000)
x3<-rpois(2000,3)
d<-runif(2000)
linp<--5+x1+0.3*x2+4*cos(x1*x2)
y<-rpois(2000,exp(linp))
#y does not depend on d
table(y)
sum(y)

m0<-gam(y~t2(x1)+t2(x2)+t2(x1,x2),family=quasipoisson,method="REML",scale=-1)
sc0<-m0$scale
sc0
m0<-gam(as.formula(m0),family=quasipoisson,method="REML",scale=sc0)
m1<-gam(update.formula(m0,~.+t2(d)),family=quasipoisson,method="REML",scale=sc0)

#dev has decreased slightly, though not significantly
m0$dev
m1$dev

#REML has increased, unexpected (to me)
m0$gcv
m1$gcv

#does this go away if using full=TRUE?

m0t<-gam(y~t2(x1,full=TRUE)+t2(x2,full=TRUE)+t2(x1,x2,full=TRUE),family=quasipoisson,method="REML",scale=-1)
sc0t<-m0t$scale
sc0t
m0t<-gam(as.formula(m0t),family=quasipoisson,method="REML",scale=sc0t)
m1t<-gam(update.formula(m0t,~.+t2(d,full=TRUE)),family=quasipoisson,method="REML",scale=sc0t)

#dev has still decreased slightly
m0t$dev
m1t$dev

#REML has still increased
m0t$gcv
m1t$gcv

#how about te rather than t2?

m0te<-gam(y~te(x1)+te(x2)+te(x1,x2),family=quasipoisson,method="REML",scale=-1)
sc0te<-m0te$scale
sc0te
m0te<-gam(as.formula(m0te),family=quasipoisson,method="REML",scale=sc0te)
m1te<-gam(update.formula(m0te,~.+te(d)),family=quasipoisson,method="REML",scale=sc0te)

#dev has still decreased slightly
m0te$dev
m1te$dev

#now REML decreases
m0te$gcv
m1te$gcv



Re: [R] mgcv: increasing basis dimension

2012-02-14 Thread Greg Dropkin
thanks Simon

I'll upgrade R to try t2. The data I'm actually analysing requires scaled
Poisson so I don't think REML is an option.

thanks

Greg

On 14/02/12 11:22 Simon Wood wrote:

That's interesting. Playing with the example, it doesn't seem to be a
local minimum. I think that this happens because, although the higher
rank basis contains the lower rank basis, the penalty can not simply
suppress all the extra components in the higher rank basis and recover
exactly what the lower rank basis gave: it's forced to include some of
the extra stuff, even if heavily penalized, and this is what is
degrading the higher rank fit in this case.

t2 tensor product smooths seem to be less susceptible to this effect,
and, for reasons I don't understand, so is REML based smoothness
selection (gam(...,method="REML"))

best,
Simon


 [original post quoted in full snipped -- see the full post below]





[R] mgcv: increasing basis dimension

2012-02-13 Thread Greg Dropkin
hi

Using a ts or tprs basis, I expected gcv to decrease when increasing the
basis dimension, as I thought this would minimise gcv over a larger
subspace. But gcv increased. Here's an example. thanks for any comments.

greg

#simulate some data
set.seed(0)
x1<-runif(500)
x2<-rnorm(500)
x3<-rpois(500,3)
d<-runif(500)
linp<--1+x1+0.5*x2+0.3*exp(-2*d)*sin(10*d)*x3
y<-rpois(500,exp(linp))
sum(y)

library(mgcv)
#basis dimension k=5
m1<-gam(y~x1+x2+te(d,bs="ts")+te(x3,bs="ts")+te(d,x3,bs="ts"),family=poisson)

#basis dimension k=10
m2<-gam(y~x1+x2+te(d,bs="ts",k=10)+te(x3,bs="ts",k=10)+te(d,x3,bs="ts",k=10),family=poisson)

#gcv increased
m1$gcv
m2$gcv

summary(m1)
summary(m2)

gam.check(m1)
gam.check(m2)


#is this due to bs="ts"?

#basis dimension k=5
m1tp<-gam(y~x1+x2+te(d,bs="tp")+te(x3,bs="tp")+te(d,x3,bs="tp"),family=poisson)

#basis dimension k=10
m2tp<-gam(y~x1+x2+te(d,bs="tp",k=10)+te(x3,bs="tp",k=10)+te(d,x3,bs="tp",k=10),family=poisson)

m1tp$gcv
m2tp$gcv

#no

summary(m1tp)
summary(m2tp)

gam.check(m1tp)
gam.check(m2tp)



[R] Cubic splines in package mgcv

2011-08-16 Thread Greg Dropkin
re: Cubic splines in package mgcv

I don't have access to Gu (2002) but clearly the function R(x,z) defined
on p126 of Simon Wood's book is piecewise quartic, not piecewise cubic.

Like Kunio Takezawa (below) I was puzzled by the word "cubic" on p126.

As Simon Wood writes, this basis is not actually used by mgcv when
specifying bs="cr".

Maybe the point is that at the knot, this continuous function has
continuous 1st and 2nd derivatives, but a discontinuous 3rd derivative, so
in that sense it behaves like a cubic spline.

Greg


#using the code from p127 of Wood:
#compare Wood Fig 3.4 (p125)
#if the function were piecewise cubic the 3rd derivative would be
#piecewise constant

rk<-function(x,z)
{
((z-0.5)^2-1/12)*((x-0.5)^2-1/12)/4-((abs(x-z)-0.5)^4-(abs(x-z)-0.5)^2/2+7/240)/24
}

par(mfrow=c(2,2))
u<-seq(0,1,by=0.001)
plot(u,rk(u,5/6),main="function")

plot(u[-1],1e3*diff(rk(u,5/6),differences=1),main="1st derivative")

plot(u[-(1:2)],1e6*diff(rk(u,5/6),differences=2),main="2nd derivative")

plot(u[-(1:3)],1e9*diff(rk(u,5/6),differences=3),main="3rd derivative")
par(mfrow=c(1,1))

---

From: Simon Wood s.wood
Date: Sun, 6 Jan 2008 16:59:35 +

On Wednesday 26 December 2007 04:14, Kunio takezawa wrote:
 R-users
 E-mail: r-help at r-project.org
 My understanding is that package mgcv is based on
 Generalized Additive Models: An Introduction with R (by Simon N. Wood).
 On page 126 of this book, eq(3.4) looks like a quartic equation with
 respect to x, not a cubic equation. I am wondering if all routines which
 use cubic splines in mgcv are based on this quartic equation.
--- No, `mgcv' does not use the basis given on page 126. See sections
4.1.2-4.1.8 of the same book for the bases used.


 In my humble opinion, the '^4' in the first term
 of the second line of this equation should be '^3'.
--- Perhaps take a look at section 2.3.3 of Gu (2002) Smoothing Spline
ANOVA
for a bit more detail on this.



 K. Takezawa



-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283



Re: [R] gam plots and seWithMean

2010-10-25 Thread Greg Dropkin
hadn't realised the answer would be in the source code!

anyway, this appears to work. The only difference is in the last section.

greg

--

library(mgcv)

#simulate some data
x1<-runif(500)
x2<-rnorm(500)
x3<-rpois(500,3)
d<-runif(500)
t<-runif(500,20,50)
linp<--6.5+x1+2*x2-x3+2*exp(-2*d)*sin(2*pi*d)
lam<-t*exp(linp)
y<-rpois(500,lam)
sum(y)
table(y)

#fit the data without d by glm and with d by gam
f1<-glm(y~offset(log(t))+x1+x2+x3,poisson)
f2<-gam(update.formula(as.formula(f1),~.+s(d)),poisson)
anova(f1,f2)
summary(f2)
plot(f2)

#the solid line s(d)
dat<-data.frame(t,d,x1,x2,x3)
datn<-transform(dat,d=0)
dif<-predict(f2)-predict(f2,datn)
cdif<-dif-mean(dif)
points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2))

#another approach to the solid line s(d)
devAskNewPage(ask=T)
plot(f2)
premat2<-PredictMat(f2$smooth[[1]],data=dat)
dim(premat2)
pars<-f2$coef
pars2<-pars[5:13]
pars2<-as.matrix(pars2,9,1)
pars2
points(d,premat2%*%pars2,cex=0.5,col=rgb(0,0.6,0.3,0.2))
#premat2%*%pars2 = cdif

#confidence intervals when seWithMean = FALSE
devAskNewPage(ask=T)
plot(f2)
points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2))
Vp2<-f2$Vp[5:13,5:13]
se2<-sqrt(diag(premat2%*%Vp2%*%t(premat2)))
points(d,cdif+qnorm(0.975)*se2,cex=0.5,col=rgb(1,0,0,0.2))
points(d,cdif-qnorm(0.975)*se2,cex=0.5,col=rgb(0,0,1,0.2))
#numerical output for the confidence bands is given by
#cdif+qnorm(0.975)*se2
#cdif-qnorm(0.975)*se2

#confidence intervals when seWithMean = TRUE
devAskNewPage(ask=T)
plot(f2,seWithMean=T)
points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2))
premat<-cbind(rep(1,500),x1,x2,x3,premat2)
pars<-as.matrix(pars,13,1)
range(predict.gam(f2)-log(t)-as.numeric(premat%*%pars))
#so predict.gam(f2) = log(t) + as.numeric(premat%*%pars)
sew<-sqrt(diag(premat%*%f2$Vp%*%t(premat)))
range(sew-predict(f2,se.fit=T)$se.fit)
#sew = predict(f2,se.fit=T)$se.fit
points(d,cdif+qnorm(0.975)*sew,cex=0.5,col=rgb(1,0,0,0.2))
points(d,cdif-qnorm(0.975)*sew,cex=0.5,col=rgb(0,0,1,0.2))
#so this is not what plot.gam is doing

#but this is
devAskNewPage(ask=T)
plot(f2,seWithMean=T)
points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2))
premat<-cbind(rep(1,500),x1,x2,x3,premat2)
pars<-as.matrix(pars,13,1)
premat1<-premat
premat1[,1:4]<-rep(f2$cmX[1:4],each=500)
sew1<-sqrt(diag(premat1%*%f2$Vp%*%t(premat1)))
points(d,cdif+qnorm(0.975)*sew1,cex=0.5,col=rgb(1,0,0,0.2))
points(d,cdif-qnorm(0.975)*sew1,cex=0.5,col=rgb(0,0,1,0.2))
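
#check on f2$cmX (my reading of the mgcv docs: cmX holds the column means
#of the model matrix, with entries for smooth columns set to zero)
range(f2$cmX[1:4]-colMeans(premat)[1:4]) #~0 for the parametric columns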



[R] gam plots and seWithMean

2010-10-21 Thread Greg Dropkin
hello

I'm learning mgcv and would like to obtain numerical output corresponding
to plot.gam.

I can do so when seWithMean=FALSE (the default)
but only approximately when seWithMean=TRUE.

Can anyone show how to obtain the exact values?

Alternatively, can you clarify the explanation in the manual:
"Note that, if seWithMean=TRUE, the confidence bands include
the uncertainty about the overall mean. In other words although
each smooth is shown centred, the confidence bands are obtained
as if every other term in the model was constrained to have average 0
(average taken over the covariate values), except for the smooth concerned."


an example with a poisson gam
(re-run this a few times as the plots can vary significantly)

---

library(mgcv)

#simulate some data
x1<-runif(500)
x2<-rnorm(500)
x3<-rpois(500,3)
d<-runif(500)
t<-runif(500,20,50)
linp<--6.5+x1+2*x2-x3+2*exp(-2*d)*sin(2*pi*d)
lam<-t*exp(linp)
y<-rpois(500,lam)
sum(y)
table(y)

#fit the data without d by glm and with d by gam
f1<-glm(y~offset(log(t))+x1+x2+x3,poisson)
f2<-gam(update.formula(as.formula(f1),~.+s(d)),poisson)
anova(f1,f2)
summary(f2)
plot(f2)

#the solid line s(d)
dat<-data.frame(t,d,x1,x2,x3)
datn<-transform(dat,d=0)
dif<-predict(f2)-predict(f2,datn)
cdif<-dif-mean(dif)
points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2))

#another approach to the solid line s(d)
devAskNewPage(ask=T)
plot(f2)
premat2<-PredictMat(f2$smooth[[1]],data=dat)
dim(premat2)
pars<-f2$coef
pars2<-pars[5:13]
pars2<-as.matrix(pars2,9,1)
pars2
points(d,premat2%*%pars2,cex=0.5,col=rgb(0,0.6,0.3,0.2))
#premat2%*%pars2 = cdif

#confidence intervals when seWithMean = FALSE
devAskNewPage(ask=T)
plot(f2)
points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2))
Vp2<-f2$Vp[5:13,5:13]
se2<-sqrt(diag(premat2%*%Vp2%*%t(premat2)))
points(d,cdif+qnorm(0.975)*se2,cex=0.5,col=rgb(1,0,0,0.2))
points(d,cdif-qnorm(0.975)*se2,cex=0.5,col=rgb(0,0,1,0.2))
#numerical output for the confidence bands is given by
#cdif+qnorm(0.975)*se2
#cdif-qnorm(0.975)*se2

#confidence intervals when seWithMean = TRUE
devAskNewPage(ask=T)
plot(f2,seWithMean=T)
points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2))
premat<-cbind(rep(1,500),x1,x2,x3,premat2)
pars<-as.matrix(pars,13,1)
range(predict.gam(f2)-log(t)-as.numeric(premat%*%pars))
#so predict.gam(f2) = log(t) + as.numeric(premat%*%pars)
sew<-sqrt(diag(premat%*%f2$Vp%*%t(premat)))
range(sew-predict(f2,se.fit=T)$se.fit)
#sew = predict(f2,se.fit=T)$se.fit
points(d,cdif+qnorm(0.975)*sew,cex=0.5,col=rgb(1,0,0,0.2))
points(d,cdif-qnorm(0.975)*sew,cex=0.5,col=rgb(0,0,1,0.2))
#sort of

#smoothing the sew estimates
devAskNewPage(ask=T)
plot(f2,seWithMean=T)
points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2))
lines(smooth.spline(cdif+qnorm(0.975)*sew~d),col="red")
lines(smooth.spline(cdif-qnorm(0.975)*sew~d),col="blue")
points(smooth.spline(cdif+qnorm(0.975)*sew~d)$x,smooth.spline(cdif+qnorm(0.975)*sew~d)$y,cex=0.8,col=rgb(1,0,0,0.1))
points(smooth.spline(cdif+qnorm(0.975)*sew~d)$x,smooth.spline(cdif-qnorm(0.975)*sew~d)$y,cex=0.8,col=rgb(0,0,1,0.1))
#close, but not exact
#numerical values corresponding to sort(d) are extracted as
#smooth.spline(cdif+qnorm(0.975)*sew~d)$y
#smooth.spline(cdif-qnorm(0.975)*sew~d)$y

#smoothing sew with 93% confidence levels
devAskNewPage(ask=T)
plot(f2,seWithMean=T)
points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2))
lines(smooth.spline(cdif+qnorm(0.965)*sew~d),col="red")
lines(smooth.spline(cdif-qnorm(0.965)*sew~d),col="blue")
points(smooth.spline(cdif+qnorm(0.965)*sew~d)$x,smooth.spline(cdif+qnorm(0.965)*sew~d)$y,cex=0.8,col=rgb(1,0,0,0.1))
points(smooth.spline(cdif+qnorm(0.965)*sew~d)$x,smooth.spline(cdif-qnorm(0.965)*sew~d)$y,cex=0.8,col=rgb(0,0,1,0.1))
#possibly closer, but not exact
#numerical values corresponding to sort(d) are extracted as
#smooth.spline(cdif+qnorm(0.965)*sew~d)$y
#smooth.spline(cdif-qnorm(0.965)*sew~d)$y



[R] matrix^(-1/2)

2009-12-04 Thread Greg Dropkin
re [R] matrix^(-1/2)

re the discussion in November on this thread. I don't know about expm but
the problem must be equivalent to solve(B^(1/2)), and a solution will exist
iff B is invertible and has a square root A with A%*%A = B. For 2x2
matrices, necessary and sufficient conditions for B to have a square root
are that either B = diag(0,2) or B%*%B != diag(0,2). This follows from the
Cayley-Hamilton identity B%*%B - tr(B)*B + det(B)*I = 0 (and likewise
A%*%A - tr(A)*A + det(A)*I = 0, where tr is the trace and I the identity).
Many non-symmetric B satisfy B%*%B != diag(0,2) and many of them are
invertible, e.g. all the rotations.

If expm has a function, perhaps Exp, which exponentiates matrices using
matrix multiplication, then if B = Exp(C), B^(-1/2) = Exp(-C/2) would be a
solution. There is an open neighbourhood of the identity matrix in which B
= Exp(C) must hold. Since t(B) = Exp(t(C)), such B will not in general be
symmetric.

If N satisfies N%*%N=0, then B = I+N, where I is the identity matrix, will
have B^(-1/2) = I-N/2. Such B are not symmetric, as N cannot be symmetric
(for non-zero N).

So, for B^(-1/2) to exist, B must be invertible but need not be symmetric.
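
An explicit non-symmetric example via the eigendecomposition (a sketch;
R handles the complex eigenvalues of a rotation, and the principal
square root comes out real):

theta<-pi/3
B<-matrix(c(cos(theta),sin(theta),-sin(theta),cos(theta)),2,2)
e<-eigen(B) #complex eigenvalues exp(+/- i*theta)
Bmh<-Re(e$vectors%*%diag(e$values^(-1/2))%*%solve(e$vectors))
round(Bmh%*%Bmh%*%B,12) #~identity, so Bmh = B^(-1/2)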

hope that helps

Greg



[R] scaled Schoenfeld residuals

2009-09-24 Thread Greg Dropkin
hi

sorry if this has been discussed before, but I'm wondering why the scaled
Schoenfeld residuals do not follow the defining formula for obtaining them
from the ordinary Schoenfeld residuals, but are instead offset by the
estimated parameter values.

e.g.

library(survival)
attach(ovarian)
sv<-Surv(futime,fustat)
f1<-coxph(sv~age+ecog.ps)
f1
schres<-resid(f1,type="schoenfeld")
schresc<-resid(f1,type="scaledsch")
n=sum(fustat)
V<-f1$var
schresc1<-t(n*V%*%t(schres))

#schresc1 is how the scaled Schoenfeld residuals are defined
#in terms of the number of events
#variance of the parameter estimates,
#and ordinary Schoenfeld residuals
#but schresc1 and schresc differ

schresc
schresc1

#schresc is schresc1 offset by the parameter estimates

beta<-as.vector(f1$coef)
nbeta<-outer(rep(1,n),beta)
nbeta
schresc-nbeta
schresc1

#is there a reason for the offset
#or am I missing something?

thanks

Greg

Greg Dropkin
gr...@gn.apc.org



Re: [R] scaled Schoenfeld residuals

2009-09-24 Thread Greg Dropkin
hi

thanks, I see that cox.zph is plotting and smoothing the scaled
Schoenfeld residuals as generated by R, but since the term is already in
the literature with a formula, maybe the help should clarify the offset. I
found it confusing anyway.

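A minimal sketch of that use, reusing f1 from my original post above --
the offset means the smooth is read directly as an estimate of beta(t):

zp<-cox.zph(f1)
plot(zp[1]) #smoothed scaled residuals for age, i.e. beta_age(t)
abline(h=coef(f1)[1],col="red",lty=2) #time-constant estimate
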
thanks for help

greg



Thomas Lumley tlumley at u.washington.edu
Thu Sep 24 16:18:17 CEST 2009




On Wed, 23 Sep 2009, Greg Dropkin wrote:

 hi

 sorry if this has been discussed before, but I'm wondering why the scaled
 Schoenfeld residuals do not follow the defining formula for obtaining them
 from the ordinary Schoenfeld residuals, but are instead offset by the
 estimated parameter values.


Because their purpose in life is to be smoothed against time to get an
estimate of the parameter as a function of time (plot.cox.zph).

  -thomas

Thomas Lumley   Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle



[R] score test statistic in logistic regression

2009-03-03 Thread Greg Dropkin
re post from

bkelcey at umich.edu bkelcey at umich.edu
Wed Feb 27 15:09:48 CET 2008

If the response y is given as the proportion of successes out of n trials,
and y, n, p, x, and z are vectors of length M, and the model is logit(p) =
b0 + b1*x + b2*z, then for the score test of the null hypothesis b1=0 use

des<-array(c(rep(1,M),x,z),dim=c(M,3))
m0<-glm(y~z,binomial,weights=n)
f<-fitted(m0)
efscor<-1:3
for (i in 1:3)
{
efscor[i]<-sum((y-f)*n*des[,i])
}
fim<-outer(1:3,1:3)
for (i in 1:3)
{
for (j in 1:3)
{
fim[i,j]<-sum(des[,i]*des[,j]*n*f*(1-f))
}
}
score<-crossprod(efscor,crossprod(solve(fim),efscor))
score
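
#cross-check (a sketch: in later R versions anova.glm offers the same
#score test via test="Rao")
m1<-glm(y~x+z,binomial,weights=n)
anova(m0,m1,test="Rao")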


ok?

greg
