[R] gam function time trend splines

2007-07-02 Thread Kevin Sorensen
I've been doing a simple time-series analysis looking
at the relationship between daily pneumonia
hospitalizations and daily temperature.  To mimic some
of the literature, I've been including a time-trend to
try to account for normal cyclical trends in
hospitalization.  So I've been using a function that
looks something like this:

gam(pneucount ~ temp_f +
    s(day, bs="cr", k=(4*totalyears)+1), ...)

day being the enumerated day in the analysis (1-365
for a 1 year analysis). 

This seems to work well enough.  What troubles me is
when I think about doing an analysis focusing on
winter days using more than one year of data.  If I
just delete the summer days from the dataset, the time
trend spline is trying to anneal counts from the end
of one winter with the beginning of another, which
doesn't seem right to me.  

What's the route to a statistically defensible result?
Is it as simple as using the subset option?  Or would
I need to create an indicator variable for each winter
I'm interested in and work it into a `by' statement somehow
(with an extra term for the levels of that indicator,
I assume)?

Thanks in advance for helping an Epi student who's
being exposed to all this for the first time.

Sincerely,

Kevin Sorensen 


  


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gam function time trend spline

2007-07-02 Thread Roger Peng
If you're looking only at winter days then you probably don't need to
remove seasonal trends, do you?
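
For reference, the per-winter `by' alternative raised in the question can be sketched as below. This is only an illustration on simulated data: the column names `winter' and `winter_day' (day counted within each winter) are assumptions, not names from the original analysis, and the Poisson family is likewise assumed.

```r
library(mgcv)

set.seed(42)
# Hypothetical data: three winters of 90 days each (all names are illustrative)
winters <- 3
days_per_winter <- 90
dat <- data.frame(
  winter     = factor(rep(seq_len(winters), each = days_per_winter)),
  winter_day = rep(seq_len(days_per_winter), times = winters)
)
dat$temp_f <- 35 + 10 * cos(2 * pi * dat$winter_day / 180) +
  rnorm(nrow(dat), sd = 5)
dat$pneucount <- rpois(nrow(dat), lambda = exp(1.5 - 0.01 * dat$temp_f))

# One within-winter trend per winter. The parametric `winter` term carries
# each winter's mean level, because every by-smooth is centred.
fit <- gam(pneucount ~ temp_f + winter +
             s(winter_day, by = winter, bs = "cr", k = 5),
           family = poisson, data = dat)

length(fit$smooth)  # one smooth per level of `winter`
```

With this layout no single spline has to join the end of one winter to the start of the next.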

-roger

On 7/2/07, Kevin Sorensen [EMAIL PROTECTED] wrote:
 I've been doing a simple time-series analysis looking
 at the relationship between daily pneumonia
 hospitalizations and daily temperature.  To mimic some
 of the literature, I've been including a time-trend to
 try to account for normal cyclical trends in
 hospitalization.  So I've been using a function that
 looks something like this:

 gam(pneucount ~ temp_f +
     s(day, bs="cr", k=(4*totalyears)+1), ...)

 day being the enumerated day in the analysis (1-365
 for a 1 year analysis).

 This seems to work well enough.  What troubles me is
 when I think about doing an analysis focusing on
 winter days using more than one year of data.  If I
 just delete the summer days from the dataset, the time
 trend spline is trying to anneal counts from the end
 of one winter with the beginning of another, which
 doesn't seem right to me.

 What's the route to a statistically defensible result?
 Is it as simple as using the subset option?  Or would
 I need to create an indicator variable for each winter
 I'm interested in and work it into a `by' statement somehow
 (with an extra term for the levels of that indicator,
 I assume)?

 Thanks in advance for helping an Epi student who's
 being exposed to all this for the first time.

 Sincerely,

 Kevin Sorensen


   
 



-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/



[R] GAM for censored data? (survival analysis)

2007-06-29 Thread Eric Peterson
First let me admit that I am no statistician... rather, an ecologist with
just enough statistical knowledge to be dangerous.
 
I've got a dataset with percent ground cover values for species and other
entities.  The data are left censored at zero, in that percent ground cover
cannot be negative.  (My data rarely reach 100% cover so I haven't bothered
with adding a right censoring at 100).  I've done some previous analyses
using survival analysis methods to create a predictive model for an entity
of particular interest... library(survival); survreg(Surv(Y) ~ X).
 
However, I know my data do not really match linear modeling and would like
to work with some alternate methods, one of which is GAM.  I noticed that
Yee and Mitchell (1991, p.589) stated that GAM is appropriate for certain
types of survival data.  How do I implement a survival data model in GAM
with R?  I've searched both R help and the R site search, but not found
anything relevant.  
 
Would it be as simple as library(survival); library(mgcv); gam(Surv(Y) ~
X)  ???
 
While I have your attention, I have a related second question.  I'd like to
model one entity (percent ground cover) as a function of another (also
percent ground cover).  Is there any way to deal with a censored predictor
variable as well as the censored response?
 
Citation: Yee, T. W. & N. D. Mitchell.  1991.  Generalized additive models
in plant ecology.  Journal of Vegetation Science 2: 587-602.
 
Thanks,
-Eric Peterson
Vegetation Ecologist
Nevada Natural Heritage Program



[R] gam function in the mgcv library

2007-06-25 Thread Bill Wheeler
I would like to fit a logistic regression using a smoothing spline, where the 
spline is a piecewise cubic polynomial. Is the knots option used to define the 
subintervals for each piece of the cubic spline? If yes and there are k knots, 
then why does the coefficients field in the returned object from gam only list 
k coefficients? Shouldn't there be 4k - 4 coefficients?

Sincerely,

Bill

   


Re: [R] gam function in the mgcv library

2007-06-25 Thread Simon Wood
On Monday 25 June 2007 13:26, Bill Wheeler wrote:
 I would like to fit a logistic regression using a smoothing spline, where
 the spline is a piecewise cubic polynomial. Is the knots option used to
 define the subintervals for each piece of the cubic spline? 
- if you use something like 
gam(y~s(x,bs="cr",k=5),family=binomial,knots=list(x=c(0,.1,.3,.4,.8)))
then yes, k is the number of knots and the `knots' list specifies where they 
occur. If you use the default `bs="tp"' then the spline basis functions are 
not really `knot' based, being instead an ordered set of eigenfunctions, that 
are optimal in a defined sense (see Wood, 2003, JRSSB).

 If yes and 
 there are k knots, then why does the coefficients field in the returned
 object from gam only list k coefficients? Shouldn't there be 4k -4
 coefficients?

A k knot natural cubic spline only has k free coefficients, so that is all 
that mgcv::gam reports. If you are thinking about sections of cubic, then the 
other 3 coefficients of each section are determined by the spline  continuity 
conditions + the conditions of having zero second derivative at the end 
knots. Exact details of the `mgcv' cr basis are given in section 4.1.2 of 
my 2006  book (see ?gam), but all you really need to know is that it's a 
natural cubic spline basis parameterized in terms of function heights at the 
knots (although there  is a gam identifiability constraint absorbed into the 
parameterization which muddies this neat interpretability a little). 
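
The coefficient count described above can be checked with a small sketch. The data are simulated and purely illustrative; only the knot positions are taken from the example call earlier in this message.

```r
library(mgcv)

set.seed(1)
n <- 400
# Hypothetical covariate kept inside the range spanned by the knots
dat <- data.frame(x = runif(n, 0, 0.8))
# Hypothetical binary response whose log-odds vary smoothly with x
dat$y <- rbinom(n, 1, plogis(2 * sin(2 * pi * dat$x)))

knot_pos <- c(0, 0.1, 0.3, 0.4, 0.8)
fit <- gam(y ~ s(x, bs = "cr", k = 5), family = binomial,
           data = dat, knots = list(x = knot_pos))

fit$smooth[[1]]$bs.dim  # basis dimension k requested for the smooth
length(coef(fit))       # intercept + (k - 1) constrained spline coefficients
```

The identifiability constraint removes one coefficient from the smooth, and the model intercept makes up the difference, so the total is k rather than 4k - 4.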

best,
Simon


 Sincerely,

 Bill



-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283



[R] GAM/GLM parameters

2007-04-04 Thread james155

I think this might be a very basic question, but is there a simple way to
characterise the relationships that a gam or lm model has identified? I am
trying to create species distribution models based on climate, and want to
know whether, for example, higher temperatures (one of the predictor
variables) lead to a higher probability of species presence (dependent
variable). Also, how can you quantify the relative contribution of each
predictor variable to the final model?

Many thanks,

 James


[R] gam parameter predictions --Sorry for double posting

2007-03-27 Thread Luis Ridao Cruz
R-help,

Sorry for posting again the same question (dated 26-03-2007) but 
all my mails have been sent to the recycle bin without possibility
of recovery and thus I don't know if anyone has answered my query.

Here is the original message:

I'm applying a gam model (package mgcv) to predict
relative abundances of a fish species.

The covariates are year, month, vessel and statistical rectangle.


The model looks like this:

g1 <- gam(log(cpue) ~  s(rekt1) + s(year) + s(mon) + s(reg1), data =
dataTest)


Once the model is fitted to the data I want to get the mean model
estimates by year.

I do the following:

obsPred <- data.frame(year = dataTest$year, pred = predict(g1, type =
"response"))

gamFit <- tapply(obsPred$pred, list(year = obsPred$year), mean)



Is this correct?



Thanks in advance


> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          4.1
year           2006
month          12
day            18
svn rev        40228
language       R
version.string R version 2.4.1 (2006-12-18)



[R] gam parameter predictions

2007-03-26 Thread Luis Ridao Cruz
R-help,

I'm applying a gam model (package mgcv) to predict
relative abundances of a fish species.

The covariates are year, month, vessel and statistical rectangle.


The model looks like this:

g1 <- gam(log(cpue) ~  s(rekt1) + s(year) + s(mon) + s(reg1), data =
dataTest)


Once the model is fitted to the data I want to get the mean model
estimates by year.

I do the following:

obsPred <- data.frame(year = dataTest$year, pred = predict(g1, type =
"response"))

gamFit <- tapply(obsPred$pred, list(year = obsPred$year), mean)



Is this correct?
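
A minimal sketch of the by-year averaging described above, on simulated data. The data frame, its values and the reduced set of covariates are assumptions made for illustration. One point worth keeping in mind: because the model response is log(cpue), predict(..., type = "response") returns values on the log scale, so the year means are means of log(cpue).

```r
library(mgcv)

set.seed(5)
# Hypothetical catch data: 6 years, 50 records per year (names illustrative)
dataTest <- data.frame(
  year = rep(2000:2005, each = 50),
  mon  = rep(1:12, length.out = 300)
)
dataTest$cpue <- exp(0.1 * (dataTest$year - 2000) +
                       sin(2 * pi * dataTest$mon / 12) +
                       rnorm(nrow(dataTest), sd = 0.3))

g1 <- gam(log(cpue) ~ s(year, k = 5) + s(mon, k = 6), data = dataTest)

# Fitted values are on the scale of the modelled response, here log(cpue)
obsPred <- data.frame(year = dataTest$year,
                      pred = as.numeric(predict(g1, type = "response")))

# Mean fitted log(cpue) per year
gamFit <- tapply(obsPred$pred, list(year = obsPred$year), mean)
gamFit
```

Note that applying exp() to these means gives geometric rather than arithmetic means of the fitted cpue.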



Thanks in advance


> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          4.1
year           2006
month          12
day            18
svn rev        40228
language       R
version.string R version 2.4.1 (2006-12-18)



[R] GAM model selection and dropping terms based on GCV

2006-12-04 Thread aditya gangadharan
Hello,
I have a question regarding model selection and dropping of terms for GAMs 
fitted with package mgcv. I am following the approach suggested in Wood (2001), 
Wood and Augustin (2002).
 
I fitted a saturated model, and I find from the plots that for two of the 
covariates,
1. The confidence interval includes 0 almost everywhere
2. The degrees of freedom are NOT close to 1
3. The partial residuals from plot.gam don’t show much pattern visually (to me)
4. When I drop either or both of the terms, the GCV score increases;

This is my main problem: how much of an increase in GCV is ‘acceptable’ when 
terms are dropped? In the above case, the delta GCV scores are .03, .06 and .11 
when I drop covariate A, covariate B and both respectively from the full model. 
 
I would be very grateful for any advice on this.

Thank you
Best Wishes
Aditya



Re: [R] GAM model selection and dropping terms based on GCV

2006-12-04 Thread Simon Wood
On Monday 04 December 2006 12:30, aditya gangadharan wrote:
 Hello,
 I have a question regarding model selection and dropping of terms for GAMs
 fitted with package mgcv. I am following the approach suggested in Wood
 (2001), Wood and Augustin (2002).

 I fitted a saturated model, and I find from the plots that for two of the
 covariates, 1. The confidence interval includes 0 almost everywhere
 2. The degrees of freedom are NOT close to 1
 3. The partial residuals from plot.gam don’t show much pattern visually (to
 me) 4. When I drop either or both of the terms, the GCV score increases;

 This is my main problem: how much of an increase in GCV is ‘acceptable’
 when terms are dropped? In the above case, the delta GCV scores are .03,
 .06 and .11 when I drop covariate A, covariate B and both respectively from
 the full model. I would be very grateful for any advice on this.
- I'm not sure that there is really an answer to this. GCV  is based on 
minimizing some approximation to the expected prediction error of the model. 
So to answer the question you'd need to do something like decide how much 
increase from `optimal' prediction error you would be prepared to tolerate. 
I think that it's not all that easy to come up with a nice way of blending  
prediction error based approaches to model selection, with approaches based 
on finding a model that is somehow the simplest model consistent with the 
data (but perhaps other people will comment on this). 

- That said, there is certainly an issue relating to the fact that the GCV 
score (or AIC, in fact) is rather asymmetric, so that random variability in 
the score tends to lead more readily to overfitting than to underfitting. 
This suggests that in fact prediction error performance at finite sample 
sizes may be improved by shrinking the smoothing parameters themselves. With 
`mgcv::gam' you can do this by increasing the `gamma' parameter above its 
default value, which favours smoother models by making each model degree of 
freedom count as gamma degrees of freedom in the GCV score (or AIC/UBRE). It 
is possible to choose `gamma' by e.g. 10-fold cross-validation, but that 
requires some coding.

- There are more discussions of GAM model selection in various mgcv help files 
and my book. See help("mgcv-package") for details of which pages, and the 
reference. 
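
As a rough illustration of the `gamma' point above, the sketch below fits the same smooth twice on simulated data and compares effective degrees of freedom. The data and the value gamma = 1.4 are arbitrary choices for illustration, not recommendations from this thread.

```r
library(mgcv)

set.seed(2)
n <- 200
dat <- data.frame(x = runif(n))
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.5)

# method = "GCV.Cp" requests GCV smoothness selection explicitly
fit_default  <- gam(y ~ s(x, k = 20), data = dat, method = "GCV.Cp")
fit_smoother <- gam(y ~ s(x, k = 20), data = dat, method = "GCV.Cp",
                    gamma = 1.4)

# Total effective degrees of freedom under each setting
c(default = sum(fit_default$edf), gamma_1.4 = sum(fit_smoother$edf))
```

Each effective degree of freedom is charged more heavily when gamma exceeds 1, so the second fit should be no wigglier than the first.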

My bottom line on model selection is to use things like GCV, AIC, confidence 
interval coverage and approximate p-values for guidance, but not as the basis 
for rules... modelling context has to play a part as well. 

Sorry if that's all a bit vague.

Simon


-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283



Re: [R] gam() question

2006-11-15 Thread Simon Wood
 Hi everyone,
 I am fitting a bivariate smoothing model by using gam.
 But I got an error message like this:
 Error in eigen(hess1, symmetric = TRUE) : 0 x 0 matrix
- this is a known problem in mgcv 1.3-20 (an optimizer fails to cope with  
convergence in one step).  It's fixed in 1.3-21, which I'll try and get 
uploaded to CRAN today.

Simon

-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283



[R] gam() question

2006-11-14 Thread seeTigers
Hi everyone,
I am fitting a bivariate smoothing model by using gam.
But I got an error message like this:
Error in eigen(hess1, symmetric = TRUE) : 0 x 0 matrix

If anyone knows how to figure it out, please let me know.
Thanks very much.



Re: [R] gam() question

2006-11-14 Thread Andrew Robinson
Hello,

it's really difficult for anyone to make a constructive response based
on your message.  The problem could be in:

1) the function you fit (which one is it?, and which package?)
2) the arguments that you supplied (what did you tell it to do?)
3) the data that you gave it (what are they?)

Try the following:

PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

Good luck!

Andrew

On Tue, Nov 14, 2006 at 04:40:09PM -0500, seeTigers wrote:
 Hi everyone,
 I am fitting a bivariate smoothing model by using gam.
 But I got an error message like this:
 Error in eigen(hess1, symmetric = TRUE) : 0 x 0 matrix
 
 If anyone knows how to figure it out, please let me know.
 Thanks very much.
 

-- 
Andrew Robinson  
Department of Mathematics and Statistics            Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia         Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/



[R] GAM Package: preplot.gam taking a **long** time

2006-08-14 Thread Paul Check
Hi: I have a large data set that I'm testing and I'm finding that 
preplot.gam is taking a very long amount of time to compute (like, more 
than 20 minutes). My machine is 32-bit, Debian unstable, 4GB memory, 
dual Xeon 3GHz. While the data set is very large, the gam() procedure is 
able to compute the model without any trouble, in a minute or so. Is 
there any reason why preplot.gam would be so slow? I am not using 
newdata in the preplot.gam function, so I am assuming that the memory 
is not a problem. Or could it be?

Is it normal for preplot.gam to take so long on large data sets? I have 
not had this experience with S-Plus, on a lower quality machine, same data.

Thanks, Paul



[R] GAM 2D-plotting

2006-08-04 Thread Nixon Bahamon
Hi,

When I fit a GAM model (using mgcv) with overlapping terms, such as

gam(y~s(x,z)+s(z,w))

and afterwards try to plot the component smooth functions that make it up
using plot.gam, I get a couple of 2D plots.

My question is: What's the meaning of those 2D plots in terms of y?

Regards,

Nixon



Re: [R] GAM selection error msgs (mgcv gam packages)

2006-06-21 Thread Simon Wood

 My question concerns 2 error messages; one in the gam package and one in
 the mgcv package (see below). I have read help files and Chambers and
 Hastie book but am failing to understand how I can solve this problem.
 Could you please tell me what I must adjust so that the command does not
 generate error message?

 I am trying to achieve model selection for a GAM which is required for
 prediction purposes, thus my focus is on AIC. My data set has 3038 records
 and 116 predictor variables and a binary response variable [0 or 1]. There
 is no current understanding of the predictors' relationship to response so
 I am relying on GAM for selection of appropriate predictors.

- I have some worries about using a GAM in this sort of situation - it 
seems like an odd model to start from to me: you don't know the 
relationship to the covariates, but do know that it should be additive? Is 
that really true? If it is then it may still be a lot to ask of the model 
selection methods to find a good model. (I'd certainly consider upping 
the `gamma' parameter in mgcv::gam).

- General uneasiness apart, the specific warning message relates to the 
number of distinct covariate values that you have (or number of distinct 
X,Y,Z triplets). Do any of the covariates for single smooths have fewer 
than 10 distinct values? There are more than 50 distinct x,y,z triplets, I 
suppose? If you have fewer distinct covariate points for a smooth than the 
default k (10), then you need to reduce k to the number of distinct 
points, or fewer.

- Finally, for speed reasons, I'd use the cr basis (see ?s) if doing 
this.
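
The point about distinct covariate values can be reproduced with a small sketch. The data are simulated and the covariate with only 8 distinct values is an assumption made for illustration.

```r
library(mgcv)

set.seed(3)
# Hypothetical covariate that takes exactly 8 distinct values
dat <- data.frame(x = rep(seq(0, 1, length.out = 8), each = 25))
dat$y <- rbinom(nrow(dat), 1, plogis(-1 + 2 * dat$x))

length(unique(dat$x))   # fewer than the default k = 10

# The default basis dimension is too large for this covariate, so the
# fit stops with an error, which is captured here rather than raised
default_fit <- tryCatch(gam(y ~ s(x), family = binomial, data = dat),
                        error = function(e) e)
inherits(default_fit, "error")

# Reducing k to no more than the number of distinct values lets it proceed
small_fit <- gam(y ~ s(x, k = 6), family = binomial, data = dat)
```

In a model with many single-covariate smooths, the same check would need to be made for each covariate in turn.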

best,
Simon

- Simon Wood, Mathematical Sciences, University of Bath, Bath BA2 7AY 
- +44 (0)1225 386603 www.maths.bath.ac.uk/~sw283/



 Thanks
 Savrina

 *mgcv package 1.3-12:

 # I start with specifying the full model with 116 predictors including
 isotropic smooth of 3D location variables (when I specify only the first
 14 predictors I get no error message)

 m0 <- gam(label~s(x,y,z,k=50)+s(feature4)+s(feature5)+s(feature6)+...+s(feature116),
 data=k.data, family=binomial)

 Error in smooth.construct.tp.smooth.spec(object, data, knots):
 A term has fewer unique covariate combinations than specified maximum
 degrees of freedom

 # I was going to follow this with backwards selection by hypothesis testing
 (remove highest p-val term one at a time) and also AIC comparison of all
 the models

 From help file entitled 'Generalised additive models with integrated
 smoothness estimation' I calculated the following where do I go from here?
 A) k is the basis dimension of a given term...if k is not specified
 k=10*3^(d-1) where 'd' is the number of covariates for this term
 My calculations: for all my terms but the first d=1 thus k=10*3^0=10.
 B) You must have more unique combinations of covariates than the model has
 total parameters
 My calculations: total parameters = sum of basis dimensions(50+10*113) +
 sum of non-spline terms(0) - number of spline terms(114) = 1066

 *gam package:
 I think stepwise selection provided by gam package would be useful in
 finding the best predictive model. I follow example on pg 283 from
 'Statistical models in S' Chambers and Hastie 1993.
 # I start with a full model where all predictors enter linearly
 k.start <- gam(label~., data=k.data, family=binomial)

 # set up scope list with possibilities for each term eg .~1 + x + s(x)
 # ignore the first column of the data set
 k.scope <- gam.scope(k.data[,-1])

 # start step wise selection
 k.step <- step(k.start,k.scope)
 #condensed output
 Start: AIC=1549.48
 label~s+y+z+feature4+feature5+...+feature116
              Df Deviance    AIC
 <none>           1319.5 1549.5
 - feature54  -1  1319.2 1551.2
 - feature26  -1  1319.2 1551.2
 ...
 - feature12  -1  1357.4 1589.4
 There were 50 or more warnings (use warnings() to see the first 50)

 # all 50 warnings are the same
 warnings()
 Warning messages:
 1: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x[, jj,
 drop = FALSE], y, wt, offset = object$offset,   ...

 # it seems to not get past the original linear model. It should show all
 the steps taken to the final model
 k.step$anova
   Step Df Deviance Resid. Df Resid. Dev      AIC
 1      NA       NA      2922   1317.599 1549.599



[R] GAM selection error msgs (mgcv gam packages)

2006-06-18 Thread scarrizo
Hi all,

My question concerns 2 error messages; one in the gam package and one in
the mgcv package (see below). I have read help files and Chambers and
Hastie book but am failing to understand how I can solve this problem.
Could you please tell me what I must adjust so that the command does not
generate error message?

I am trying to achieve model selection for a GAM which is required for
prediction purposes, thus my focus is on AIC. My data set has 3038 records
and 116 predictor variables and a binary response variable [0 or 1]. There
is no current understanding of the predictors' relationship to response so
I am relying on GAM for selection of appropriate predictors.

Thanks
Savrina

*mgcv package 1.3-12:

# I start with specifying the full model with 116 predictors including
isotropic smooth of 3D location variables (when I specify only the first
14 predictors I get no error message)

m0 <- gam(label~s(x,y,z,k=50)+s(feature4)+s(feature5)+s(feature6)+...+s(feature116),
data=k.data, family=binomial)

Error in smooth.construct.tp.smooth.spec(object, data, knots):
 A term has fewer unique covariate combinations than specified maximum
degrees of freedom

# I was going to follow this with backwards selection by hypothesis testing
(remove highest p-val term one at a time) and also AIC comparison of all
the models

From help file entitled 'Generalised additive models with integrated
smoothness estimation' I calculated the following where do I go from here?
A) k is the basis dimension of a given term...if k is not specified
k=10*3^(d-1) where 'd' is the number of covariates for this term
My calculations: for all my terms but the first d=1 thus k=10*3^0=10.
B) You must have more unique combinations of covariates than the model has
total parameters
My calculations: total parameters = sum of basis dimensions(50+10*113) +
sum of non-spline terms(0) - number of spline terms(114) = 1066

*gam package:
I think stepwise selection provided by gam package would be useful in
finding the best predictive model. I follow example on pg 283 from
'Statistical models in S' Chambers and Hastie 1993.
# I start with a full model where all predictors enter linearly
k.start <- gam(label~., data=k.data, family=binomial)

# set up scope list with possibilities for each term eg .~1 + x + s(x)
# ignore the first column of the data set
k.scope <- gam.scope(k.data[,-1])

# start step wise selection
k.step <- step(k.start,k.scope)
#condensed output
Start: AIC=1549.48
label~s+y+z+feature4+feature5+...+feature116
             Df Deviance    AIC
<none>           1319.5 1549.5
- feature54  -1  1319.2 1551.2
- feature26  -1  1319.2 1551.2
...
- feature12  -1  1357.4 1589.4
There were 50 or more warnings (use warnings() to see the first 50)

# all 50 warnings are the same
warnings()
Warning messages:
1: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x[, jj,
drop = FALSE], y, wt, offset = object$offset,   ...

# it seems to not get past the original linear model. It should show all
the steps taken to the final model
k.step$anova
  Step Df Deviance Resid. Df Resid. Dev      AIC
1      NA       NA      2922   1317.599 1549.599



[R] gam problem

2006-04-18 Thread vkatoma

Hello,

People, I have been trying to use mle for a normally distributed data set, but
it always reports an error on the gam object. Is there a solution to this?

Victor



Re: [R] gam y-axis interpretation

2006-03-28 Thread Simon Wood
All smooths in a GAM are `centred' in order to ensure model 
identifiability. This means that a smooth, s, is estimated subject to the 
constraint that \sum_i s(x_i)=0, where, x_i, are the covariate values. 
So you can't transform back to the response scale just by applying the 
inverse link, even if there is only one smooth. In the single smooth case, 
you would need to add on the model intercept before applying the inverse 
link. If you need plots on the response scale then it is best to use the 
`predict' method function and have it return results on the `response' 
scale...
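
A small sketch of the centring point above, using simulated binomial data (the data and variable names are assumptions made for illustration). It checks that the plotted smooth averages to zero over the observed covariate values, and that adding the intercept before the inverse link reproduces the fitted probabilities from `predict'.

```r
library(mgcv)

set.seed(4)
n <- 500
dat <- data.frame(x = runif(n))
dat$y <- rbinom(n, 1, plogis(-2 + 3 * dat$x^2))

fit <- gam(y ~ s(x), family = binomial, data = dat)

# The smooth alone, as drawn by plot.gam (linear predictor scale, centred)
s_x <- predict(fit, type = "terms")[, "s(x)"]
mean(s_x)   # essentially zero because of the centring constraint

# Adding the intercept before the inverse link recovers fitted probabilities
p_manual  <- plogis(coef(fit)[["(Intercept)"]] + s_x)
p_predict <- predict(fit, type = "response")
all.equal(as.numeric(p_manual), as.numeric(p_predict))
```

Applying the inverse link to the plotted smooth alone omits the intercept, which is why the two probabilities in the question below disagree.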

best,
Simon

- Simon Wood, Mathematical Sciences, University of Bath, Bath BA2 7AY
- +44 (0)1225 386603 www.maths.bath.ac.uk/~sw283/


On Thu, 23 Mar 2006, Bliese, Paul D LTC USAMH wrote:

 Sorry if this is an obvious question...



 I'm estimating a simple binomial generalized additive model using the
 gam function in the package mgcv.  The model makes sense given my data,
 and the predicted values also make sense given what I know about the
 data.



 However, I'm having trouble interpreting the y-axis of the plot of the
 gam object.  The y-axis is labeled s(x,2.52) which I understand to
 basically mean a smoothing estimator with approximately 2.52 degrees of
 freedom.  The y-axis in my case ranges from -2 to 6 and I thought that
 it would be possible to convert the Y axis estimate to a probability via
 exp(Y)/(1+exp(Y)).  So for instance, my lowest y-axis estimate is -2 for
 a probability of:

 exp(-2)/(1+exp(-2))

 [1] 0.1192029



 However, if I use the predict function my lowest estimate is -3.53862893
 for a probability of 2.8%.  The 2.8% estimate is a much better estimate
 than 11.9% given my specific data, so I'm clearly not interpreting the
 plot correctly.



 The help files say plot.gam provides the component smooth functions
 that make it up, on the scale of the

 linear predictor.



 I'm just not sure what that description means.  Does someone have
 another description that might help me grasp the plot?



 Similar plots are on page 286 of Venables and Ripley (3rd Edition)...



 Thanks,



 Paul








[R] gam y-axis interpretation

2006-03-23 Thread Bliese, Paul D LTC USAMH
Sorry if this is an obvious question...

 

I'm estimating a simple binomial generalized additive model using the
gam function in the package mgcv.  The model makes sense given my data,
and the predicted values also make sense given what I know about the
data.

 

However, I'm having trouble interpreting the y-axis of the plot of the
gam object.  The y-axis is labeled s(x,2.52) which I understand to
basically mean a smoothing estimator with approximately 2.52 degrees of
freedom.  The y-axis in my case ranges from -2 to 6 and I thought that
it would be possible to convert the Y axis estimate to a probability via
exp(Y)/(1+exp(Y)).  So for instance, my lowest y-axis estimate is -2 for
a probability of:

 exp(-2)/(1+exp(-2))

[1] 0.1192029

 

However, if I use the predict function my lowest estimate is -3.53862893
for a probability of 2.8%.  The 2.8% estimate is a much better estimate
than 11.9% given my specific data, so I'm clearly not interpreting the
plot correctly.

 

The help files say plot.gam provides "the component smooth functions
that make it up, on the scale of the linear predictor."

 

I'm just not sure what that description means.  Does someone have
another description that might help me grasp the plot? 

 

Similar plots are on page 286 of Venables and Ripley (3rd Edition)...

 

Thanks,

 

Paul
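
A minimal sketch of what "on the scale of the linear predictor" means in practice. It uses simulated data (x, y and the true curve are invented for illustration, they are not Paul's data). The plot of an mgcv gam object shows each smooth centred around zero on the logit scale, so the intercept has to be added back before applying exp(Y)/(1+exp(Y)); predict() already includes it.

```r
library(mgcv)

set.seed(42)                           # simulated data, purely illustrative
n <- 400
x <- runif(n)
y <- rbinom(n, size = 1, prob = plogis(-3 + 6 * x^2))

fit <- gam(y ~ s(x), family = binomial)

# What plot(fit) draws: the centred smooth s(x) on the logit scale
smooth_part <- predict(fit, type = "terms")[, "s(x)"]

# What predict(fit) returns by default: intercept + smooth, still on the logit scale
eta <- predict(fit, type = "link")
intercept <- coef(fit)[["(Intercept)"]]

# The two differ by the intercept, the only other term in this model
range(as.numeric(eta) - (intercept + as.numeric(smooth_part)))

# Probabilities come from back-transforming the full linear predictor
prob <- plogis(eta)
range(prob)
```

Reading -2 off the plot and transforming it directly skips the intercept, which is one plausible reason the plot-based and predict-based probabilities disagree.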

 

 




Re: [R] GAM using R tutorials?

2006-03-16 Thread Michael
Interesting! Very interesting! You seem to assume that I haven't done
anything as you've listed below:

I have even listed a book that I was aware of but I could not find in our
local library... and doing some research and I have already started
playing with gam, and everybody on earth knows that you can do ?gam and
help.search() in R to find info, but after I have done all those, I still
feel that I need some more info & help...

I don't know how you reach your ungrounded conclusion?

I think you are wasting everybody's bandwidth and time by not providing
constructive suggestions but discouraging (and overly repetitive) comments
disregarding the effort a newbie had already paid...

I think you really need to consult posting guide yourself...

And remember, you can always remain silent -- you don't need to show off
that you are an experienced user and for newbie your first response is
always something like scolding: why didn't you search?

I strongly believe that your answer is inappropriate!!!








Re: [R] GAM using R tutorials?

2006-03-16 Thread Michael
Hi Templ,

That's very helpful! Indeed I've printed out the gam portion and I am
digesting now...

Thank you so much!

I really appreciate your constructive and helpful advice!

Best,

Michael.




Re: [R] GAM using R tutorials?

2006-03-16 Thread Gavin Simpson
On Thu, 2006-03-16 at 00:25 -0800, Michael wrote:
 Interesting! Very interesting! You seem to assume that I haven't done
 anything as you've listed below:
 
 I have even listed a book that I was aware of but I could not find in our
 local library... and doing some research and I have already started
 playing with gam, and everybody on earth knows that you can do ?gam and
 help.search() in R to find info, but after I have done all those, I still
 feel that I need some more info & help...
 
 I don't know how you reach your ungrounded conclusion?
 
 I think you are wasting everybody's bandwidth and time by not providing
 constructive suggestions but discouraging (and overly repetitive) comments
 disregarding the effort a newbie had already paid...
 
 I think you really need to consult posting guide yourself...
 
 And remember, you can always remain silent -- you don't need to show off
 that you are an experienced user and for newbie your first response is
 always something like scolding: why didn't you search?
 
 I strongly believe that your answer is inappropriate!!!

What is inappropriate about pointing you at Simon Wood's article on gam()
(mgcv) in R and how to use it? Was this not what you asked for?

My other comments were related to the fact that this *was* in the
References section of ?gam, which the posting guide does ask you to
read. One can only assume you missed this reference when you looked at
the page or did not realise the significance of it.

And for the record, wherever possible I do try to reply constructively -
if I didn't have an answer to your question I would not have replied to
the list - but I did, and it was on the documentation provided...

G


-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [T] +44 (0)20 7679 5522
ENSIS Research Fellow [F] +44 (0)20 7679 7565
ENSIS Ltd.  ECRC [E] gavin.simpsonATNOSPAMucl.ac.uk
UCL Department of Geography   [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way[W] http://www.ucl.ac.uk/~ucfagls/
London.  WC1H 0AP.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] GAM using R tutorials?

2006-03-16 Thread Keith Chamberlain
Dear Templ (& Others),

Thank you for pointing out the contributed papers. It is one of those things
that I have been introduced to more than once, but I personally never
remembered to check the section out while in the acute phase of
problem-can't do it-what the heck is going on. 

Dear Michael,

Regarding frustrating replies, those will happen. I lost sleep over such
things in the past, only to learn that I didn't really have time to get
caught up. It is normal when investing time and effort into a reply to want
some kind of assurance that some legwork has been done. I think Gavin just
didn't get the sought after assurance from your message. The reply may not
have been particularly useful to you, per se, but it did serve as a
reminder to a wider audience of subscribers who read posts.

Given that this is done over email, and given the huge diversity of
subscribers around the globe with different language histories, the error
rate in communication could only increase from that of your or my locale.
Hence, the importance of the posting guide & FAQs.

More often than not, however, I've later found replies that I previously
considered terse or inappropriate to be quite accurate. On the other side, I
walked through an R&D lab once & pointed out a solution to a router
configuration problem that the technician had been working on for a few
days. Having been trained on that one system, it was quite obvious what
needed to be fixed. I expected the technician to be grateful, but he never
spoke to me again! 

Well, that's my 2 cents. I do acknowledge that it wasn't requested. My hope
is that someone out there in listland would find it useful, even if not you.
Yes, I guess I am having a conversation with the wind... 

Rgds,
KeithC.



Re: [R] GAM using R tutorials?

2006-03-15 Thread TEMPL Matthias
Have you looked at:

An Introduction to R: Software for Statistical Modelling & Computing by
Petra Kuhnert and Bill Venables

which is available at http://cran.r-project.org/other-docs.html

Hope this helps.

Best,
Matthias


 


Re: [R] GAM using R tutorials?

2006-03-15 Thread Gavin Simpson
On Tue, 2006-03-14 at 23:52 -0800, Michael wrote:
 Hi all,
 
 I am trying to use GAM to work on some data... Are there any resources
 providing hands-on tutorial/guide on how to do GAM on data in R?
 Specifically, I am not sure about which model to choose, and smooth models
 with which effective degree-of-freedom shall I use...
 
 I knew there is a book titled: GAM: an introduction using R. Unfortunately our
 local library does not have it... so that's not an option given time
 constraint.
 
 Thanks a lot for your pointers!
 
 Michael.

Michael,

Please learn to use the search tools provided for you! You have posted
numerous emails to the list recently, many of which you could have
solved for yourself if only you'd heeded people's advice and searched
for yourself.

For this problem;

1) I'd suggest to the local library that they might consider buying the
book, but in the meantime...

2) ...in R, do RSiteSearch("GAM") and look at the list shown in your
browser. The first hit is the help page for package mgcv. Look at the
references included on that help page - most are technical/statistical
papers, but a starting point might be the RNews article Simon Wood
wrote.

That should get yourself started. But if you'd done the search yourself,
you wouldn't have had to wait for someone on the list to do it for you.

Finally - Please read the posting guide - it is there for a reason.

HTH

G
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [T] +44 (0)20 7679 5522
ENSIS Research Fellow [F] +44 (0)20 7679 7565
ENSIS Ltd.  ECRC [E] gavin.simpsonATNOSPAMucl.ac.uk
UCL Department of Geography   [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way[W] http://www.ucl.ac.uk/~ucfagls/
London.  WC1H 0AP.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



[R] GAM using R tutorials?

2006-03-14 Thread Michael
Hi all,

I am trying to use GAM to work on some data... Are there any resources
providing a hands-on tutorial/guide on how to fit a GAM to data in R?
Specifically, I am not sure which model to choose, or which effective
degrees of freedom to use for the smooth terms...

I know there is a book titled: GAM: an introduction using R. Unfortunately our
local library does not have it... so that's not an option given the time
constraint.

Thanks a lot for your pointers!

Michael.



Re: [R] gam

2006-01-26 Thread Simon Wood
 I'm new to both R and to this list and would like to get
 advice on how to build generalized additive models in R.
 Based on the description of gam, which I found on the R
 website, I specified the following model:
 model1 <- gam(ST~s(MOWST1),family=binomial,data=strikes.S),
 in which ST is my binary response variable and MOWST1 is a
 categorical independent variable.

 I get the following error message:
 Error in smooth.construct.tp.smooth.spec(object, data,
 knots) :
  NA/NaN/Inf in foreign function call (arg 1)

- I guess this should maybe get trapped a bit earlier, so that you get
a more informative warning.

- The basic problem is that gams are based around sums of smooth functions
of covariates. For the notion of smooth to be meaningful the covariates
have to live in a space where you have at least a notion of distance
between the covariates, since in some loose sense `smooth' means that
f(x_1) must be close to f(x_2) if x_1 and x_2 are close. For factors you
don't generally have any notion of distance between the levels of a
factor. (e.g. if a factor has levels brick, sky and purple, how far
is it from brick to purple?)

- Even if a factor is naturally ordered (e.g. small, medium, large),
you would still have to decide on how to measure smoothness/wiggliness of
a function of the factor. For this reason, I think that it is actually
better to explicitly convert levels of an ordered factor into numeric
values on a scale that you think is appropriate, before using the ordered
factor as the covariate in a gam. In this way it's usually fairly easy to
get one of the mgcv built in smoother classes to use the notion of
smoothness that you think is appropriate: if not then it's not too hard to
add a smoother class, following the template provided in ?p.spline
(actually you could use this template to write a smoother class for
ordered categorical predictors).

best,
Simon
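
The suggestion above (put the levels of an ordered factor on a numeric scale of your own choosing, then smooth that) can be sketched roughly as follows. Everything here is an assumption made for illustration: the simulated data, the level names and the scores in `score` are invented and are not the original poster's variables.

```r
library(mgcv)

set.seed(1)   # simulated data, for illustration only
lev <- c("none", "low", "mild", "moderate", "high", "extreme")
grade <- factor(sample(lev, 400, replace = TRUE), levels = lev, ordered = TRUE)

# Hypothetical scores: the analyst decides how far apart the levels are
score <- c(none = 0, low = 1, mild = 2, moderate = 4, high = 7, extreme = 10)
grade_num <- unname(score[as.character(grade)])

y <- rbinom(400, 1, plogis(-1 + 0.3 * grade_num))

# Only 6 distinct covariate values, so keep the basis dimension k small
fit <- gam(y ~ s(grade_num, k = 4), family = binomial)
summary(fit)
```

The choice of scores is where the notion of distance between levels enters; a different spacing gives a different meaning to "smooth".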



Re: [R] gam

2006-01-20 Thread Henric Nilsson
I.Szentirmai said the following on 2006-01-19 19:43:

  Dear R users,
 
  I'm new to both R and to this list and would like to get
  advice on how to build generalized additive models in R.
  Based on the description of gam, which I found on the R

Which `gam'? Note that R ships with package `mgcv' which has a `gam' 
function, but also package `gam' on CRAN has a `gam' function. 
(Furthermore, several other packages exists with functions that I'd 
categorize as GAM-fitters, e.g. SemiPar, assist, gss, gamlss, ...)

  website, I specified the following model:
  model1 <- gam(ST~s(MOWST1),family=binomial,data=strikes.S),
  in which ST is my binary response variable and MOWST1 is a
  categorical independent variable.
 
  I get the following error message:
  Error in smooth.construct.tp.smooth.spec(object, data,

 From this error message, I can however deduce that we're talking about 
the `mgcv::gam' function.

  knots) :
   NA/NaN/Inf in foreign function call (arg 1)
  In addition: Warning messages:
  1: argument is not numeric or logical: returning NA in:
  mean.default(xx)
  2: - not meaningful for factors in: Ops.factor(xx,
  shift[i])
 
  I would greatly appreciate if someone could tell me what I
  did wrong. Can I use categorical independents in gam at
  all?

It's not clear to me what you mean by this. Yes, you can use factors in gam:

gam(ST ~ MOWST1, family = binomial, data = strikes.S)

would work. But you tried smoothing a factor, which isn't supported (and 
to me it doesn't make any sense doing so).

Smoothing an ordered factor may make sense, but this is not supported 
(and you didn't try it, according to the error message above) by `mgcv'. 
  I was under the impression that the `gam' function in package `gam' 
should be able to do this, but I just tried it and was rewarded by the 
error message

Error: 'codes' is defunct.

relating to the internals of `gam' using a defunct R function -- I've 
e-mailed Prof Hastie, maintainer of package `gam', about this.

Even if it worked, the `gam' package won't allow estimation of the 
degree of smoothness of the model terms as part of the fitting process. 
So if this is what you want in combination with ordered factors, you're 
probably out of luck. (You can always send Prof Wood, `mgcv' maintainer, 
a feature request.)


HTH,
Henric
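
A small runnable sketch of the working call above, with the factor entered as an ordinary parametric term. The names ST, MOWST1 and strikes.S come from the thread, but the data frame below is simulated and its three levels "a", "b", "c" are invented; the call is written as mgcv::gam to make explicit which of the two gam functions is meant.

```r
library(mgcv)

set.seed(11)
# Simulated stand-in for the poster's data (values are invented)
strikes.S <- data.frame(
  MOWST1 = factor(sample(c("a", "b", "c"), 200, replace = TRUE))
)
true_logit <- c(a = -1, b = 0, c = 1)[as.character(strikes.S$MOWST1)]
strikes.S$ST <- rbinom(200, 1, plogis(true_logit))

# Factor as a parametric term: one coefficient per non-reference level
fit <- mgcv::gam(ST ~ MOWST1, family = binomial, data = strikes.S)
coef(fit)
```

With no s() terms this is effectively a logistic regression with treatment contrasts, so the coefficients are log odds ratios relative to level "a".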



[R] gam

2006-01-19 Thread I.Szentirmai
Dear R users,

I'm new to both R and to this list and would like to get 
advice on how to build generalized additive models in R. 
Based on the description of gam, which I found on the R 
website, I specified the following model:
model1 <- gam(ST~s(MOWST1),family=binomial,data=strikes.S),
in which ST is my binary response variable and MOWST1 is a 
categorical independent variable.

I get the following error message:
Error in smooth.construct.tp.smooth.spec(object, data, 
knots) :
 NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning messages:
1: argument is not numeric or logical: returning NA in: 
mean.default(xx)
2: - not meaningful for factors in: Ops.factor(xx, 
shift[i])

I would greatly appreciate if someone could tell me what I 
did wrong. Can I use categorical independents in gam at 
all?

Many thanks,
Istvan



Re: [R] GAM and AIC: How can I do??? please

2005-10-25 Thread Simon Wood
Please send R and mgcv version numbers, as I can't replicate this
problem. best, Simon
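
For reference, one way to collect the version numbers asked for here, using only functions from base R and utils. This is a generic sketch, not output from the original poster's machine.

```r
# Version of R itself
r_version <- R.version.string

# Version of the installed mgcv package
mgcv_version <- as.character(packageVersion("mgcv"))

cat(r_version, "\n")
cat("mgcv", mgcv_version, "\n")

# sessionInfo() also lists the platform and every attached package
print(sessionInfo())
```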





Re: [R] GAM and AIC: How can I do??? please

2005-10-24 Thread Martin Maechler
 ronggui == ronggui  [EMAIL PROTECTED]
 on Mon, 24 Oct 2005 10:09:30 +0800 writes:

ronggui === 2005-10-24 09:55:32, you wrote: ===

  Hello, I'm a Korean researcher who have been started to
 learn the R package.
 
 I want to make gam model and AIC value of the model to
 compare several models.
 
 I did the GAM model, but there were error for AIC.
 SO, how can I do? pleas help me!!!
 
 
 I did like below;

 
 > a.fit <- gam(pi ~ s(t1r), family = gaussian(link=log))
 > summary(a.fit)

Family: gaussian
Link function: log

Formula:
pi ~ s(t1r)

Parametric coefficients:
           Estimate  std. err.  t ratio    Pr(>|t|)
constant   0.093105   0.005238    17.77  < 2.22e-16

Approximate significance of smooth terms:
          edf   chi.sq     p-value
s(t1r)  1.833   24.153  0.00014213

R-sq.(adj) =  0.435   Deviance explained = 47.1%
GCV score = 0.0010938   Scale est. = 0.00099053  n = 30

ronggui are you using the mgcv package?  if you are,just
ronggui use a.fit$aic to get the aic.

hmm, yes, and no:

It's true what you say,  
BUT is not at all recommended in general:

You should use the generic AIC() function
rather than extracting components yourself.

This is a general principle:  If possible use  'extractor functions'
to work on objects rather than relying on internal
representations.

This is particularly relevant for fitted models:

Do use  residuals(.), fitted(.), logLik(.), AIC(.), vcov(.) 
etc etc!


Now back to this problem:

 AIC(a.fit) 
 Error in logLik(object) : no applicable method for logLik

I can't reproduce this; Eun definitely needs to give more
details, since the following works fine:

> library(mgcv)
> x <- 1:50
> set.seed(1)
> y <- 2^(sin(x/10) + rnorm(50))
> a.fit <- gam(y ~ s(x), family = gaussian(link=log))
> summary(a.fit)

Family: gaussian 
Link function: log 

Formula:
y ~ s(x)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   0.3171     0.1251   2.535   0.0147 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
       edf Est.rank    F p-value   
s(x) 2.858    9.000 3.07 0.00576 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =    0.4   Deviance explained = 43.5%
GCV score = 0.94391   Scale est. = 0.87107   n = 50

> AIC(a.fit)
[1] 140.6937
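
Since the original question was about comparing several models, here is a hedged sketch of doing that through the extractor functions recommended above. It regenerates the simulated data from this message; the second model (`fit_linear`) is an invented alternative added only to have something to compare against. The numbers printed will depend on the installed mgcv version.

```r
library(mgcv)

set.seed(1)
x <- 1:50
y <- 2^(sin(x / 10) + rnorm(50))   # same recipe as the example above

fit_smooth <- gam(y ~ s(x), family = gaussian(link = log))
fit_linear <- gam(y ~ x,    family = gaussian(link = log))   # hypothetical competitor

# Extractor functions instead of reaching into the fitted object
ll <- logLik(fit_smooth)
aic_table <- AIC(fit_smooth, fit_linear)   # one row per model: df and AIC

print(ll)
print(aic_table)
```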




[R] GAM and AIC: How can I do??? please

2005-10-23 Thread Eun A Kim

   Hello, I'm a Korean researcher who has just started to learn the R
   package.

   I want to fit a GAM and get the AIC value of the model so that I can
   compare several models.

   I fitted the GAM, but there was an error from AIC.

   So, what can I do? Please help me!

   This is what I did:


> a.fit <- gam(pi ~ s(t1r), family = gaussian(link=log))
> summary(a.fit)

Family: gaussian
Link function: log

Formula:
pi ~ s(t1r)

Parametric coefficients:
           Estimate  std. err.  t ratio    Pr(>|t|)
constant   0.093105   0.005238    17.77  < 2.22e-16

Approximate significance of smooth terms:
          edf   chi.sq     p-value
s(t1r)  1.833   24.153  0.00014213

R-sq.(adj) =  0.435   Deviance explained = 47.1%
GCV score = 0.0010938   Scale est. = 0.00099053  n = 30

> AIC(a.fit)
Error in logLik(object) : no applicable method for logLik


   Eun A Kim, MD, MPH, Ph.D
   Senior Researcher
   Occupational safety and Health Research Institute
   Korea Occupational Safety and Health Agency
   TEL : +82-32-510-0910, FAX: +82-32-518-0862
   Address:  34-4 Gusan-dong, Bupyung-Gu, Incheon city, 430-711, Republic
   of Korea
   Home Fax +82-(303)3111-0573


Re: [R] GAM and AIC: How can I do??? please

2005-10-23 Thread ronggui



=== 2005-10-24 09:55:32, you wrote: ===


   Hello,  I'm a Korean researcher who have been started to learn the R
   package.

   I want to make gam model and AIC value of the model to compare several
   models.

   I did the GAM model, but there were error for AIC.

   SO, how can I do? pleas help me!!!



   I did like below;


> a.fit <- gam(pi ~ s(t1r), family = gaussian(link=log))
> summary(a.fit)

Family: gaussian
Link function: log

Formula:
pi ~ s(t1r)

Parametric coefficients:
           Estimate  std. err.  t ratio    Pr(>|t|)
constant   0.093105   0.005238    17.77  < 2.22e-16

Approximate significance of smooth terms:
          edf   chi.sq     p-value
s(t1r)  1.833   24.153  0.00014213

R-sq.(adj) =  0.435   Deviance explained = 47.1%
GCV score = 0.0010938   Scale est. = 0.00099053  n = 30

Are you using the mgcv package?
If you are, just use a.fit$aic to get the AIC.

AIC(a.fit)
   Error in logLik(object) : no applicable method for logLik



= = = = = = = = = = = = = = = = = = = =



 

2005-10-24

--
Department of Sociology
Fudan University

My new mail address is [EMAIL PROTECTED]
Blog: http://sociology.yculblog.com


[R] GAM weights

2005-07-27 Thread daniel . pastor
Dear all,
we are trying to model data on rare plants, so we always have fewer than 50
1x1 km presences, while the total area is about 550,000 square km. This is a
real problem when we fit a GAM using only as many absences as presences.
We have thought of using a larger number of absences, but in that case we
should downweight them.
Does anybody know how to use the "weights" argument?
Thank you in advance,
daniel
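
A rough sketch of how the weights argument of mgcv's gam could be used for this. The data are simulated and the variable names (elev, pres, w) are invented; the weighting scheme shown, where the absences together carry the same total weight as the presences, is only one possible choice and not a recommendation from the list.

```r
library(mgcv)

set.seed(7)                                   # simulated data, for illustration only
n_pres <- 40
n_abs  <- 2000
elev <- c(rnorm(n_pres, mean = 1200, sd = 150),   # presences cluster at mid elevation
          runif(n_abs, min = 0, max = 2500))      # absences cover the whole range
pres <- c(rep(1, n_pres), rep(0, n_abs))

# Each presence gets weight 1; the absences share a total weight of n_pres
w <- ifelse(pres == 1, 1, n_pres / n_abs)

fit_w <- gam(pres ~ s(elev, k = 5), family = binomial, weights = w)
summary(fit_w)
```

Downweighting the absences changes the apparent prevalence, so fitted probabilities from such a model are better read as relative suitability than as absolute occurrence probabilities.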



[R] GAM/MGCV: UBRE Score

2005-05-05 Thread Jean G Orelien
In comparing 2 GAM models are there some guidelines to determine which model
might be a better fit by comparing the 2 UBRE scores?  I assume that a
larger value is indicative of a better fit? What magnitude of difference is
a significant one?  I imagine that one could also use the percent of
explained deviance.  Is that a reliable statistic?  Are there any other
statistics that one would use for model selection?

 

Jean G. Orelien
Senior Biostatistician
SciMetrika, LLC
2 Davis Drive
RTP, NC 27709
Tel: (919)765-0017 (1210)
Fax: (919)990-8561
Email: [EMAIL PROTECTED]
Website: http://www.scimetrika.com

 




[R] gam in library(gam)

2005-03-25 Thread Kerry Bush
I know there are two versions of gam in R. One is in
library(mgcv) and one is in library(gam). The one in
mgcv can automatically calculate the smoothing
parameter. However, the one in gam can't although it
can incorporate a larger variety of smoothers (besides
spline). Can anybody educate me if there is a way to
do smoothing parameter selection in gam from
library(gam)? I know I can always program
cross-validation myself. But it would be more
user-friendly if the software could take this into
account automatically (like gam in library(mgcv)).



[R] gam(mgcv) starting values

2005-02-14 Thread Björn Stollenwerk
Hi all!
I've got some problems with the function gam (library mgcv). For some 
models I get the error message:

Error: no valid set of coefficients has been found:please supply 
starting values
In addition: Warning message:
NaNs produced in: log(x)

This is a shortened version of the code I used:
gam(y ~ M1 + M3 + M4 + M5 + M6 + sex + M1*M3 + s(age),
    family = Gamma(link = "identity"),
    weights = days)
If I add an additional variable, say M7, the error message 
occurs. If I add M7 in combination with, for example, M8, it works.

Does somebody know how to supply starting values, or how to handle this 
problem?

I didn't succeed by adding
control = gam.control(spIterType = "outer"),
or by
sp = 137722.1
Thank you very much,
Björn


Re: [R] gam(mgcv) starting values

2005-02-14 Thread Simon Wood
My guess is that your model is predicting a negative mean for some of your 
data. Since this is not possible for a Gamma r.v. the deviance calculation 
returns something non finite, which triggers the error message. This is 
possible because you have used an identity link. Is it not possible to use 
a log link?

If you have to use an identity link then I'd first check that 

y ~ M1 + M3 + M4 + M5 + M6 + M7+ sex + M1*M3 + age

works. If it does, then you could try starting with a very large min.sp 
argument when fitting the model with s(age), and slowly reducing it until 
the estimated smoothing parameter is non-zero --- if this works then 
you've succeeded in finding the best fit model without any E(y) becoming 
negative in the process, but if it doesn't it probably means either that 
the model structure is wrong, or some E(y) really is very close to zero.

I doubt that altering starting values is likely to help here (the starting 
values won't make any E(y)<=0, after all).
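
The first suggestion above (a log link in place of the identity link) can be sketched as below. The data are simulated and the model is cut down to one factor and one smooth, so every name other than gam, Gamma, s, age, M1 and days is an assumption made for illustration; this is not the poster's model.

```r
# Sketch only: simulated Gamma data with a smooth age effect.
library(mgcv)

set.seed(2)
n <- 200
dat <- data.frame(
  age  = runif(n, 20, 80),
  M1   = rbinom(n, 1, 0.5),
  days = sample(30:365, n, replace = TRUE)
)
mu <- exp(0.5 + 0.3 * dat$M1 + sin(dat$age / 15))
dat$y <- rgamma(n, shape = 5, rate = 5 / mu)

# With a log link the fitted mean is exp(linear predictor),
# so it cannot go negative during fitting.
fit_log <- gam(y ~ M1 + s(age),
               family = Gamma(link = "log"),
               weights = days,
               data = dat)

# Check that every fitted mean is strictly positive
all(fitted(fit_log) > 0)
```

With the identity link the same check can fail part-way through the iterations, which is the situation the reply describes.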

best,
Simon

_
 Simon Wood [EMAIL PROTECTED]www.stats.gla.ac.uk/~simon/
  Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
   Direct telephone: (0)141 330 4530  Fax: (0)141 330 4814


 



Re: [R] GAM: Remedial measures

2005-01-14 Thread Simon Wood
 I fitted a GAM model with Poisson distribution to a data with about 200
 observations.  I noticed that the plot of the residuals versus fitted values
 show a trend.  Residuals tend to be lower for higher fitted values. Because,
 I'm dealing with count data, I'm thinking that this might be due to
 overdispersion.  Is there a way to account for overdispersion in any of the
 packages MGCV or GAM?  

You can `allow for' overdispersion in mgcv::gam by using the quasipoisson 
family, or setting scale to -1 in the gam call. In a straight GLM this 
would make no difference to the residual plots, since the scale parameter 
does not change the coefficient estimates. However, things are different 
for a GAM with automatic smoothness estimation, since the scale parameter 
does influence the smoothing parameter estimation criterion. Another 
possibility is to use the negative binomial family from the MASS library, 
and a third is to use the quasi family.
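
The options listed above might look like this in code. It is a sketch on simulated overdispersed counts; dat, x and y are invented names. For the negative binomial fit it uses the nb() family that ships with recent mgcv versions rather than the MASS family mentioned in the reply, which is an assumption about the installed version.

```r
# Sketch only: simulated overdispersed counts.
library(mgcv)

set.seed(3)
n <- 200
x <- runif(n)
mu <- exp(1 + 2 * sin(2 * pi * x))
dat <- data.frame(x = x, y = rnbinom(n, mu = mu, size = 2))

# 1. quasipoisson family: scale parameter estimated from the data
fit_quasi <- gam(y ~ s(x), family = quasipoisson, data = dat)

# 2. Poisson family with a negative scale, i.e. scale treated as unknown
fit_scale <- gam(y ~ s(x), family = poisson, scale = -1, data = dat)

# 3. negative binomial with estimated theta (mgcv's nb(); assumed available)
fit_nb <- gam(y ~ s(x), family = nb(), method = "REML", data = dat)

# Estimated scale parameters; values well above 1 indicate overdispersion
fit_quasi$sig2
fit_scale$sig2
```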

Simon
_
 Simon Wood [EMAIL PROTECTED]www.stats.gla.ac.uk/~simon/
  Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
   Direct telephone: (0)141 330 4530  Fax: (0)141 330 4814



[R] GAM: Remedial measures

2005-01-13 Thread Jean G. Orelien
I fitted a GAM model with Poisson distribution to a data set with about 200
observations.  I noticed that the plot of the residuals versus fitted values
shows a trend.  Residuals tend to be lower for higher fitted values. Because
I'm dealing with count data, I'm thinking that this might be due to
overdispersion.  Is there a way to account for overdispersion in either of the
packages MGCV or GAM?  

 

I welcome any suggestions that one may have on this topic.

 

Jean

 


Re: [R] GAM: Remedial measures

2005-01-13 Thread Tim F Liao
Jean,

The standard treatment of overdispersed data when using the
Poisson distribution to model count data is to switch to the
negative binomial distribution.  Hope this helps,

Tim Liao



Re: [R] GAM: Getting standard errors from the parametric terms in a GAM model

2004-12-22 Thread Simon Wood
summary.gam and anova.gam in package mgcv will report standard errors and 
p-values for parametric terms, as well as smooth terms, for a gam fitted 
by function gam from package mgcv. 
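
A minimal sketch of what this looks like, assuming a model with one parametric term z and one smooth s(x) fitted to simulated data (all variable names here are invented for illustration):

```r
# Sketch only: simulated semi-parametric model.
library(mgcv)

set.seed(4)
n <- 150
dat <- data.frame(x = runif(n), z = rnorm(n))
dat$y <- 1 + 0.5 * dat$z + sin(2 * pi * dat$x) + rnorm(n, sd = 0.3)

fit <- gam(y ~ z + s(x), data = dat)
sm <- summary(fit)

# Parametric terms: estimate, standard error, test statistic, p-value
sm$p.table

# Smooth terms: effective degrees of freedom and approximate p-value
sm$s.table
```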

Simon

 I am new to R.  I'm using the function GAM and wanted to get standard errors
 and p-values for the parametric terms (I fitted a semi-parametric models).
 Using the function anova() on the object from GAM, I only get  p-values for
 the nonparametric terms.
 
  
 
 Does anyone know if and how to get standard errors for the parametric terms?
 
  
 
 Thanks.
 
  
 
 Jean G. Orelien
 
 
 
  
 




Re: [R] GAM: Overfitting

2004-12-22 Thread Simon Wood
 I am analyzing particulate matter data (PM10) on a small data set (147
 observations).  I fitted a semi-parametric model and am worried about
 overfitting.  How can one check for model fit in GAM?

- Keeping a random subset of the data as a validation set, fitting 
to the remaining data and then comparing the R^2 / proportion deviance explained 
on the fit set and the validation set is usually quite diagnostic. If the fit data 
are much better predicted than the validation data, then you probably have 
over-fitting. 

- If your response is treated as Poisson then scale parameter estimates 
below 1 are also diagnostic, but only if you are not expecting overdispersion, 
of course. 

- If you use gam from package mgcv then, by default, model 
effective degrees of freedom are estimated from your data by GCV or an 
approximation to AIC. mgcv::gam allows you to increase the penalty on each 
model degree of freedom in these criteria, via gam argument `gamma'. Some 
work by Kim and Gu (2004, J.Roy.Statist.Soc.B) suggests that gamma around 
1.4 can be a sensible choice for suppressing overfitting, without 
much of a degradation in MSE performance.
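
The first and third points above can be combined in a short sketch. The data are simulated (147 rows, to match the size mentioned in the question), the 100/47 split is arbitrary, and the helper r2 is a hypothetical function written for this example, not part of mgcv.

```r
# Sketch only: hold-out check plus a heavier penalty via gamma = 1.4.
library(mgcv)

set.seed(5)
n <- 147
dat <- data.frame(x = runif(n))
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.4)

# Random split into a fit set and a validation set
idx <- sample(n, 100)
train <- dat[idx, ]
valid <- dat[-idx, ]

# gamma > 1 charges more for each effective degree of freedom
fit <- gam(y ~ s(x), data = train, gamma = 1.4)

# Hypothetical helper: proportion of variance explained
r2 <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

r2_train <- r2(train$y, as.numeric(fitted(fit)))
r2_valid <- r2(valid$y, as.numeric(predict(fit, newdata = valid)))
c(train = r2_train, validation = r2_valid)
```

A validation value far below the training value would point to overfitting.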
 

best,
Simon



Re: [R] GAM: Getting standard errors from the parametric terms in a GAM model

2004-12-22 Thread Thomas Lumley
On Tue, 21 Dec 2004, Jean G. Orelien wrote:
I am new to R.  I'm using the function GAM and wanted to get standard errors
and p-values for the parametric terms (I fitted a semi-parametric models).
Using the function anova() on the object from GAM, I only get  p-values for
the nonparametric terms.

Does anyone know if and how to get standard errors for the parametric terms?

If you mean gam() in the gam package then, yes, someone does but it hasn't 
been included in the package yet.  It is described in the current issue of 
JASA.  Code for S-PLUS is supposed to be at
  http://www.ihapss.jsph.edu/software/
but that is currently not working.

-thomas


[R] GAM: Getting standard errors from the parametric terms in a GAM model

2004-12-21 Thread Jean G. Orelien
I am new to R.  I'm using the function GAM and wanted to get standard errors
and p-values for the parametric terms (I fitted a semi-parametric model).
Using the function anova() on the object from GAM, I only get p-values for
the nonparametric terms.

 

Does anyone know if and how to get standard errors for the parametric terms?

 

Thanks.

 

Jean G. Orelien



 


[R] GAM: Overfitting

2004-12-21 Thread Jean G. Orelien
I am analyzing particulate matter data (PM10) on a small data set (147
observations).  I fitted a semi-parametric model and am worried about
overfitting.  How can one check for model fit in GAM?

 

Jean G. Orelien



 


Re: [R] GAM: Overfitting

2004-12-21 Thread Frank E Harrell Jr
Jean G. Orelien wrote:
I am analyzing particulate matter data (PM10) on a small data set (147
observations).  I fitted a semi-parametric model and am worried about
overfitting.  How can one check for model fit in GAM?
 

Jean G. Orelien

It's good to separate 'model fit' (or lack of fit) from 'overfitting'. 
Overfitting can cause the model fit to appear to be excellent, but there 
is still a huge problem.

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University


Re: [R] Gam() function in R

2004-12-06 Thread Jari Oksanen
On 6 Dec 2004, at 7:36, Janice Tse wrote:
 Thanks for the email. I will check that out.
 However, when I was doing this: gam(y~s(x1)+s(x2,3), family=gaussian,
 data=mydata), it gives me the error:

 Error in terms.formula(formula, data = data) :
 invalid model formula in ExtractVars
 What does it mean?

When Andy Liaw answered you (below), he asked you to specify which kind  
of 'gam' you used: the one in the standard package 'mgcv' or the one in  
package 'gam'. We need to know this to tell what your error message  
means. If you used mgcv::gam, it means that you didn't read its help  
pages, which say that you should specify your model as:

gam(y ~ s(x1) + s(x2, k=3))

Further, it may be useful to read the help pages to understand what it  
means to specify k=3 and how it may influence your model. Simon Wood --  
the mgcv author -- also has a very useful article in the R Newsletter:  
see the CRAN archive. It may be really difficult to understand what you  
do when you use mgcv::gam unless you read this paper (it is possible,  
but hard). Simon's article specifically answers your first question  
about deciding the smoothness, and explains how elegantly this is done in  
mgcv::gam (gam::gam has another set of tools and philosophy).

If you happened to use gam::gam, then you have to look for another  
explanation.
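
For the mgcv case, the corrected call can be tried on simulated data as in the sketch below. The data frame mydata and its columns are invented to match the names in the post; only the formula itself comes from the reply above.

```r
# Sketch only: the corrected mgcv formula on simulated data.
library(mgcv)

set.seed(6)
n <- 200
mydata <- data.frame(x1 = runif(n), x2 = runif(n))
mydata$y <- sin(2 * pi * mydata$x1) + mydata$x2^2 + rnorm(n, sd = 0.3)

# k must be given by name; a bare s(x2, 3) is read as a smooth
# of two variables, one of which is the constant 3.
fit <- gam(y ~ s(x1) + s(x2, k = 3), family = gaussian, data = mydata)

# Effective degrees of freedom per smooth; k = 3 allows at most 2 for s(x2)
summary(fit)$edf
```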

cheers, jari oksanen

--
Jari Oksanen, Oulu, Finland


Re: [R] Gam() function in R

2004-12-06 Thread Yves Magliulo
Hi all,

This subject is very interesting for me. I'm using mgcv 0.8-9 with R
version 1.7.1. I didn't know that there was another gam version in
package library(gam). Can someone tell me the basic differences between
them? I looked for a help page on Google but I only found mgcv help
pages.

Thanks!

Yves Magliulo, Paris.




Re: [R] Gam() function in R

2004-12-06 Thread Simon Wood
 I'm   a new user of R gam() function. I am wondering how do we decide on the
 smooth function to use?
 The general form is gam(y~s(x1,df=i)+s(x2,df=j)...)  , how do we decide
 on the degree freedom to use for each smoother, and if we shold apply
 smoother to each attribute?

I guess you are using gam() from package gam, in which case you probably 
need to look at the help file for step.gam. 

By default gam() in package mgcv estimates the appropriate degrees of 
freedom automatically as part of model estimation using generalized cross 
validation, (although there is an adjustable  upper limit on the range 
of degrees of freedom considered).
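
As a rough illustration of the automatic choice and its upper limit, the sketch below fits the same simulated data with the default basis dimension and with a larger one. The names dat, x and y are assumptions, and k is the s() argument that sets the limit.

```r
# Sketch only: edf is estimated, k only caps it.
library(mgcv)

set.seed(7)
n <- 300
dat <- data.frame(x = runif(n))
dat$y <- sin(4 * pi * dat$x) + rnorm(n, sd = 0.3)

fit_k10 <- gam(y ~ s(x), data = dat)          # default k = 10, at most 9 edf
fit_k20 <- gam(y ~ s(x, k = 20), data = dat)  # at most 19 edf

# Estimated effective degrees of freedom for the smooth in each fit
summary(fit_k10)$edf
summary(fit_k20)$edf
```

If the estimated edf sits close to k - 1, the limit may be too low and is worth raising.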

Package gss also has routines for fitting GAMs where the choice of df is 
fully automatic.

best,
Simon

_
 Simon Wood [EMAIL PROTECTED]www.stats.gla.ac.uk/~simon/
  Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
   Direct telephone: (0)141 330 4530  Fax: (0)141 330 4814



RE: [R] Gam() function in R

2004-12-06 Thread Janice Tse
Thank you very much. I am using gam() from mgcv actually. You answered my
question about degree of freedom.

One more question, if I were to compare the results from gam() and glm(),
which numbers are of the greatest interest?  
What if my response variables are binary?

Thanks!
-Janice



Re: [R] Gam() function in R

2004-12-06 Thread Henric Nilsson
At 10:48 2004-12-06 +0100, Yves Magliulo wrote:
 this subject is very intersting for me. I'm using mgcv 0.8-9 with R
 version 1.7.1.

You're in need of an update.

 i didn't know that there was an another gam version with
 package library(gam).

This is the 'classic' GAM implementation by Hastie & Tibshirani, discussed 
at length in Hastie & Tibshirani (1990) and in the White book.

In fact, other implementations of the GAM concept also exist. Take a look 
at the gss and assist packages; both are at CRAN, and the former is support 
software for Gu's `Smoothing Spline ANOVA Models' book. There's also the 
vgam (http://www.stat.auckland.ac.nz/~yee/VGAM/) and SemiPar 
(http://web.maths.unsw.edu.au/~wand/webspr/rsplus.html) packages; the latter is 
support software for the `Semiparametric Regression' book by Ruppert, Wand 
and Carroll. And there are probably more out there...

 Someone can tell me the basics differences between
 them? I look for an help page on google but i only find mgcv help
 pages.

Simon Wood (author of the mgcv package) has written a brief but useful 
summary: http://www.stats.gla.ac.uk/~simon/simon/mgcv_overview.html

HTH,
Henric


Re: [R] Gam() function in R

2004-12-06 Thread Simon Wood
 this subject is very intersting for me. I'm using mgcv 0.8-9 with R
 version 1.7.1. i didn't know that there was an another gam version with
 package library(gam). Someone can tell me the basics differences between
 them? I look for an help page on google but i only find mgcv help
 pages.

- I think you'd need to move to a newer version of R in order to use 
package gam, but that would also let you use a much more recent version of 
package mgcv. 

- package gam is based very closely on the GAM approach presented in 
Hastie and Tibshirani's  Generalized Additive Models book. Estimation is 
by back-fitting and model selection is based on step-wise regression 
methods based on approximate distributional results. A particular strength 
of this approach is that local regression smoothers (`lo()' terms) can be 
included in GAM models.

- gam in package mgcv represents GAMs using penalized regression splines. 
Estimation is by direct penalized likelihood maximization with 
integrated smoothness estimation via GCV or related criteria (there is 
also an alternative `gamm' function based on a mixed model approach). 
Strengths of this approach are that s() terms can be functions of more 
than one variable and that tensor product smooths are available via te() 
terms - these are useful when different degrees of smoothness are 
appropriate relative to different arguments of a smooth. 

Here's an attempt at a summary of the differences:

Estimation: gam::gam based on backfitting, mgcv::gam based on direct 
penalized likelihood maximization (with smoothness estimation integrated)

Model selection: package(gam) based on stepwise regression methods. 
mgcv::gam based on integrated GCV estimation of degree of smoothness.

Smooth terms: gam::gam can represent smooth terms using a very wide range 
of scatterplot smoothers including loess, which is built in. mgcv::gam is 
restricted to smoothers that can be represented using basis functions and 
an associated ``wiggliness'' penalty, but these include low rank thin 
plate spline smoothers and tensor product smoothers for smooths of more 
than one variable. Both packages provide interfaces for adding new classes 
of smoother. 

Uncertainty estimation: since mgcv GAMs explicitly estimate 
coefficients for each smooth term, it is fairly straightforward to obtain 
a covariance matrix for the model coefficients, which makes further 
variance calculations easy. For example, predictions with standard errors 
are easily obtained for predictions made with new prediction data. The 
backfitting approach makes variance calculation more difficult (e.g. at 
present s.e.s are not available from gam::predict.gam with new data)

Interface: both packages are based on Trevor Hastie's Chapter 7 of 
Chambers and Hastie. Since Trevor H. wrote package(gam) it's a closer 
implementation than package(mgcv). 

Basically, if you want integrated smoothness selection, an underlying 
parametric representation, or want smooth interactions in your models 
then mgcv is probably worth a try (but I would say that). If you want to 
use local regression smoothers and/or prefer the stepwise selection 
approach then package gam is for you. 
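
Two of the mgcv features named above, tensor product smooths and standard errors for predictions at new data, can be sketched as follows. The data are simulated, with x and z deliberately on different scales, and all variable names are assumptions.

```r
# Sketch only: te() smooth and prediction with standard errors.
library(mgcv)

set.seed(8)
n <- 300
dat <- data.frame(x = runif(n), z = runif(n, 0, 100))
dat$y <- sin(2 * pi * dat$x) * dat$z / 100 + rnorm(n, sd = 0.2)

# te() allows a different degree of smoothness in each direction
fit <- gam(y ~ te(x, z), data = dat)

# Predictions with standard errors at new covariate values
newd <- data.frame(x = c(0.25, 0.75), z = c(20, 80))
pr <- predict(fit, newdata = newd, se.fit = TRUE)
pr$fit
pr$se.fit

# Covariance matrix of the model coefficients
V <- vcov(fit)
dim(V)
```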

Simon

_
 Simon Wood [EMAIL PROTECTED]www.stats.gla.ac.uk/~simon/
  Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
   Direct telephone: (0)141 330 4530  Fax: (0)141 330 4814



Re: [R] Gam() function in R

2004-12-06 Thread Yves Magliulo
So the mgcv package is the one I need! Indeed, I want integrated smoothness
selection and smooth interactions rather than stepwise selection. I have
a lot of predictors, and I use gam to select those that are effective
and exclude the others (using p-values).

Thanks a lot for this valuable information.




Re: [R] Gam() function in R

2004-12-06 Thread Frank E Harrell Jr
Yves Magliulo wrote:
> so mgcv package is the one i need! indeed, i want integrated smoothness
> selection and smooth interactions rather than stepwise selection. i have
> a lot of predictors, and i use gam to select those that are efficient
> and exclude others. (using p-value)

It is interesting that you use P-values but do not care that the 
strategy you use (variable selection as opposed to pre-specifying models 
or just using shrinkage) does not preserve type I error or confidence 
interval coverage probabilities in subsequent analyses with mgcv.
--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



[R] Gam() function in R

2004-12-05 Thread Janice Tse
Hi all,

I'm a new user of the R gam() function. I am wondering how we decide on the
smooth function to use.
The general form is gam(y~s(x1,df=i)+s(x2,df=j)...); how do we decide
on the degrees of freedom to use for each smoother, and whether we should
apply a smoother to each attribute?

Thanks!!



RE: [R] Gam() function in R

2004-12-05 Thread Liaw, Andy
Unfortunately that's not really an R question.  I recommend that you read up
on the statistical methods underneath.  One that I'd wholeheartedly
recommend is Prof. Harrell's `Regression Modeling Strategies'.

[BTW, there are now two implementations of gam() in R: one in `mgcv', which
is fairly different from that in  `gam'.  I'm guessing you're referring to
the one in `gam', but please remember to state which contributed package
you're using, along with version of R and OS.]

Cheers,
Andy

 From: Janice Tse
 
 Hi all,
 
 I'm a new user of the R gam() function. I am wondering how we decide on the
 smooth function to use.
 The general form is gam(y~s(x1,df=i)+s(x2,df=j)...); how do we decide
 on the degrees of freedom to use for each smoother, and whether we should
 apply a smoother to each attribute?
 
 Thanks!!
 
 




RE: [R] Gam() function in R

2004-12-05 Thread Janice Tse
Thanks for the email. I will check that out

However, when I ran this: gam(y~s(x1)+s(x2,3), family=gaussian,
data=mydata), it gave me the error:

Error in terms.formula(formula, data = data) : 
invalid model formula in ExtractVars

What does it mean ?

Thanks
-Janice 

-Original Message-
From: Liaw, Andy [mailto:[EMAIL PROTECTED] 
Sent: Sunday, December 05, 2004 11:34 PM
To: 'Janice Tse'; [EMAIL PROTECTED]
Subject: RE: [R] Gam() function in R

Unfortunately that's not really an R question.  I recommend that you read up
on the statistical methods underneath.  One that I'd wholeheartedly
recommend is Prof. Harrell's `Regression Modeling Strategies'.

[BTW, there are now two implementations of gam() in R: one in `mgcv', which
is fairly different from that in  `gam'.  I'm guessing you're referring to
the one in `gam', but please remember to state which contributed package
you're using, along with version of R and OS.]

Cheers,
Andy

 From: Janice Tse
 
 Hi all,
 
 I'm a new user of the R gam() function. I am wondering how we decide on the
 smooth function to use.
 The general form is gam(y~s(x1,df=i)+s(x2,df=j)...); how do we decide
 on the degrees of freedom to use for each smoother, and whether we should
 apply a smoother to each attribute?
 
 Thanks!!
 
 
 





[R] gam plots

2004-08-25 Thread Paul von Hippel
When smooths fitted by the gam package are plotted, what are the units of 
the vertical axis? Is there a simple way to change these units to units of 
the dependent variable?

Thanks for any suggestions!
Paul von Hippel
Paul von Hippel
Department of Sociology / Initiative in Population Research
Ohio State University
300 Bricker Hall
190 N. Oval Mall
Columbus OH 43210
614 688-3768
Office hours M-Th 3-5pm


[R] gam

2004-06-16 Thread Yves Magliulo
hi,

i'm working with the mgcv package and especially gam. My example is:

test <- gam(B ~ s(pred1) + s(pred2))
plot(test, pages=1)

when plotting test, you can view pred1 vs s(pred1, edf[1]) and pred2 vs
s(pred2, edf[2]).

I would like to know if there is a way to access those terms
(s(pred1) and s(pred2)). Does someone know how?

the purpose is to access the equations of the smooth terms in order to have
the equation of my additive model.

best regards,

-- 
Yves Magliulo, Climatology research departement [EMAIL PROTECTED]
Climpact



Re: [R] gam

2004-06-16 Thread Simon Wood
 i'm working with the mgcv package and especially gam. My example is:

 test <- gam(B ~ s(pred1) + s(pred2))
 plot(test, pages=1)

 when plotting test, you can view pred1 vs s(pred1, edf[1]) and pred2 vs
 s(pred2, edf[2]).

 I would like to know if there is a way to access those terms
 (s(pred1) and s(pred2)). Does someone know how?

Depends a bit on what sort of access you want... You can use predict.gam
to obtain the estimated value of each smooth  term at any set of pred1 or
pred2 values you supply (along with standard errors).
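
A minimal sketch of the predict.gam route, assuming a current mgcv. The variable names B, pred1 and pred2 follow the question; the simulated data and the grid of values are invented for illustration.

```r
library(mgcv)

set.seed(3)
n   <- 200
dat <- data.frame(pred1 = runif(n), pred2 = runif(n))   # hypothetical data
dat$B <- sin(2 * pi * dat$pred1) + 2 * dat$pred2^2 + rnorm(n, sd = 0.3)

test <- gam(B ~ s(pred1) + s(pred2), data = dat)

# Values at which to evaluate the smooths: pred1 varies, pred2 is held fixed
grid <- data.frame(pred1 = seq(0, 1, length.out = 5), pred2 = 0.5)

# type = "terms" returns one column per model term, here one per smooth
tm <- predict(test, newdata = grid, type = "terms", se.fit = TRUE)

s_pred1    <- tm$fit[, "s(pred1)"]      # estimated s(pred1) on the grid
s_pred1_se <- tm$se.fit[, "s(pred1)"]   # its standard errors
print(cbind(pred1 = grid$pred1, s_pred1, s_pred1_se))
```

Each smooth is centred, so the intercept has to be added to the sum of the term columns to recover the full linear predictor.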

The underlying equations are somewhat unwieldy, but are given in
Wood, S.N. (2003) Thin plate regression splines. J.R.Statist.Soc.B
65(1):95-114 ... if you need to evaluate the smooth in another program or
something then you'd probably need to transform the t.p.r.s. parameters back
to thin plate spline parameters and use the t.p.s. basis.

- You can also change the smoothing basis to one which is easier to write
down - the cr basis (see ?s) parameterizes a 1-d cubic spline in terms of the
function values at the knots, for example.
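
The statement about the cr basis can be checked numerically. The sketch below assumes a current mgcv and builds the basis outside gam() with smoothCon(), so that no identifiability constraint is absorbed; the data and the choice k = 6 are arbitrary.

```r
library(mgcv)

set.seed(4)
dat <- data.frame(x = runif(100))   # hypothetical covariate

# Set up the cubic regression spline basis without absorbing constraints
sm <- smoothCon(s(x, bs = "cr", k = 6), data = dat)[[1]]

knots <- sm$xp   # knot locations used by the basis

# Evaluate the basis functions at the knots themselves
Xk <- Predict.matrix(sm, data.frame(x = knots))

# Each basis function is 1 at its own knot and 0 at the others, so a
# coefficient is the value of the spline at the corresponding knot
print(round(Xk, 6))
```

Inside a fitted gam the smooth is additionally subject to a sum-to-zero constraint, so the stored coefficients are a reparameterized version of these knot values.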

- Or you can add a smoothing basis of your own design and hence specify
the equations of the smooth yourself: ?p.spline gives an example.



 the purpose is to access the equations of the smooth terms in order to have
 the equation of my additive model.

best,
Simon

_
 Simon Wood [EMAIL PROTECTED]www.stats.gla.ac.uk/~simon/
  Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
   Direct telephone: (0)141 330 4530  Fax: (0)141 330 4814



Re: [R] GAM question

2004-06-04 Thread Simon Wood

 Warning in eval(expr, envir, enclos) : non-integer #successes in a
 binomial glm!

- one way of specifying a logistic regression model is to supply the
observed proportion of successes as the response variable (e.g. y) and the
binomial n as the weights. The warning is complaining that the implied
number of successes, y*n, is non-integer. Depending on exactly why you are
weighting, you might want to use the quasibinomial family in place of binomial.

 Error: cannot allocate vector of size 60865 Kb

The gam fit may get a bit memory intensive given the number of data you
have. ?gam gives various approaches for dealing with large datasets, but
you might want to change the smoothing basis to one that is
computationally cheaper  than the default.

e.g. replace s(x) terms by s(x,bs="cr").
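
Both suggestions can be combined in one call. This is only a sketch under assumptions: the data are simulated, the names PA, x and w stand in for the poster's variables, and a current mgcv is assumed.

```r
library(mgcv)

set.seed(5)
n   <- 2000
dat <- data.frame(x = runif(n),
                  w = runif(n, 0.18, 0.98))   # non-integer observation weights
dat$PA <- rbinom(n, 1, plogis(-1 + 3 * dat$x))

# quasibinomial accepts the non-integer weights without the binomial warning;
# bs = "cr" uses the cheaper cubic regression spline basis
fit <- gam(PA ~ s(x, bs = "cr"),
           family = quasibinomial, data = dat, weights = w)

print(summary(fit)$edf)
```

With quasibinomial the scale parameter is estimated rather than fixed at 1, which changes the standard errors relative to a binomial fit.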

Simon

On Thu, 3 Jun 2004, HILLARY  ROBISON wrote:

 I am trying to use R to do a weighted GAM with PA (presence/random) as the
 response variable (Y, which is a 0 or a 1) and ASPECT (values go from
 0-3340), DEM (from 1500-3300), HLI (from 0-5566), PLAN (from -3 to 3),
 PROF (from -3 to 3), SLOPE (from 100-500) and TRI (from 0-51) as
 predictor variables (Xs).  I need to weight each observation by its WO
 value (from 0.18 to 0.98).  I have specified the following models in R
 (see below), but I can't figure out what the R reported errors plainly
 mean. One of the errors seems to tell me my dataset is too big (it's
 109,729 rows by 16 columns) - is this possible?  Given what I am trying to
 accomplish (a weighted, logistic GAM with 7 variables), am I specifying my
 model correctly?  I would like to attach my dataset (it's 2,064 KB
 as a WinZip file), but I don't know if it'll go through to the list given
 the HTML attachment constraints of the list...  I even tried a weighted,
 logistic GLM with the seven variables to see if that would work and if so,
 perhaps it was a GAM problem.  I also tried a logistic, weighted GAM with
 one variable to see if that would work.  My next step while I wait to
 hear back from the list is to try a dummy dataset that is small to see if
 a weighted, logistic GAM with seven variables will work at all or if I am
 specifying the model correctly.  Would anyone be willing to have my dataset
 sent so they can check it out if that would help solve the issue?  Thank you!
 Hillary ([EMAIL PROTECTED])

  # trial, all, weighted
  topo8 <- gam(PA ~ s(SLOPE10) + s(ASPECT10) + s(GYADEMPLUS) + s(TRI) +
 s(HLI) + s(PLAN10) + s(PROF10), family=binomial, data=topox, weights = w0)
 Warning in eval(expr, envir, enclos) : non-integer #successes in a
 binomial glm!
 Error: cannot allocate vector of size 60865 Kb

  topo9 <- glm(PA ~ SLOPE10 + ASPECT10 + GYADEMPLUS + TRI + HLI + PLAN10 +
 PROF10, family=binomial, data=topox, weights = w0)
 Warning in eval(expr, envir, enclos) : non-integer #successes in a
 binomial glm!

  # trial, weighted, slope only
  topo10 <- gam(PA ~ s(SLOPE10), family=binomial, data=topox, weights =
 w0)
 Warning in eval(expr, envir, enclos) : non-integer #successes in a
 binomial glm!





[R] GAM question

2004-06-03 Thread HILLARY ROBISON
I am trying to use R to do a weighted GAM with PA (presence/random) as the
response variable (Y, which is a 0 or a 1) and ASPECT (values go from
0-3340), DEM (from 1500-3300), HLI (from 0-5566), PLAN (from -3 to 3),
PROF (from -3 to 3), SLOPE (from 100-500) and TRI (from 0-51) as
predictor variables (Xs).  I need to weight each observation by its WO
value (from 0.18 to 0.98).  I have specified the following models in R
(see below), but I can't figure out what the R reported errors plainly
mean. One of the errors seems to tell me my dataset is too big (it's
109,729 rows by 16 columns) - is this possible?  Given what I am trying to
accomplish (a weighted, logistic GAM with 7 variables), am I specifying my
model correctly?  I would like to attach my dataset (it's 2,064 KB
as a WinZip file), but I don't know if it'll go through to the list given
the HTML attachment constraints of the list...  I even tried a weighted,
logistic GLM with the seven variables to see if that would work and if so,
perhaps it was a GAM problem.  I also tried a logistic, weighted GAM with
one variable to see if that would work.  My next step while I wait to
hear back from the list is to try a dummy dataset that is small to see if
a weighted, logistic GAM with seven variables will work at all or if I am
specifying the model correctly.  Would anyone be willing to have my dataset
sent so they can check it out if that would help solve the issue?  Thank you!
Hillary ([EMAIL PROTECTED])

 # trial, all, weighted
 topo8 <- gam(PA ~ s(SLOPE10) + s(ASPECT10) + s(GYADEMPLUS) + s(TRI) +
s(HLI) + s(PLAN10) + s(PROF10), family=binomial, data=topox, weights = w0)
Warning in eval(expr, envir, enclos) : non-integer #successes in a
binomial glm!
Error: cannot allocate vector of size 60865 Kb

 topo9 <- glm(PA ~ SLOPE10 + ASPECT10 + GYADEMPLUS + TRI + HLI + PLAN10 +
PROF10, family=binomial, data=topox, weights = w0)
Warning in eval(expr, envir, enclos) : non-integer #successes in a
binomial glm!

 # trial, weighted, slope only
 topo10 <- gam(PA ~ s(SLOPE10), family=binomial, data=topox, weights =
w0)
Warning in eval(expr, envir, enclos) : non-integer #successes in a
binomial glm!



RE: [R] GAM with Locfit components

2004-04-06 Thread Simon Wood
From mgcv 1.0 you can add your own smoothers for use with gam, but only if
they can be represented using basis functions and a quadratic penalty: so
P-splines are OK, but loess is not, for example.
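
For illustration, current versions of mgcv ship a P-spline basis as bs = "ps", so no user-written smoother is needed to try one. The sketch below uses simulated data (dat is invented) and pulls out the quadratic penalty matrix that makes such a smoother admissible.

```r
library(mgcv)

set.seed(6)
dat <- data.frame(x = runif(200))   # hypothetical data
dat$y <- cos(3 * dat$x) + rnorm(200, sd = 0.2)

# P-spline smooth: B-spline basis plus a difference penalty on coefficients
fit_ps <- gam(y ~ s(x, bs = "ps", k = 12), data = dat)

# The penalty is quadratic in the coefficients b of the smooth:
# wiggliness = t(b) %*% S %*% b
sm <- fit_ps$smooth[[1]]
S  <- sm$S[[1]]
b  <- coef(fit_ps)[sm$first.para:sm$last.para]
wiggliness <- drop(t(b) %*% S %*% b)
print(wiggliness)
```

A loess smoother has no representation of this form, which is why it falls outside what gam() in mgcv can use.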

Simon

 Loader's book is referring to the gam() function in S-plus (or S from Bell
 Labs), not the one in mgcv.  They are very different things.  I don't know
 if it's possible to implement local regression type smoothers (or something
 other than splines) in gam() in mgcv, but even if it's possible, the one in
 locfit won't work without quite a bit of work, I'd imagine.

 Andy

  From: Vivian Viallon
 
  Hi,
  I'm trying to combine the Locfit Package with the Mgcv package (to use
  Generalized Additive Models with Locfit components).  I read the book
  written by Clive Loader  where it's said that, for the S
  language, you just
  have to load the locfit package using the command :
  Library(locfit, first=T)
  in order to use locfit components in an additive model.
  But I can't. I guess the C-command differs from the S-command.
  Thanks in  advance for your help.
  Regards,
  Vivian
 
 
 





[R] GAM with Locfit components

2004-04-05 Thread Vivian Viallon
Hi,
I’m trying to combine the Locfit Package with the Mgcv package (to use
Generalized Additive Models with Locfit components).  I read the book
written by Clive Loader  where it’s said that, for the S language, you just
have to “load” the locfit package using the command :
Library(locfit, first=”T”)
in order to use locfit components in an additive model.
But I can’t. I guess the C-command differs from the S-command.
Thanks in  advance for your help.
Regards,
Vivian



RE: [R] GAM with Locfit components

2004-04-05 Thread Liaw, Andy
Loader's book is referring to the gam() function in S-plus (or S from Bell
Labs), not the one in mgcv.  They are very different things.  I don't know
if it's possible to implement local regression type smoothers (or something
other than splines) in gam() in mgcv, but even if it's possible, the one in
locfit won't work without quite a bit of work, I'd imagine.

Andy

 From: Vivian Viallon
 
 Hi,
 I'm trying to combine the Locfit Package with the Mgcv package (to use
 Generalized Additive Models with Locfit components).  I read the book
 written by Clive Loader  where it's said that, for the S 
 language, you just
 have to load the locfit package using the command :
 Library(locfit, first=T)
 in order to use locfit components in an additive model.
 But I can't. I guess the C-command differs from the S-command.
 Thanks in  advance for your help.
 Regards,
 Vivian
 
 




Re: [R]gam and concurvity

2003-09-17 Thread Simon Wood
 in the paper Avoiding the effects of concurvity in GAM's .. of
 Figueiras et al. (2003) it is mentioned that in GLM collinearity is taken 
 into account in the calc of se but not in GAM (-> results in confidence 
 interval too narrow, p-value understated,  GAM S-Plus version). I haven't 
 found any references to GAM and concurvity or collinearity on the R page. 
 And I wonder if the R  version of Gam differ in this point.

- the penalized regression spline representation means that it's easy to
calculate the `correct' s.e.'s and this is what is done. The covariance
matrix used is based on a Bayesian model of smoothing, generalized from
Silverman (1985), JRSSB (and less closely, Wahba, 1983, JRSSB), so the
s.e.'s are generally a little larger than you'd get if you just pretended
that the GAM was an un-penalized GLM (this widening generally improves CI
performance). 
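
The two covariance matrices being contrasted here can be inspected directly in current versions of mgcv. The following is a sketch on simulated data (dat is invented); it is meant only to show where the matrices live, not to reproduce the 2003 implementation.

```r
library(mgcv)

set.seed(7)
dat <- data.frame(x = runif(200))   # hypothetical data
dat$y <- sin(2 * pi * dat$x) + rnorm(200, sd = 0.3)

fit <- gam(y ~ s(x), data = dat, method = "REML")

Vb <- vcov(fit)                # Bayesian posterior covariance (the default)
Vf <- vcov(fit, freq = TRUE)   # frequentist covariance of the estimators

# Standard errors of the coefficients under each view
se_bayes <- sqrt(diag(Vb))
se_freq  <- sqrt(diag(Vf))
print(cbind(se_bayes, se_freq))

# With REML or ML smoothness selection, a version corrected for smoothing
# parameter uncertainty can also be requested
Vc <- vcov(fit, unconditional = TRUE)
```

For a Gaussian model the Bayesian standard errors are never smaller than the frequentist ones, which matches the widening described above.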

As Thomas Lumley pointed out, the s.e.'s don't take into account smoothing
parameter estimation uncertainty. In simulation studies this
uncertainty seems to have very little effect on the realized coverage
probabilities of confidence intervals that are in some sense `whole
model' intervals, but the performance of CIs for component functions of
the GAM can be quite a long way from nominal. There's a simple
`not-very-computer-intensive' fix for this which removes the conditioning
on the smoothing parameters and greatly improves component-wise coverage
probabilities; implementation is on my `to-do' list (might wait to see
what the referees say though!)

Simon 

ps. mgcv 0.9 out now! (changes list linked to my www page)
_
 Simon Wood [EMAIL PROTECTED]www.stats.gla.ac.uk/~simon/
  Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
   Direct telephone: (0)141 330 4530  Fax: (0)141 330 4814



[R] gam and concurvity

2003-09-16 Thread Martin Wegmann
Hello, 

in the paper "Avoiding the effects of concurvity in GAM's .." of Figueiras et 
al. (2003) it is mentioned that in GLM collinearity is taken into account in 
the calc of se but not in GAM (-> results in confidence interval too narrow, 
p-value understated, GAM S-Plus version). I haven't found any references to 
GAM and concurvity or collinearity on the R page. And I wonder if the R 
version of Gam differs in this point.
Another question would be, what the best manual way of a variable selection 
is, due to the lack of a stepwise procedure for GAM. Including the first 
variables, add var1, if GCV improves (what would be considered as 
improvement?) or P-value signif., keep it, otherwise drop it - add var 2, and 
so on?

thanks in advance, cheers Martin



R: [R] gam and concurvity

2003-09-16 Thread Vito Muggeo
As someone (Simon Wood, for instance) could explain much better, and as is
stressed in the help files of the mgcv package (the package including the
gam() function),
gam in R is not a clone of gam in S+.
S+ uses backfitting while R uses penalized splines (see the references
inside gam() function). The approaches are quite different and can lead to
substantial differences in particular cases, for instance with concurvity.
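
Current versions of mgcv also provide a concurvity() diagnostic. The sketch below assumes such a version and uses simulated data in which x2 is deliberately close to a smooth function of x1; all names are invented for the example.

```r
library(mgcv)

set.seed(8)
n  <- 300
x1 <- runif(n)
x2 <- x1^2 + rnorm(n, sd = 0.05)   # nearly a smooth function of x1
dat <- data.frame(x1 = x1, x2 = x2,
                  y  = sin(2 * pi * x1) + x2 + rnorm(n, sd = 0.3))

fit <- gam(y ~ s(x1) + s(x2), data = dat)

# Indices between 0 (no problem) and 1 (a term lies entirely in the
# space spanned by the other terms), one column per model term
cc <- concurvity(fit, full = TRUE)
print(round(cc, 3))
```

Values near 1 for a smooth indicate that its estimate and standard errors should be read with caution.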

best,
vito

PS Can you point out the exact reference for Figueiras et al. (2003)?


- Original Message -
From: Martin Wegmann [EMAIL PROTECTED]
To: R-list [EMAIL PROTECTED]
Sent: Tuesday, September 16, 2003 3:47 PM
Subject: [R] gam and concurvity


 Hello,

 in the paper Avoiding the effects of concurvity in GAM's .. of Figueiras et
 al. (2003) it is mentioned that in GLM collinearity is taken into account in
 the calc of se but not in GAM (-> results in confidence interval too narrow,
 p-value understated,  GAM S-Plus version). I haven't found any references to
 GAM and concurvity or collinearity on the R page. And I wonder if the R
 version of Gam differ in this point.
 Another question would be, what the best manual way of a variable selection
 is, due to the lack of a stepwise procedure for GAM. Including the first
 variables, add var1, if GCV improves (what would be considered as
 improvement?) or P-value signif., keep it, otherwise drop it - add var 2, and
 so on?

 thanks in advance, cheers Martin




Re: R: [R] gam and concurvity

2003-09-16 Thread Martin Wegmann
On Tuesday 16 September 2003 16:28, Vito Muggeo wrote:
 As someone (Simon Wood, for instance) could explain much better and as it
 is stressed in the help files of the mgcv package (the package  including
 the gam() function)
 gam in R is not a clone of gam in S+.
 S+ uses backfitting while R uses penalized splines (see the references
 inside gam() function). The approaches are quite different and can lead to
 substantial differences in particular cases, for instance with concurvity.

 best,
 vito

 PS Can you point out the exact reference for Figueiras et al. (2003)?

I haven't found a journal name but the *.pdf download is 
http://isi-eh.usc.es/trabajos/110_70_fullpaper.pdf 


 - Original Message -
 From: Martin Wegmann [EMAIL PROTECTED]
 To: R-list [EMAIL PROTECTED]
 Sent: Tuesday, September 16, 2003 3:47 PM
 Subject: [R] gam and concurvity

  Hello,

  in the paper Avoiding the effects of concurvity in GAM's .. of Figueiras et
  al. (2003) it is mentioned that in GLM collinearity is taken into account in
  the calc of se but not in GAM (-> results in confidence interval too narrow,
  p-value understated,  GAM S-Plus version). I haven't found any references to
  GAM and concurvity or collinearity on the R page. And I wonder if the R
  version of Gam differ in this point.
  Another question would be, what the best manual way of a variable selection
  is, due to the lack of a stepwise procedure for GAM. Including the first
  variables, add var1, if GCV improves (what would be considered as
  improvement?) or P-value signif., keep it, otherwise drop it - add var 2, and
  so on?

  thanks in advance, cheers Martin
 



[R] gam step in grasper

2003-08-22 Thread Martin Wegmann
hello, 

some weeks ago I asked if there is an equivalent of step(lm) for gam, and 
Simon Wood informed me that there isn't but it will eventually be done.

Now I found grasp.step.gam {grasper} and wonder if it would be possible to 
rewrite/extract (? I don't know what it is called or how it works - I am not 
experienced in programming) this command for use outside GRASP-R. 
Perhaps this way is less work intensive, but of course this command doesn't use 
GCV/UBRE scores but anova.
Is there an argument not to use this function inside GRASP-R even though its 
purpose is not spatial?

thanks for your help, cheers Martin



[R] gam and step

2003-07-14 Thread Martin Wegmann
hello, 

I am looking for a step() function for GAM's.
In the book Statistical Computing by Crawley, the removal of predictors has 
been done by hand:

model <- gam(y ~ s(x1) + s(x2) + s(x3))
summary(model) 
model2 <- gam(y ~ s(x2) + s(x3)) # removal of the non-significant variable
# then compare the two models to see whether a significant change occurs
anova(model, model2, test="F")

isn't there a way to drop and add variables automatically until the best model 
is received? like in step(lm(...))? 
Or as in grasp.step.gam() - but that doesn't work when I tried it outside 
GRASP-R.

thanks for your help, cheers Martin



Re: [R] gam and step

2003-07-14 Thread Simon Wood
There isn't a step.gam() in mgcv yet; it is one of the things that I'd
like to do eventually, although it will probably be based on comparison of
GCV/UBRE scores rather than H_0 testing. 
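
Until such a function exists, the score-based comparison can be done by hand. A sketch, assuming a current mgcv in which the fitted object stores the minimized score as gcv.ubre; the data are simulated so that x1 has no effect, and the model names follow the question.

```r
library(mgcv)

set.seed(9)
n   <- 300
dat <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
dat$y <- sin(2 * pi * dat$x2) + 2 * dat$x3 + rnorm(n, sd = 0.3)  # x1 is noise

# Fit the full and the reduced model with GCV smoothness selection
model  <- gam(y ~ s(x1) + s(x2) + s(x3), data = dat, method = "GCV.Cp")
model2 <- gam(y ~ s(x2) + s(x3),         data = dat, method = "GCV.Cp")

# Minimized GCV score of each model; lower is better
scores <- c(full = unname(model$gcv.ubre), reduced = unname(model2$gcv.ubre))
print(scores)
```

How large a difference in score should count as an improvement is a judgment call that this comparison does not settle.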

best,
Simon

 I am looking for a step() function for GAM's.
 In the book Statistical Computing by Crawley, the removal of predictors has 
 been done by hand:
 
 model <- gam(y ~ s(x1) + s(x2) + s(x3))
 summary(model) 
 model2 <- gam(y ~ s(x2) + s(x3)) # removal of the non-significant variable
 # then compare the two models to see whether a significant change occurs
 anova(model, model2, test="F")
 
 isn't there a way to drop and add variables automatically until the best model 
 is received? like in step(lm(...))? 
 Or as in grasp.step.gam() - but that doesn't work when I tried it outside 
 GRASP-R.
 
 thanks for your help, cheers Martin
 
 
_
 Simon Wood [EMAIL PROTECTED]www.stats.gla.ac.uk/~simon/
  Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
   Direct telephone: (0)141 330 4530  Fax: (0)141 330 4814



Re: [R] gam()

2003-06-06 Thread John Fox
Dear Henric,

At 05:01 PM 6/4/2003 +0200, Henric Nilsson wrote:

I've now spent a couple of days trying to learn R and, in particular, the 
gam() function, and I now have a few questions and reflections regarding 
the latter. Maybe these things are implemented in some way that I'm not 
yet aware of or have perhaps been decided by the R community to not be 
what's wanted. Of course, my lack of complete theoretical understanding of 
what mgcv really does may also show...

1. When fitting models where a factor interacts with a smooth term, say 
y~a+s(x,by=a.1)+s(x,by=a.2), I noticed that the rug in the plot of each of 
the smooth terms is identical. I expected the rug in the plot of e.g. 
s(x,by=a.1) to only include those x for which a.1=1 to be able to judge if 
observations of x where a.1=1 are sparse in any region. Also, it would be 
really nice if the by=... was included in the output of the plot.gam() 
and the "Approximate significance of smooth terms:" part of the summary.gam().

2. John Fox has modified anova.glm() into anova.gam() 
(http://www.socsci.mcmaster.ca/jfox/Books/Companion/nonparametric-regression.txt) 
for comparison of two or more fitted models based on the difference 
between residual deviances. Indiscriminate use of such a procedure 
shouldn't perhaps be encouraged, but I think that many users expect it to 
be part of the mgcv package since this model selection idea is covered in 
several texts and also implemented in S-plus (and may be OK for truly 
nested models). And even if it's been decided that this functionality is 
not wanted in mgcv, perhaps another function comparing several models by 
the GCV/UBRE score and other useful statistics can be implemented?

The problem with comparing two gams in R fit with mgcv is that, by default, 
the degree of smoothing for terms is selected independently for each model. 
Simon Wood previously posted a message to the R-help list discussing this 
issue and making some suggestions. The issue doesn't arise in the same way 
with models fit by the gam function in S-PLUS because the degree of 
smoothing there is instead selected by the user. I should update my 
appendix on nonparametric regression to discuss this question -- the 
current presentation isn't really adequate.
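
One way to get the S-PLUS style situation in mgcv, where the amount of smoothing is set by the user, is to fix the degrees of freedom with fx = TRUE so that the two models are truly nested. This is a sketch on simulated data with invented names, assuming a current mgcv; it is not the anova.gam() from the appendix.

```r
library(mgcv)

set.seed(10)
n   <- 300
dat <- data.frame(x1 = runif(n), x2 = runif(n))   # hypothetical data
dat$y <- sin(2 * pi * dat$x1) + rnorm(n, sd = 0.3)   # x2 has no effect

# fx = TRUE gives unpenalized regression splines with k - 1 df each,
# so the smoothing of s(x1) is identical in both models
m0 <- gam(y ~ s(x1, k = 6, fx = TRUE), data = dat)
m1 <- gam(y ~ s(x1, k = 6, fx = TRUE) + s(x2, k = 6, fx = TRUE), data = dat)

# Deviance based comparison of the nested fits
tab <- anova(m0, m1, test = "F")
print(tab)

df_diff <- df.residual(m0) - df.residual(m1)   # df used by s(x2)
```

With penalized terms and automatic smoothness selection the residual degrees of freedom of the shared term can differ between the fits, which is the difficulty described above.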


3. Some authors [1, 2] suggest pointwise estimation of odds ratios and 
corresponding confidence intervals based on the smooth terms in a GAM. 
Maybe something for mgcv?
[1] Figueiras, A. & Cadarso-Suárez C. (2001) Application of Nonparametric 
Models for calculating Odds Ratios and Their Confidence Intervals for 
Continuous Exposures, American Journal of Epidemiology, 154(3), 264-275.
[2] Saez, M., Cadarso-Suárez C. & Figueiras, A. (2003) np.OR: an S-Plus 
function for pointwise nonparametric estimation of odds-ratios of 
continuous predictors, Computer Methods and Programs in Biomedicine, 71, 
175-179.
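
A pointwise odds ratio against a reference value can be assembled from the prediction matrix of a logistic GAM. The sketch below is one generic way to do it in a current mgcv and is not the np.OR function of reference [2]; the data, the reference value x_ref and the grid are invented for illustration.

```r
library(mgcv)

set.seed(11)
n   <- 1000
dat <- data.frame(x = runif(n, 0, 10))   # hypothetical continuous exposure
dat$y <- rbinom(n, 1, plogis(-2 + 0.3 * dat$x))

fit <- gam(y ~ s(x), family = binomial, data = dat)

x_ref <- 5                              # reference exposure (assumption)
grid  <- data.frame(x = c(2, 5, 8))     # exposures to compare with x_ref

# Rows of Xp map the coefficients to the linear predictor (log odds)
Xp  <- predict(fit, newdata = grid, type = "lpmatrix")
Xp0 <- predict(fit, newdata = data.frame(x = x_ref), type = "lpmatrix")

# Log odds ratio = difference in linear predictor; its variance follows
# from the coefficient covariance matrix
D      <- sweep(Xp, 2, Xp0[1, ])
log_or <- drop(D %*% coef(fit))
se     <- sqrt(rowSums((D %*% vcov(fit)) * D))

or_tab <- data.frame(x     = grid$x,
                     OR    = exp(log_or),
                     lower = exp(log_or - 1.96 * se),
                     upper = exp(log_or + 1.96 * se))
print(or_tab)
```

At the reference value the odds ratio is exactly 1 with a zero-width interval, since the comparison is of a point with itself.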

4. For each purely parametric covariate a t-test is produced; I'd like to 
have something like S-plus' anova.gam() to get an overall test. (Perhaps 
with the addition of a choice between Type I and Type III tests, but I 
guess that may be controversial). Is it possible?


John

-
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: [EMAIL PROTECTED]
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox


Re: [R] gam()

2003-06-06 Thread Henric Nilsson
At 11:12 2003-06-05 -0400, John Fox wrote:

2. John Fox has modified anova.glm() into anova.gam() 
(http://www.socsci.mcmaster.ca/jfox/Books/Companion/nonparametric-regression.txt) 
for comparison of two or more fitted models based on the difference 
between residual deviances. Indiscriminate use of such a procedure 
shouldn't perhaps be encouraged, but I think that many users expect it to 
be part of the mgcv package since this model selection idea is covered in 
several texts and also implemented in S-plus (and may be OK for truly 
nested models). And even if it's been decided that this functionality is 
not wanted in mgcv, perhaps another function comparing several models by 
the GCV/UBRE score and other useful statistics can be implemented?

The problem with comparing two gams in R fit with mgcv is that, by 
default, the degree of smoothing for terms is selected independently for 
each model. Simon Wood previously posted a message to the R-help list 
discussing this issue and making some suggestions. The issue doesn't arise 
in the same way with models fit by the gam function in S-PLUS because the 
degree of smoothing there is instead selected by the user. I should update 
my appendix on nonparametric regression to discuss this question -- the 
current presentation isn't really adequate.

I'm aware of this difference between gam() in R and S-Plus, which is why I 
proposed a function listing relevant statistics for every fitted model so 
the analyst can use these to judge, without hypothesis testing, which model 
to prefer. Still, for models where the analyst has made sure that the 
models are truly nested, the use of your anova.gam can be justified by the 
simulation results reported by Hastie & Tibshirani (1990, p. 155); maybe I 
just want it for purely nostalgic reasons?! ;-)

Admittedly, I like the more attractive way of choosing the degrees of 
freedom that mgcv provides. However, I must admit that since most text 
books covering GAMs are more or less Splus based, and the possibilities 
that mgcv offers are so vast, I'm feeling a bit lost at times; it's great 
to have new, more flexible tools, but on the downside that means more 
choices to be made. So, anyone got any essential literature tips? I've read 
(and re-read, and read again) Simon Wood's articles in JRSS, R News and 
Ecological Modelling, and, of course, the mgcv manual.

//Henric

---
Henric Nilsson, Statistician
Statisticon AB, Östra Ågatan 31, SE-753 22 UPPSALA
Phone (Direct): +46 (0)18 18 22 37
Mobile: +46 (0)70 211 68 36
Fax: +46 (0)18 18 22 33
http://www.statisticon.se



[R] gam()

2003-06-05 Thread Henric Nilsson
Dear all,

I've now spent a couple of days trying to learn R and, in particular, the 
gam() function, and I now have a few questions and reflections regarding 
the latter. Maybe these things are implemented in some way that I'm not yet 
aware of or have perhaps been decided by the R community to not be what's 
wanted. Of course, my lack of complete theoretical understanding of what 
mgcv really does may also show...

1. When fitting models where a factor interacts with a smooth term, say 
y~a+s(x,by=a.1)+s(x,by=a.2), I noticed that the rug in the plot of each of 
the smooth terms is identical. I expected the rug in the plot of e.g. 
s(x,by=a.1) to only include those x for which a.1=1 to be able to judge if 
observations of x where a.1=1 are sparse in any region. Also, it would be 
really nice if the by=... was included in the output of plot.gam() 
and in the "Approximate significance of smooth terms:" part of summary.gam().

2. John Fox has modified anova.glm() into anova.gam() 
(http://www.socsci.mcmaster.ca/jfox/Books/Companion/nonparametric-regression.txt) 
for comparison of two or more fitted models based on the difference between 
residual deviances. Indiscriminate use of such a procedure shouldn't 
perhaps be encouraged, but I think that many users expect it to be part of 
the mgcv package since this model selection idea is covered in several 
texts and also implemented in S-plus (and may be OK for truly nested 
models). And even if it's been decided that this functionality is not 
wanted in mgcv, perhaps another function comparing several models by the 
GCV/UBRE score and other useful statistics can be implemented?

3. Some authors [1, 2] suggest pointwise estimation of odds ratios and 
corresponding confidence intervals based on the smooth terms in a GAM. 
Maybe something for mgcv?
[1] Figueiras, A. & Cadarso-Suárez, C. (2001) Application of Nonparametric 
Models for Calculating Odds Ratios and Their Confidence Intervals for 
Continuous Exposures, American Journal of Epidemiology, 154(3), 264-275.
[2] Saez, M., Cadarso-Suárez, C. & Figueiras, A. (2003) np.OR: an S-Plus 
function for pointwise nonparametric estimation of odds-ratios of 
continuous predictors, Computer Methods and Programs in Biomedicine, 71, 
175-179.

4. For each purely parametric covariate a t-test is produced; I'd like to 
have something like S-plus' anova.gam() to get an overall test. (Perhaps 
with the addition of a choice between Type I and Type III tests, but I 
guess that may be controversial). Is it possible?
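
For point 3, pointwise odds ratios relative to a reference exposure can be sketched with the linear predictor matrix that predict() returns for a gam fit. This is not the np.OR procedure from [2]; it is a generic contrast calculation under assumed, simulated data, with x_ref an arbitrary reference value chosen for illustration.

```r
library(mgcv)

set.seed(3)
n <- 1000
dat <- data.frame(x = runif(n, 0, 10))
dat$y <- rbinom(n, 1, plogis(-1 + sin(dat$x / 2)))

b <- gam(y ~ s(x), family = binomial, data = dat)

x_ref <- 5                                   # assumed reference exposure
grid  <- data.frame(x = seq(1, 9, by = 1))   # exposures at which to report ORs

# Rows of the lpmatrix map the coefficients to the linear predictor (log odds).
Xg <- predict(b, newdata = grid, type = "lpmatrix")
Xr <- predict(b, newdata = data.frame(x = x_ref), type = "lpmatrix")

# Contrast: log odds at each grid point minus log odds at the reference.
D      <- sweep(Xg, 2, as.numeric(Xr))
log_or <- drop(D %*% coef(b))
se     <- sqrt(rowSums((D %*% vcov(b)) * D))

or_tab <- data.frame(x     = grid$x,
                     OR    = exp(log_or),
                     lower = exp(log_or - 1.96 * se),
                     upper = exp(log_or + 1.96 * se))
print(or_tab)
```

Because the intercept cancels in the contrast, the odds ratio at the reference value is exactly 1 with a zero-width interval. The intervals are approximate and are based on the covariance matrix returned by vcov() for the fitted gam.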

//Henric

---
Henric Nilsson, Statistician
Statisticon AB, Östra Ågatan 31, SE-753 22 UPPSALA
Phone (Direct): +46 (0)18 18 22 37
Mobile: +46 (0)70 211 68 36
Fax: +46 (0)18 18 22 33
http://www.statisticon.se



[R] gam questions

2003-06-03 Thread Henric Nilsson
Dear all,

I'm a fairly new R user having two questions regarding gam:

1. The prediction example on p. 38 in the mgcv manual. In order to get 
predictions based on the original data set, by leaving out the 'newdata' 
argument (newd in the example), I get an error message

Warning message: the condition has length > 1 and only the first element 
will be used in: if (object$dim == 0) m <- 0 else m <- length(object$sp)

I suspected that it had something to do with the data not being attached, 
but when fitting a gam with an attached data set I got the same error. Why?

2. I've fitted a glm y~a+x+a:x, where a is a 3 level factor and x is a 
continuous covariate. If I want to fit a similar gam model, is it correct 
to fit y~a+s(x)+s(x,by=a.1)+s(x,by=a.2)+s(x,by=a.3), where a.1--a.3 are 
dummy variables representing each level of the factor? Or is the s(x) term 
redundant?

Any hints are greatly appreciated.

Best wishes,
Henric
---
Henric Nilsson, Statistician
Statisticon AB, Östra Ågatan 31, SE-753 22 UPPSALA
Phone (Direct): +46 (0)18 18 22 37
Mobile: +46 (0)70 211 68 36
Fax: +46 (0)18 18 22 33
http://www.statisticon.se



Re: [R] gam questions

2003-06-03 Thread Prof Brian Ripley
On Tue, 3 Jun 2003, Henric Nilsson wrote:

 I'm a fairly new R user having two questions regarding gam:
 
 1. The prediction example on p. 38 in the mgcv manual. 

That's not very helpful: pagination of manuals depends on the paper size, 
for example.

 In order to get 
 predictions based on the original data set, by leaving out the 'newdata' 
 argument (newd in the example), I get an error message
 
 Warning message: the condition has length > 1 and only the first element 
 will be used in: if (object$dim == 0) m <- 0 else m <- length(object$sp)
 
 I suspected that it had something to do with the data not being attached, 
 but when fitting a gam with an attached data set I got the same error. Why?

That is a warning not an error!  It was a bug in mgcv a while back.
Do you have the latest version of mgcv (and of R, for that matter)?
I am not seeing this any more.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595



Re: [R] gam questions

2003-06-03 Thread Simon Wood
 2. I've fitted a glm y~a+x+a:x, where a is a 3 level factor and x is a 
 continuous covariate. If I want to fit a similar gam model, is it correct 
 to fit y~a+s(x)+s(x,by=a.1)+s(x,by=a.2)+s(x,by=a.3), where a.1--a.3 are 
 dummy variables representing each level of the factor? Or is the s(x) term 
 redundant?
- yes the s(x) term is redundant and in the current mgcv version will
likely cause spectacular nonsense as a result of lack of identifiability
in the smooth part of the model (mgcv 0.9 will cope with this when
released, but it's still much better to use an identifiable smooth
model). 

Simon
_
 Simon Wood  [EMAIL PROTECTED]  www.stats.gla.ac.uk/~simon/
  Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
   Direct telephone: (0)141 330 4530  Fax: (0)141 330 4814
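
The identifiable form of the model discussed in this exchange can be sketched as follows. It assumes an mgcv release newer than the 0.8-8 mentioned in the thread, in which the by argument of s() accepts a factor directly and fits one centered smooth per level; the data and the level names l1 to l3 are made up for illustration.

```r
library(mgcv)

set.seed(4)
n <- 300
dat <- data.frame(a = factor(sample(c("l1", "l2", "l3"), n, replace = TRUE)),
                  x = runif(n))
# A different true curve for each level of the factor.
f <- ifelse(dat$a == "l1", sin(2 * pi * dat$x),
     ifelse(dat$a == "l2", dat$x^2, -dat$x))
dat$y <- f + rnorm(n, sd = 0.2)

# One smooth of x per level of a, plus the factor main effect for the level
# means. There is no common s(x) term: as noted above, it would be redundant.
m_fac <- gam(y ~ a + s(x, by = a), data = dat)

labs <- sapply(m_fac$smooth, function(s) s$label)
print(labs)
print(summary(m_fac)$s.table)
```

Each smooth carries the by level in its label, and the same labels appear in the summary table of smooth terms. plot(m_fac, pages = 1) draws the three curves on one page.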



Re: [R] gam questions

2003-06-03 Thread Henric Nilsson
At 10:07 2003-06-03 +0100, Prof Brian Ripley wrote:

 1. The prediction example on p. 38 in the mgcv manual.
That's not very helpful: pagination of manuals depends on the paper size,
for example.
It's the gam.predict example. Instead of pred<-predict.gam(b,newd) I tried 
pred<-predict.gam(b).

Do you have the latest version of mgcv (and of R, for that matter)?
I'm running R 1.7.0 under Windows 2000 with mgcv 0.8-8.

Best,
Henric
---
Henric Nilsson, Statistician
Statisticon AB, Östra Ågatan 31, SE-753 22 UPPSALA
Phone (Direct): +46 (0)18 18 22 37
Mobile: +46 (0)70 211 68 36
Fax: +46 (0)18 18 22 33
http://www.statisticon.se



Re: [R] gam questions

2003-06-03 Thread Simon Wood
 It's the gam.predict example. Instead of pred<-predict.gam(b,newd) I tried 
 pred<-predict.gam(b).
- Ok, thanks - this is a bug I missed, I'll fix it. The results should be
unaffected, though. Simon
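
For reference, the two calls being compared can be sketched as below, assuming a current mgcv and simulated data (b and newd follow the names used in the thread; dat is an assumption). The generic predict() dispatches to the gam method, so the method need not be called by its full name.

```r
library(mgcv)

set.seed(5)
n <- 200
dat <- data.frame(x = runif(n))
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.3)

b <- gam(y ~ s(x), data = dat)

# No newdata: predictions are made at the observations used for fitting.
pred_fit <- predict(b)

# The same rows passed explicitly give the same predictions.
pred_same <- predict(b, newdata = dat)

# New covariate values, with standard errors.
newd <- data.frame(x = c(0.25, 0.75))
pred_new <- predict(b, newdata = newd, se.fit = TRUE)
print(pred_new)
```

For this Gaussian, identity-link model the predictions without newdata coincide with fitted(b).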



[R] GAM with Thin plate splines

2003-01-09 Thread rkales
Hello, I'm a student at the University of Klagenfurt / Austria and I 
need some help!
I have to predict 24 daily load values.
For this I have a dataset with the following columns:
24 past daily load values
6  past daily temperature values

My goal is to find a model (a GAM with thin plate splines) in R.
I found the function gam in the R library mgcv, but it seems to fit only 
one-dimensional splines.

So my question is whether it's possible to modify this function (and if so, 
how), or whether there is another function that would give me a solution.

Please send me a mail if you can help me!
Thanks




Re: [R] GAM with Thin plate splines

2003-01-09 Thread Simon Wood
 My goal is to find a model (a GAM with thin plate splines) in R.
 I found the function gam in the R library mgcv, but it seems to fit only 
 one-dimensional splines.
- Unless you have an exceedingly ancient version of mgcv (0.6), it
*does* allow spline smooths of more than one variable. ?gam contains a
couple of examples as does ?gam.side.conditions. 
Simon




Re: [R] GAM with Thin plate splines

2003-01-09 Thread Simon Wood
The default basis for smooth terms in mgcv is a truncated thin plate
spline basis, and has been since the later part of 2001. Including terms
like `s(x,z)', `s(x0,x1,x2)' in the model formula is the way to include
such terms (see help files, and examples therein). 
best, Simon
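
A minimal sketch of such a term, assuming a current mgcv and simulated data (the names dat and b2 are illustrative): a smooth of two covariates is requested simply by listing both inside s(), and the default basis is the thin plate regression spline.

```r
library(mgcv)

set.seed(6)
n <- 400
dat <- data.frame(x = runif(n), z = runif(n))
dat$y <- sin(2 * pi * dat$x) * cos(2 * pi * dat$z) + rnorm(n, sd = 0.2)

# s(x, z) is a single isotropic smooth of both covariates; bs = "tp" is the default.
b2 <- gam(y ~ s(x, z), data = dat)

sm <- b2$smooth[[1]]
print(class(sm))   # class of the constructed smooth
print(sm$dim)      # number of covariates the smooth is a function of
print(summary(b2))
```

An isotropic smooth like this treats x and z as being on the same scale, so it suits covariates measured in comparable units.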

 Last time I worked with it (last year) there was no tps. And rkales is
 not finding it also.
 
 EJ
 
 On Thu, 2003-01-09 at 10:55, Peter Dalgaard BSA wrote:
  Ernesto Jardim [EMAIL PROTECTED] writes:
  
   Hi
   
   In package gss there are functions for tps, see ssanova.
  
  gam in mgcv fits thin plate splines, where was the problem???
 -- 
 Ernesto Jardim [EMAIL PROTECTED]
 Marine Biologist
 Research Institute for Agriculture and Fisheries
 Lisboa, Portugal
 Tel: +351 213 027 000
 Fax: +351 213 015 948
 