Re: [R] Factor variables with GAM models

2010-04-14 Thread Gavin Simpson
On Fri, 2010-03-19 at 20:37 -0700, Steven McKinney wrote:
> Hi Noah
> 
> GAM models were developed to assess the functional form
> of the relationship of continuous predictor variables to the
> response, so weren't really meant to handle factor variables
> as predictor variables.  GAMs are of the form
> E(Y | X1, X2, ...) = S0 + S(X1) + S(X2) + ...
> where S(X) is a smooth function of X.

But there is absolutely nothing wrong with including factors in
mgcv::gam - they get expanded into the usual dummy variables, depending
on the current contrasts, as part of the model set-up routines, just like
they do in lm(). Perhaps "semiparametric" might be a better description of
such a model, but at least one implementation of GAMs in R can certainly
handle factors.
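
As a minimal sketch of the point above: a factor can go straight into an
mgcv::gam formula and is expanded into the same dummy-variable columns that
lm() uses. The data frame `dat`, its column names, and the simulated values
are invented for illustration and are not from the original question.

```r
library(mgcv)  # recommended package, shipped with standard R installations

set.seed(1)
n <- 200
# Hypothetical data: a three-level factor and one continuous predictor
dat <- data.frame(
  color = factor(sample(c("red", "green", "blue"), n, replace = TRUE)),
  x     = runif(n)
)
dat$y <- sin(2 * pi * dat$x) + ifelse(dat$color == "red", 1, 0) +
  rnorm(n, sd = 0.3)

# The factor enters as an ordinary parametric term; only x is smoothed
fit_gam <- gam(y ~ color + s(x), data = dat)
fit_lm  <- lm(y ~ color + x, data = dat)

# Names of the parametric (non-smooth) coefficients in the GAM
all_names <- names(coef(fit_gam))
gam_names <- all_names[!grepl("^s\\(", all_names)]
lm_names  <- names(coef(fit_lm))

print(gam_names)
print(lm_names)
```

With the default treatment contrasts, the factor contributes one coefficient
per non-reference level in both fits, and the reference level is absorbed
into the intercept.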

I haven't used gam::gam so can't comment on that and the OP doesn't say
which gam he is using.

HTH

G

> Hence you might want to rethink why you'd want a
> factor variable as a predictor variable in a GAM.
> This is why the gam machinery doesn't just do the
> factor conversion to indicator variables as is done in
> lm.
> 
> HTH
> 
> Steven McKinney
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] Factor variables with GAM models

2010-03-22 Thread Simon Wood
It doesn't usually make much sense to *smooth* over a factor variable (in the 
cases where it does you should treat the factor as a random effect), but 
there is no problem in including factor variables in a GAM. `gam' lets you 
mix factor and continuous variables in a bunch of ways. Suppose that `a' is a 
factor, `x' is a continuous (or just metric) variable, and `y' is a 
response. Then

y ~ a + s(x)

will fit a model where `a' is treated exactly as a factor variable is treated 
by `lm', while `x' is smoothed over. In mgcv::gam, 

y ~ s(x,by=a)

would create a `smooth-factor interaction' --- a separate smooth of `x' for 
each level of `a'. 

y ~ s(x,by=a,id=1)

would do the same, but would insist on each of the smooths of `x' having the 
same smoothing parameter. ?gam.models gives some more detail.
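
The formulas above can be tried out with a sketch along these lines. The
simulated data frame `dat` and the model names m1 to m4 are made up for
illustration. Two things here go beyond the message itself and should be read
as assumptions: the main effect `a' is added alongside the `by' smooths
(factor `by' smooths in mgcv are centred, so the level means otherwise go
unmodelled), and bs = "re" is used as one way to treat the factor as a random
effect.

```r
library(mgcv)

set.seed(2)
n <- 300
# Hypothetical data: factor a, continuous x, response y
dat <- data.frame(
  a = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  x = runif(n)
)
# Invented truth: a different curve in x for each level of a
dat$y <- with(dat, ifelse(a == "A", sin(2 * pi * x),
                   ifelse(a == "B", 2 * x, x^2))) + rnorm(n, sd = 0.2)

# 1) a handled as in lm, with one common smooth of x
m1 <- gam(y ~ a + s(x), data = dat)

# 2) smooth-factor interaction: a separate smooth of x per level of a
m2 <- gam(y ~ a + s(x, by = a), data = dat)

# 3) as 2), but the smooths are linked through a shared id
m3 <- gam(y ~ a + s(x, by = a, id = 1), data = dat)

# 4) a treated as a random effect (assumption: the "re" basis)
m4 <- gam(y ~ s(a, bs = "re") + s(x), data = dat)

# Number of smoothing parameters estimated in 2) and 3)
print(length(m2$sp))
print(length(m3$sp))
```

Comparing `length(m2$sp)` with `length(m3$sp)` is a quick way to check that
the `id' argument really did tie the smoothing parameters together.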

best,
Simon

-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283



Re: [R] Factor variables with GAM models

2010-03-20 Thread Corrado
You can sometimes manually substitute a categorical variable with a set 
of continuous variables.


For example, suppose you have a variable like landcover.class with 3 values: 
class A, class B, class C. You can transform it into 3 continuous 
variables, landcover.class.A, landcover.class.B, landcover.class.C, and 
assign a value of 1 (or 100%) to elements belonging to that class and 0 
to elements not belonging to it.
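
One way to build such columns without typing them by hand is base R's
model.matrix(). This is only a sketch: the five-row data frame `dat` is
invented, and the column names are renamed to match the naming used above.

```r
# Hypothetical data with a three-class factor
dat <- data.frame(
  landcover.class = factor(c("A", "B", "C", "A", "C"))
)

# Dropping the intercept ("- 1") keeps one 0/1 column for every class
ind <- model.matrix(~ landcover.class - 1, data = dat)
colnames(ind) <- paste0("landcover.class.", levels(dat$landcover.class))

# Attach the indicator columns to the original data
dat <- cbind(dat, ind)
print(dat)
```

Note that using all three indicator columns together with an intercept makes
the model matrix rank deficient, so one column is normally left out of the
model formula.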


That helps sometimes.

Regards

--
Corrado Topi
PhD Researcher
Global Climate Change and Biodiversity
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct...@york.ac.uk



[R] Factor variables with GAM models

2010-03-19 Thread Noah Silverman

I'm just starting to learn about GAM models.

When using the lm function in R, any factors I have in my data set are 
automatically converted into a series of binary (0/1) variables.


For example, if I have a data.frame with a column named color and values 
red, green, blue, the lm function automatically replaces it with 
3 variables colorred, colorgreen, colorblue, which are binary {0,1}.


When I use the gam function, R doesn't do this so I get an error.

1) Is there a way to ask the gam function to do this conversion for me?
2) If not, is there some other tool or utility to make this data 
transformation easy?
3) Last option - can I use lm to transform the data and then extract it 
into a new data.frame to then pass to gam?
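
For questions 2) and 3), a possible sketch using base R's model.matrix(),
which builds the same design matrix that lm constructs internally. The tiny
data frame `dat` is invented for illustration. Under R's default treatment
contrasts the result has an intercept plus one indicator per non-reference
level, so a three-level factor yields two indicator columns rather than three.

```r
# Hypothetical data with a three-level factor
dat <- data.frame(color = factor(c("red", "green", "blue", "green", "red")))

# Design matrix as lm would build it for the formula ~ color
X <- model.matrix(~ color, data = dat)
print(colnames(X))

# Drop the intercept column and keep the indicators as a data frame
dummies <- as.data.frame(X[, -1, drop = FALSE])
print(dummies)
```

The `dummies` data frame can then be combined with the other predictors and
passed on to any modelling function that expects numeric columns.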


Thanks!!!



Re: [R] Factor variables with GAM models

2010-03-19 Thread Steven McKinney
Hi Noah

GAM models were developed to assess the functional form
of the relationship of continuous predictor variables to the
response, so weren't really meant to handle factor variables
as predictor variables.  GAMs are of the form
E(Y | X1, X2, ...) = S0 + S(X1) + S(X2) + ...
where S(X) is a smooth function of X.

Hence you might want to rethink why you'd want a
factor variable as a predictor variable in a GAM.
This is why the gam machinery doesn't just do the
factor conversion to indicator variables as is done in
lm.

HTH

Steven McKinney




Re: [R] Factor variables with GAM models

2010-03-19 Thread Noah Silverman

Steve,

I get that.  What you wrote makes sense.

My challenge is the data I'm attempting to model.  Some of the variables 
are continuous, some are factors.  Both linear and Poisson models work 
(with the Poisson model doing a much more accurate job).  However, some of 
the numerical variables clearly have a non-linear relationship with the 
response.  Hence my interest in GAM.  
I suppose one alternative would be to try some polynomial transformation 
on the variable as part of a Poisson model.
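
The two options mentioned here can be compared side by side. The sketch below
uses simulated counts, and the data frame `dat`, the factor `grp`, and the
cubic degree of the polynomial are all assumptions made for illustration, not
details of the actual data set.

```r
library(mgcv)

set.seed(3)
n <- 400
# Hypothetical data: one factor and one continuous predictor
dat <- data.frame(
  grp = factor(sample(c("g1", "g2", "g3"), n, replace = TRUE)),
  x   = runif(n, 0, 3)
)
# Invented log-mean: a factor shift plus a non-linear effect of x
eta <- 0.5 + 0.4 * (dat$grp == "g2") + sin(2 * dat$x)
dat$count <- rpois(n, exp(eta))

# Option A: Poisson GLM with a cubic polynomial in x
fit_poly <- glm(count ~ grp + poly(x, 3), family = poisson, data = dat)

# Option B: Poisson GAM, factor as a parametric term and x smoothed
fit_gam <- gam(count ~ grp + s(x), family = poisson, data = dat,
               method = "REML")

# Rough comparison of the two fits
print(AIC(fit_poly, fit_gam))
```

In option B the degree of smoothness of x is estimated from the data, whereas
in option A the polynomial degree has to be chosen up front.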


Any other suggestions would be welcome.

Thanks!

-N
