I do not have enough test data for a regression analysis, although I know there
are some statistical regression methods that can be used for small datasets.
That is why I need to build a model first using the training dataset.
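
When data is scarce, one common option (not mentioned in the thread, so treat it as a suggestion rather than the posters' method) is to compare candidate models by k-fold cross-validation inside the training data, so no separate test rows are needed for model selection. The sketch below uses simulated data; the variable names, the candidate formulas and the choice of k = 5 are all assumptions made for illustration.

```r
# Illustrative sketch: simulated data, hypothetical names throughout.
set.seed(42)
n <- 60
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)   # only x1 carries signal

# Candidate models to compare (an assumption; any set of formulas would do).
candidates <- list(
  m1 = y ~ x1,
  m2 = y ~ x1 + x2,
  m3 = y ~ x1 + x2 + x3
)

# Assign each row to one of k folds at random.
k <- 5
fold <- sample(rep(seq_len(k), length.out = n))

# Cross-validated RMSE: fit on k-1 folds, predict the held-out fold.
cv_rmse <- sapply(candidates, function(f) {
  sq_err <- numeric(n)
  for (i in seq_len(k)) {
    held_out <- fold == i
    fit <- lm(f, data = dat[!held_out, ])
    pred <- predict(fit, newdata = dat[held_out, ])
    sq_err[held_out] <- (dat$y[held_out] - pred)^2
  }
  sqrt(mean(sq_err))
})

cv_rmse
```

The cross-validated RMSE of the winning model is still somewhat optimistic, because it was used to pick that model; an untouched test set, if one can be spared, remains the cleaner estimate.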
Thanks,
Jim
--- On Mon, 30/1/12, Liaw, Andy andy_l...@merck.com wrote:
From: Liaw, Andy andy_l...@merck.com
Subject: RE: [R] Variable selection based on both training and testing data
To: 'Jin Minming' jminm...@yahoo.com, r-help@r-project.org
Date: Monday, 30 January, 2012, 13:39
Variable selection is part of the training process: it chooses the model.
By definition, test data is used only for testing (evaluating the chosen
model).
If you find a package or function that does variable selection on test
data, run from it!
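
As a minimal sketch of that separation, assuming simulated data and hypothetical variable names: selection runs on the training rows only, and the test rows are used once, to score the model that was chosen. Base R's step() is used here as a stand-in for the stepAIC mentioned in the question; it also selects by AIC.

```r
# Illustrative sketch: simulated data, hypothetical variable names.
set.seed(1)
n <- 120
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)   # x3 and x4 are pure noise

# Split once, before any modelling.
train_idx <- sample(n, 80)
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Variable selection (backward elimination by AIC) sees the training data only.
full_fit <- lm(y ~ x1 + x2 + x3 + x4, data = train)
sel_fit  <- step(full_fit, direction = "backward", trace = 0)

# The test data is touched once, to evaluate the chosen model.
pred <- predict(sel_fit, newdata = test)
test_rmse <- sqrt(mean((test$y - pred)^2))

formula(sel_fit)
test_rmse
```

If the test RMSE were fed back into the choice of variables, it would stop being an unbiased estimate of out-of-sample error, which is the point of the warning above.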
Best,
Andy
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jin Minming
Sent: Monday, January 30, 2012 8:14 AM
To: r-help@r-project.org
Subject: [R] Variable selection based on both training and testing data
Dear all,
Variable selection in regression is usually based on the training data,
using AIC or an F value, for example with stepAIC. Is there an R package
that can consider both the training and test datasets? For example, I have
separate training data and test data. First, a regression model is fitted
using the training data, and then this model is tested using the test data.
This process continues in order to find possible optimal models in terms of
RMSE or R2 for both training and test data.
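
For reference, the two metrics named above can be computed for both datasets as in the sketch below. The helper names rmse and r2, the simulated data and the 70/30 split are assumptions made for illustration, not functions from any package.

```r
# Illustrative helpers (hypothetical names, not from any package).
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
r2   <- function(obs, pred) 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

# Simulated data with a fixed 70/30 split.
set.seed(7)
n <- 100
dat <- data.frame(x = rnorm(n))
dat$y <- 3 * dat$x + rnorm(n)
train <- dat[1:70, ]
test  <- dat[71:100, ]

# Fit on the training rows only.
fit <- lm(y ~ x, data = train)
pred_train <- fitted(fit)
pred_test  <- predict(fit, newdata = test)

# One row per dataset, one column per metric.
metrics <- rbind(
  train = c(RMSE = rmse(train$y, pred_train), R2 = r2(train$y, pred_train)),
  test  = c(RMSE = rmse(test$y, pred_test),   R2 = r2(test$y, pred_test))
)
metrics
```

On the training rows this R2 matches summary(fit)$r.squared; on the test rows it can be negative when the model predicts worse than the test mean.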
Thanks,
Jim
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.