[R] Logistic Regression in R (SAS -like output)

2010-08-09 Thread Harsh
Hello useRs,

I have a problem at hand which I'd think is fairly common amongst
groups where R is being adopted for Analytics in place of SAS.
Users would like to obtain results for logistic regression in R that
they have become accustomed to in SAS.

Towards this end, I was able to propose the Design package in R which
contains many functions to extract the various metrics that SAS
reports.

If you have suggestions pertaining to other packages, or sample code
that replicates some of the SAS outputs for logistic regression, I
would be glad to hear of them.

Some of the requirements are:
- Stepwise variable selection for logistic regression
- Choose base level for factor variables
- The Hosmer-Lemeshow statistic
- concordant and discordant
- Tau C statistic

Thank you for your suggestions.
Regards,
Harsh Singhal

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Logistic Regression in R (SAS -like output)

2010-08-09 Thread Frank Harrell



Frank E Harrell Jr   Professor and Chairman        School of Medicine
                     Department of Biostatistics   Vanderbilt University

On Mon, 9 Aug 2010, Harsh wrote:

> Hello useRs,
>
> I have a problem at hand which I'd think is fairly common amongst
> groups where R is being adopted for Analytics in place of SAS.
> Users would like to obtain results for logistic regression in R that
> they have become accustomed to in SAS.
>
> Towards this end, I was able to propose the Design package in R which
> contains many functions to extract the various metrics that SAS
> reports.

The replacement for Design, rms, has some new indexes.

> If you have suggestions pertaining to other packages, or sample code
> that replicates some of the SAS outputs for logistic regression, I
> would be glad to hear of them.
>
> Some of the requirements are:
> - Stepwise variable selection for logistic regression

An invalid procedure.

> - Choose base level for factor variables

Not relevant: get what you need from predicted values and
differences in predicted values or contrasts. This automatically
takes care of reference cells.

> - The Hosmer-Lemeshow statistic

Obsolete: low power and sensitive to choice of binning.

> - concordant and discordant

See Hmisc's rcorr.cens.

> - Tau C statistic

Those are two different statistics.  tau and C are obtained by lrm in
rms/Design.
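
For readers who want to see what the concordant/discordant counts, C, Somers' Dxy and tau-a measure, here is a minimal base-R sketch. It is not the rcorr.cens or lrm code mentioned above; the built-in `infert` data and the model formula are assumptions chosen only so the example runs, and ties are handled in the usual way (half credit in C). SAS may bin predicted probabilities before counting pairs, so its counts need not match exactly.

```r
# Illustrative model on a built-in dataset (an assumption, not from the thread)
fit <- glm(case ~ age + parity + education + spontaneous + induced,
           data = infert, family = binomial())
p <- fitted(fit)      # predicted probabilities
y <- infert$case      # observed 0/1 outcome

# Compare every (event, non-event) pair of predicted probabilities
cmp        <- outer(p[y == 1], p[y == 0], "-")
n_pairs    <- length(cmp)
concordant <- sum(cmp > 0)   # event has the higher predicted probability
discordant <- sum(cmp < 0)   # event has the lower predicted probability
tied       <- sum(cmp == 0)

c_index  <- (concordant + 0.5 * tied) / n_pairs        # C (area under ROC curve)
somers_d <- (concordant - discordant) / n_pairs        # Dxy = 2 * (C - 0.5)
tau_a    <- (concordant - discordant) / choose(length(y), 2)

round(c(C = c_index, Dxy = somers_d, tau_a = tau_a), 3)
```

C and tau-a use the same concordant-minus-discordant numerator but divide by different numbers of pairs, which is why they are separate statistics.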


Frank





Re: [R] Logistic Regression in R (SAS -like output)

2010-08-09 Thread Gabor Grothendieck

For stepwise logistic regression using AIC see:

library(MASS)
?stepAIC

For specifying reference level:

?relevel
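
A minimal sketch of how these two pointers could be used together. The built-in `infert` data, the model formula and the chosen reference level are assumptions for illustration only, and the caveats about stepwise selection raised elsewhere in this thread still apply.

```r
library(MASS)

d <- infert  # built-in example data; an illustrative stand-in

# relevel() makes the named level the reference (base) category
levels(d$education)                                   # first level is the reference
d$education <- relevel(d$education, ref = "6-11yrs")

full <- glm(case ~ age + parity + education + spontaneous + induced,
            data = d, family = binomial())

# Changing the reference level re-expresses the coefficients but not the fit:
# the predicted probabilities match those from the original coding
full_orig <- update(full, data = infert)
all.equal(unname(fitted(full)), unname(fitted(full_orig)), tolerance = 1e-6)

# AIC-based stepwise search in both directions, starting from the full model
sel <- stepAIC(full, direction = "both", trace = FALSE)
formula(sel)
```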



Re: [R] Logistic Regression in R (SAS -like output)

2010-08-09 Thread Frank Harrell


Note that stepwise variable selection based on AIC has all the problems 
of stepwise variable selection based on P-values.  AIC is just a 
restatement of the P-value.


Frank

Frank E Harrell Jr   Professor and Chairman        School of Medicine
                     Department of Biostatistics   Vanderbilt University



Re: [R] Logistic Regression in R (SAS -like output)

2010-08-09 Thread Frank Harrell


In the trivial case where all candidate predictors have one degree of 
freedom (which is unlikely as some things will be nonlinear or have 
more than 2 categories), adding a variable if it increases AIC (taken 
here on the chi-square scale, i.e. the likelihood ratio chi-square 
minus twice the d.f., so that larger is better) is the same as adding 
it if its chi-square exceeds 2.  This corresponds to an alpha level of 
0.157 for a chi-square with 1 d.f.  At least AIC leads people to use a 
more realistic alpha (small alpha in stepwise regression leads to more 
bias in the retained regression coefficients).  But you still have 
serious multiplicity problems, and non-replicable models.
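
The 0.157 figure can be checked directly. This is only a quick sketch of the arithmetic; the variable names are made up for the example.

```r
# Implied alpha of the one-d.f. AIC rule: P(chi-square with 1 d.f. > 2)
alpha_1df <- pchisq(2, df = 1, lower.tail = FALSE)
round(alpha_1df, 3)  # 0.157

# The same rule for a k-d.f. term: chi-square must exceed 2 * k
k <- 1:5
alpha_k <- pchisq(2 * k, df = k, lower.tail = FALSE)
round(alpha_k, 3)
```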


Things are different if you have a pre-defined group of variables you 
are thinking of adding.  Suppose that this group of 10 variables 
required 15 d.f.   Adding the group if AIC (based on 15 d.f.) 
increases wouldn't be a bad strategy.  This avoids the multiplicities 
of single-variable looks.
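
A rough sketch of such a group-wise comparison. The built-in `infert` data and the choice of which variables form the group are assumptions made only to have something runnable (here the group uses 4 d.f. rather than 15). Note that R's AIC() is on the scale where smaller is better, so "AIC increases" on the chi-square scale corresponds to AIC(full) < AIC(base).

```r
# Base model, and the same model plus a pre-defined group of predictors
base <- glm(case ~ spontaneous + induced, data = infert, family = binomial())
full <- update(base, . ~ . + age + parity + education)

# Likelihood ratio chi-square for the whole group and its degrees of freedom
lr_chisq <- deviance(base) - deviance(full)
group_df <- df.residual(base) - df.residual(full)

# Add the group only if its chi-square exceeds twice its d.f.
add_group <- lr_chisq > 2 * group_df

c(lr_chisq = lr_chisq, group_df = group_df)
add_group
AIC(full) < AIC(base)   # the same decision, expressed with R's AIC()
```

A single look at the group replaces the many single-variable looks a stepwise search would make.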


Frank

Frank E Harrell Jr   Professor and Chairman        School of Medicine
                     Department of Biostatistics   Vanderbilt University

On Mon, 9 Aug 2010, Kingsford Jones wrote:

> On Mon, Aug 9, 2010 at 10:27 AM, Frank Harrell f.harr...@vanderbilt.edu wrote:
>
>> Note that stepwise variable selection based on AIC has all the problems of
>> stepwise variable selection based on P-values.  AIC is just a restatement of
>> the P-Value.
>
> I find the above statement very interesting, particularly because
> there are common misconceptions in the ecological community that AIC
> is a panacea for model selection problems and the theory behind
> P-values is deeply flawed.  Can you direct me toward a reference for
> better understanding the relation?
>
> best,
>
> Kingsford Jones






Re: [R] Logistic regression and R

2009-07-31 Thread marlene marchena
Hi there,

I'd like to take advantage of this thread to ask another question related
to logistic regression. This is my first time as well.

I have data that I want to model, but I'm not sure if glm() is the correct
function to use. My problem is as follows: I used the Oxford Instability
Score of the shoulder (OIS, the dependent variable) as indicative of the
outcome (from 12-20, the best, to 41-60, the worst; 5 possible results). I
have many independent variables, both categorical and numerical, and I want
to see their prognostic impact on the outcome.

Is there a function I can use to model my problem? I have heard about
multinomial logistic regression, but I was not able to find anything
related to it.

Any help would be much appreciated.

Marlene.



Re: [R] Logistic regression and R

2009-07-31 Thread ONKELINX, Thierry
Dear Marlene,

Have a look at the polr() function in the MASS package.
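
A minimal sketch of what a polr() fit looks like, using the `housing` data that ships with MASS as a stand-in (an assumption; it is not the OIS data). For the shoulder score, the response would need to be an ordered factor with the five outcome categories.

```r
library(MASS)

# Proportional-odds (ordinal logistic) model; Sat is an ordered factor
# with levels Low < Medium < High, and Freq holds the cell counts
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq,
            data = housing, Hess = TRUE)

summary(fit)                          # coefficients and cut-points (intercepts)
probs <- predict(fit, type = "probs") # predicted probability of each category
head(probs)
```

The predictors here are all categorical, but numeric predictors go into the same formula in the same way.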

HTH,

Thierry
 



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology 
and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey



[R] Logistic regression and R

2009-07-30 Thread Carlos López

Hello everybody :-)

I have some data that I want to model with a logistic regression. Most 
of the independent variables are numeric, and the only dependent variable 
is categorical. I was thinking that I could fit a logistic regression 
using glm, but I wanted to deepen my knowledge first, so I did some 
reading and found the iris dataset. Now I would like to ask two things. 
First, do you know of any references on logistic regression in R that 
would help me understand and interpret the output better? Second, what 
can I do when my independent variables are mixed, some numerical and 
some categorical? Can I still use a logistic regression?


Thank you very much!!! :-D



Re: [R] Logistic regression and R

2009-07-30 Thread G. Jay Kerns
Dear Carlos,

On Thu, Jul 30, 2009 at 6:11 PM, Carlos López nato...@fisica.unam.mx wrote:
> Hello everybody :-)
>
> I have some data that I want to model with a logistic regression, most of
> the independent variables are numeric and the only dependent is categorical,
> I was thinking that I could apply a logistic regression using glm but I
> wanted to deepen my knowledge of this so I tried to do some reading and
> found the iris dataset, now I would like to ask two things, first if you
> know of any bibliography to read more about the logistic regression and R so
> I could understand and interpret better the output,


See the following

https://home.comcast.net/~lthompson221/

and the following specific link on that page:

https://home.comcast.net/~lthompson221/Splusdiscrete2.pdf

which is a manual to accompany Agresti's _Categorical Data Analysis_.
In particular, you may want to check out Chapter 5 (and also some of
4).


> and second, what could I
> do when I have some independent variables that are not only numerical but
> categorical too, i.e. mixed (categorical and numerical), can I still use a
> logistic regression?

Easy peasy, lemon squeezy.  See page 78.
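
As a small sketch of the mixed case (not taken from the manual cited above): glm() accepts numeric and factor predictors in the same formula and builds the dummy columns for the factor itself. The built-in `infert` data and the formula are assumptions used only for illustration.

```r
# age and spontaneous are numeric, education is a factor with three levels
str(infert[, c("case", "age", "education", "spontaneous")])

fit <- glm(case ~ age + education + spontaneous,
           data = infert, family = binomial())
summary(fit)

# The factor enters as one indicator column per non-reference level
colnames(model.matrix(fit))
```

Each education coefficient is a log odds ratio relative to the reference level, while the coefficient for a numeric predictor is the change in log odds per unit.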

Hope this helps,
Jay

***
G. Jay Kerns, Ph.D.
Associate Professor
Department of Mathematics & Statistics
Youngstown State University
Youngstown, OH 44555-0002 USA
Office: 1035 Cushwa Hall
Phone: (330) 941-3310 Office (voice mail)
-3302 Department
-3170 FAX
VoIP: gjke...@ekiga.net
E-mail: gke...@ysu.edu
http://www.cc.ysu.edu/~gjkerns/
