[R] Logistic Regression in R (SAS -like output)
Hello useRs,

I have a problem at hand which I'd think is fairly common amongst groups where R is being adopted for Analytics in place of SAS. Users would like to obtain results for logistic regression in R that they have become accustomed to in SAS.

Towards this end, I was able to propose the Design package in R, which contains many functions to extract the various metrics that SAS reports.

If you have suggestions pertaining to other packages, or sample code that replicates some of the SAS outputs for logistic regression, I would be glad to hear of them.

Some of the requirements are:
- Stepwise variable selection for logistic regression
- Choose base level for factor variables
- The Hosmer-Lemeshow statistic
- concordant and discordant
- Tau C statistic

Thank you for your suggestions.

Regards,
Harsh Singhal

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Logistic Regression in R (SAS -like output)
Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University

On Mon, 9 Aug 2010, Harsh wrote:

> Hello useRs, I have a problem at hand which I'd think is fairly common amongst groups where R is being adopted for Analytics in place of SAS. Users would like to obtain results for logistic regression in R that they have become accustomed to in SAS. Towards this end, I was able to propose the Design package in R which contains many functions to extract the various metrics that SAS reports.

The replacement for Design, rms, has some new indexes.

> If you have suggestions pertaining to other packages, or sample code that replicates some of the SAS outputs for logistic regression, I would be glad to hear of them. Some of the requirements are:
> - Stepwise variable selection for logistic regression

an invalid procedure

> - Choose base level for factor variables

not relevant - get what you need from predicted values and differences in predicted values or contrasts - this automatically takes care of reference cells

> - The Hosmer-Lemeshow statistic

obsolete: low power and sensitive to choice of binning

> - concordant and discordant

see Hmisc's rcorr.cens

> - Tau C statistic

Those are two different statistics. tau and C are obtained by lrm in rms/Design.

Frank

> Thank you for your suggestions. Regards, Harsh Singhal
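To make the concordant/discordant, C and tau quantities mentioned above concrete, here is a minimal base-R sketch of how they can be computed from the predicted probabilities of a fitted logistic model. It is only an illustration under stated assumptions: the built-in `infert` data and the chosen predictors are stand-ins, not anything from the original question, and the formulas follow the usual pairwise definitions (every event/non-event pair is compared). In practice `lrm` in rms and `rcorr.cens` in Hmisc report these indexes directly.

```r
# Illustrative model on a built-in dataset (an assumption, not the poster's data)
fit <- glm(case ~ age + parity + spontaneous, family = binomial, data = infert)

p <- fitted(fit)      # predicted probabilities
y <- infert$case      # observed 0/1 response

# Difference in predicted probability for every (event, non-event) pair
d <- outer(p[y == 1], p[y == 0], "-")

n_pairs <- length(d)
conc <- sum(d > 0)    # concordant: the event has the higher prediction
disc <- sum(d < 0)    # discordant: the non-event has the higher prediction
ties <- sum(d == 0)   # tied predictions

c_index  <- (conc + 0.5 * ties) / n_pairs        # C (area under the ROC curve)
somers_d <- (conc - disc) / n_pairs              # Somers' Dxy
n        <- length(y)
tau_a    <- (conc - disc) / (n * (n - 1) / 2)    # Kendall's tau-a

round(c(pct_concordant = 100 * conc / n_pairs,
        pct_discordant = 100 * disc / n_pairs,
        pct_tied       = 100 * ties / n_pairs,
        C = c_index, Dxy = somers_d, tau_a = tau_a), 3)
```

The pairwise `outer()` approach needs memory proportional to (events x non-events), so it is suitable for small to moderate data only.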
Re: [R] Logistic Regression in R (SAS -like output)
On Mon, Aug 9, 2010 at 6:43 AM, Harsh <singhal...@gmail.com> wrote:
> [...]
> Some of the requirements are:
> - Stepwise variable selection for logistic regression
> - Choose base level for factor variables
> - The Hosmer-Lemeshow statistic
> - concordant and discordant
> - Tau C statistic

For stepwise logistic regression using AIC see:

    library(MASS)
    ?stepAIC

For specifying reference level:

    ?relevel
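A minimal sketch of the two pointers above, `relevel()` and `stepAIC()`, might look like the following. The `birthwt` data from MASS and the chosen predictors are assumptions made purely for illustration, and the objections to stepwise selection raised elsewhere in this thread apply to the second half.

```r
library(MASS)  # provides stepAIC() and the birthwt example data

# Recode race as a labelled factor (1 = white, 2 = black, 3 = other in ?birthwt)
bwt <- transform(birthwt,
                 race = factor(race, labels = c("white", "black", "other")))

# Choose the base (reference) level of the factor
bwt$race <- relevel(bwt$race, ref = "other")
levels(bwt$race)   # "other" is now the first, i.e. reference, level

# Full logistic model, then AIC-based stepwise search starting from it
full <- glm(low ~ age + lwt + race + smoke + ht + ui,
            family = binomial, data = bwt)
sel  <- stepAIC(full, direction = "both", trace = FALSE)

formula(sel)       # terms retained by the search
sel$anova          # record of the steps taken
```

Coefficients for `race` in either model are then contrasts against "other" rather than against "white".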
Re: [R] Logistic Regression in R (SAS -like output)
Note that stepwise variable selection based on AIC has all the problems of stepwise variable selection based on P-values. AIC is just a restatement of the P-value.

Frank

Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University

On Mon, 9 Aug 2010, Gabor Grothendieck wrote:
> [...]
> For stepwise logistic regression using AIC see: library(MASS) ?stepAIC
> For specifying reference level: ?relevel
Re: [R] Logistic Regression in R (SAS -like output)
In the trivial case where all candidate predictors have one degree of freedom (which is unlikely, as some things will be nonlinear or have more than 2 categories), adding a variable if it increases AIC is the same as adding it if its chi-square exceeds 2. This corresponds to an alpha level of 0.157 for a chi-square with 1 d.f. At least AIC leads people to use a more realistic alpha (small alpha in stepwise regression leads to more bias in the retained regression coefficients). But you still have serious multiplicity problems and non-replicable models.

Things are different if you have a pre-defined group of variables you are thinking of adding. Suppose that this group of 10 variables required 15 d.f. Adding the group if AIC (based on 15 d.f.) increases wouldn't be a bad strategy. This avoids the multiplicities of single-variable looks.

Frank

Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University

On Mon, 9 Aug 2010, Kingsford Jones wrote:
> On Mon, Aug 9, 2010 at 10:27 AM, Frank Harrell <f.harr...@vanderbilt.edu> wrote:
>> Note that stepwise variable selection based on AIC has all the problems of stepwise variable selection based on P-values. AIC is just a restatement of the P-value.
>
> I find the above statement very interesting, particularly because there are common misconceptions in the ecological community that AIC is a panacea for model selection problems and the theory behind P-values is deeply flawed. Can you direct me toward a reference for better understanding the relation?
>
> best,
> Kingsford Jones
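The arithmetic behind the 0.157 figure can be checked with a short sketch. One assumption to flag: the message speaks of AIC "increasing", which appears to refer to AIC on the chi-square scale (likelihood-ratio chi-square minus 2 x d.f., larger is better); with R's conventional `AIC()` (smaller is better) the same rule reads "add the term if AIC drops". The `infert` data and the two nested models are illustrative choices only.

```r
# Implied alpha of the "chi-square exceeds 2 * d.f." rule
alpha_1df  <- pchisq(2,      df = 1,  lower.tail = FALSE)  # about 0.157
alpha_15df <- pchisq(2 * 15, df = 15, lower.tail = FALSE)  # 15 d.f. group
c(alpha_1df = alpha_1df, alpha_15df = alpha_15df)

# Check on two nested logistic models differing by one 1-d.f. term
small <- glm(case ~ spontaneous,           family = binomial, data = infert)
big   <- glm(case ~ spontaneous + induced, family = binomial, data = infert)

lr_chisq <- small$deviance - big$deviance   # likelihood-ratio chi-square, 1 d.f.
aic_drop <- AIC(small) - AIC(big)           # positive if the larger model is preferred

# The drop in conventional AIC equals LR chi-square minus 2 * d.f.
all.equal(aic_drop, lr_chisq - 2 * 1)
```

So, for a single 1-d.f. candidate, "AIC prefers the larger model" and "the likelihood-ratio chi-square exceeds 2" are the same decision.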
Re: [R] Logistic regression and R
Hi there,

I'd like to take advantage of this thread to ask another question related to logistic regression. This is my first time as well. I have data that I want to model, but I'm not sure if glm() is the correct function to use.

My problem is as follows. I use the Oxford Instability Score of the shoulder (OIS, the dependent variable) as an indicator of the outcome (from 12-20, best, to 41-60, worst; 5 possible results). I want to assess the prognostic impact of many independent variables, both categorical and numerical, on this outcome. Is there a function I can use to model my problem? I have heard about multinomial logistic regression, but I was not able to find anything related to it.

Any help would be much appreciated.

Marlene.

2009/7/31 G. Jay Kerns <gke...@ysu.edu>
> [...]
Re: [R] Logistic regression and R
Dear Marlene,

Have a look at the polr() function in the MASS package.

HTH,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of marlene marchena
Sent: Friday 31 July 2009 12:04
To: r-help
Subject: Re: [R] Logistic regression and R

[...]

The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
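For an ordered outcome with a handful of categories, such as the five OIS bands described in the question, a proportional-odds fit with `polr()` could be sketched as follows. The OIS data are not available here, so the `housing` data shipped with MASS (an ordered three-level satisfaction score) stands in for them; the variable names are therefore those of that example dataset, not of the poster's study.

```r
library(MASS)  # provides polr() and the housing example data

# The response must be an ordered factor
is.ordered(housing$Sat)

# Proportional-odds (ordinal logistic) model with categorical predictors;
# Hess = TRUE keeps the Hessian so that summary() can report standard errors
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq,
            data = housing, Hess = TRUE)

summary(fit)                             # coefficients and cut-points (intercepts)
exp(coef(fit))                           # odds ratios for a higher category
head(predict(fit, type = "probs"))       # fitted category probabilities
```

With one's own data, the response would first be converted with something like `factor(x, ordered = TRUE)` using levels listed from best to worst, and numeric predictors can be mixed freely with factors on the right-hand side.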
[R] Logistic regression and R
Hello everybody :-)

I have some data that I want to model with a logistic regression. Most of the independent variables are numeric, and the only dependent variable is categorical. I was thinking that I could apply a logistic regression using glm, but I wanted to deepen my knowledge of this, so I tried to do some reading and found the iris dataset. Now I would like to ask two things:

First, do you know of any bibliography where I could read more about logistic regression and R, so that I can better understand and interpret the output?

Second, what can I do when some of my independent variables are not numerical but categorical, i.e. mixed (categorical and numerical)? Can I still use a logistic regression?

Thank you very much!!! :-D
Re: [R] Logistic regression and R
Dear Carlos,

On Thu, Jul 30, 2009 at 6:11 PM, Carlos López <nato...@fisica.unam.mx> wrote:
> [...] first if you know of any bibliography to read more about the logistic regression and R so I could understand and interpret better the output,

See the following

https://home.comcast.net/~lthompson221/

and the following specific link on that page:

https://home.comcast.net/~lthompson221/Splusdiscrete2.pdf

which is a manual to accompany Agresti's _Categorical Data Analysis_. In particular, you may want to check out Chapter 5 (and also some of 4).

> and second, what could I do when I have some independent variables that are not only numerical but categorical too, i.e. mixed (categorical and numerical), can I still use a logistic regression?

Easy peasy, lemon squeezy. See page 78.

Hope this helps,
Jay

***
G. Jay Kerns, Ph.D.
Associate Professor
Department of Mathematics & Statistics
Youngstown State University
Youngstown, OH 44555-0002 USA
Office: 1035 Cushwa Hall
Phone: (330) 941-3310 Office (voice mail)
-3302 Department
-3170 FAX
VoIP: gjke...@ekiga.net
E-mail: gke...@ysu.edu
http://www.cc.ysu.edu/~gjkerns/
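As a small illustration of the second answer, a binary logistic regression with mixed numeric and categorical predictors needs nothing beyond `glm()`: categorical predictors are supplied as factors and R builds the indicator columns itself. This is a sketch under assumptions, using the `birthwt` data from MASS rather than the poster's data, with variable choices made only for the example.

```r
library(MASS)  # only for the birthwt example data

# Turn coded columns into labelled factors (coding as documented in ?birthwt)
bwt <- transform(birthwt,
                 race  = factor(race,  labels = c("white", "black", "other")),
                 smoke = factor(smoke, labels = c("no", "yes")))

# Numeric (age, lwt) and categorical (race, smoke) predictors in one model
fit <- glm(low ~ age + lwt + race + smoke, family = binomial, data = bwt)

summary(fit)     # one coefficient per numeric predictor,
                 # one per non-reference level of each factor
exp(coef(fit))   # the same coefficients expressed as odds ratios
```

Each factor coefficient is a log odds ratio relative to that factor's first level ("white" and "no" here).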