Re: [R] selecting rows for inclusion in lm
Why not use the subset option? Something like:

    lm(diff ~ Age + Race, data=data, subset=data$Meno=="PRE")

should do the trick, and be much easier to read!

On 18/01/07, John Sorkin [EMAIL PROTECTED] wrote:
> I am having trouble selecting rows of a dataframe that will be included
> in a regression. I am trying to select those rows for which the variable
> Meno equals "PRE". I have used the code below:
>
>     difffitPre<-lm(data[,"diff"]~data[,"Age"]+data[,"Race"],data=data[data[,"Meno"]=="PRE",])
>     summary(difffitPre)
>
> The output from the summary indicates that more than 76 rows are
> included in the regression:
>
>     Residual standard error: 2.828 on 76 degrees of freedom
>
> where in fact only 22 rows should be included, as can be seen from the
> following:
>
>     print(length(data[data[,"Meno"]=="PRE","Meno"]))
>     [1] 22
>
> I would appreciate any help in modifying the data= parameter of the lm
> so that I include only those subjects for which Meno="PRE".
>
> R 2.3.1
> Windows XP
>
> Thanks,
> John
>
> John Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> Baltimore VA Medical Center GRECC,
> University of Maryland School of Medicine Claude D. Pepper OAIC,
> University of Maryland Clinical Nutrition Research Unit, and
> Baltimore VA Center Stroke of Excellence
>
> University of Maryland School of Medicine
> Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
>
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
> [EMAIL PROTECTED]

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
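The subset suggestion can be tried on made-up data. What follows is only a sketch: the data frame `dat`, its simulated values, and the object name `fit_pre` are hypothetical, and the row counts (79 in total, 22 with Meno equal to "PRE") are chosen to mirror the numbers in the thread under the assumption that Race has two levels.

```r
# Hypothetical toy data: column names follow the thread, values are simulated.
set.seed(1)
n <- 79
dat <- data.frame(
  diff = rnorm(n),
  Age  = runif(n, 40, 70),
  Race = factor(rep(c("A", "B"), length.out = n)),
  Meno = rep(c("PRE", "POST"), times = c(22, 57))
)

# Fit on the PRE rows only; Meno is looked up inside dat.
fit_pre <- lm(diff ~ Age + Race, data = dat, subset = Meno == "PRE")

nobs(fit_pre)         # rows actually used: 22
df.residual(fit_pre)  # 22 rows minus 3 coefficients: 19
```

Checking `nobs()` or the residual degrees of freedom in this way is a quick test of whether a row restriction took effect.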
Re: [R] selecting rows for inclusion in lm
> Why not use the subset option? Something like:
>
>     lm(diff ~ Age + Race, data=data, subset=data$Meno=="PRE")
>
> should do the trick, and be much easier to read!

data$ could be omitted; simply

    lm(diff ~ Age + Race, data=data, subset=Meno=="PRE")
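That the two spellings give the same fit can be checked directly, since the subset expression is evaluated with the columns of the data= argument in scope. A small sketch on simulated data; `dat`, `fit_long`, `fit_short`, and `same_fit` are hypothetical names, not from the thread.

```r
# Hypothetical toy data: 30 rows, every third one has Meno == "PRE".
set.seed(2)
dat <- data.frame(
  diff = rnorm(30),
  Age  = runif(30, 40, 70),
  Race = factor(rep(c("A", "B"), 15)),
  Meno = rep(c("PRE", "POST", "POST"), 10)
)

# With and without the explicit dat$ prefix in subset.
fit_long  <- lm(diff ~ Age + Race, data = dat, subset = dat$Meno == "PRE")
fit_short <- lm(diff ~ Age + Race, data = dat, subset = Meno == "PRE")

# Compare the estimated coefficients of the two fits.
same_fit <- isTRUE(all.equal(coef(fit_long), coef(fit_short)))
same_fit
```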
Re: [R] selecting rows for inclusion in lm
At 08:19 18/01/2007, David Barron wrote:
> Why not use the subset option? Something like:
>
>     lm(diff ~ Age + Race, data=data, subset=data$Meno=="PRE")
>
> should do the trick, and be much easier to read!

And indeed the advice in

    library(fortunes)
    fortune("dog")

    Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'?
    Anyway, it might clash with the function 'matrix'.
       -- Barry Rowlingson
          R-help (October 2004)

also helps to make life clearer, I find.

Michael Dewey
http://www.aghmed.fsnet.co.uk
Re: [R] selecting rows for inclusion in lm
I must express thanks to Peter Konings, Gary Collins, David Barron, Prof. Brian Ripley, Vladimir Eremeev, and Michael Dewey (I hope I did not leave anyone out), all of whom suggested I use the subset parameter of lm to restrict the subjects included in my lm. R is a special programming language and statistics package, both because of the wonderful features of R (thank you R developers) and, equally importantly, because of the community of people who willingly give of their time and knowledge to help other users.

Many thanks. If any R developers are out there, may I suggest that the help page for lm include more information (perhaps an example) on how one uses the subset option. The current documentation states:

    subset    an optional vector specifying a subset of observations to be
              used in the fitting process.

Although I read the help page, I could not get subset to work until the kind people mentioned above sent me examples.

Again, many thanks to one and all!
John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
[EMAIL PROTECTED]

Prof Brian Ripley [EMAIL PROTECTED] 1/18/2007 3:38 AM

On Thu, 18 Jan 2007, David Barron wrote:
> Why not use the subset option? Something like:
>
>     lm(diff ~ Age + Race, data=data, subset=data$Meno=="PRE")
>
> should do the trick, and be much easier to read!

And

    lm(diff ~ Age + Race, data = data, subset = (Meno=="PRE"))

would be easier still.

> On 18/01/07, John Sorkin [EMAIL PROTECTED] wrote:
>> I am having trouble selecting rows of a dataframe that will be included
>> in a regression. I am trying to select those rows for which the variable
>> Meno equals "PRE". I have used the code below:
>>
>>     difffitPre<-lm(data[,"diff"]~data[,"Age"]+data[,"Race"],data=data[data[,"Meno"]=="PRE",])

You are missing a comma in data = data[..., ]

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
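As an illustration of what the "optional vector" in the help text can look like, the sketch below passes subset first as a logical vector and then as a vector of row indices, and compares both with subsetting the data frame before the call. The data frame `dat` and the `fit_*` names are hypothetical and the values are simulated; this is one reading of the documentation, not text from the help page.

```r
# Hypothetical toy data: first 10 rows PRE, last 10 rows POST.
set.seed(3)
dat <- data.frame(
  diff = rnorm(20),
  Age  = runif(20, 40, 70),
  Meno = rep(c("PRE", "POST"), each = 10)
)

# subset given as a logical vector (one TRUE/FALSE per row of dat).
fit_logical <- lm(diff ~ Age, data = dat, subset = Meno == "PRE")

# subset given as integer row indices.
fit_index <- lm(diff ~ Age, data = dat, subset = which(Meno == "PRE"))

# The same restriction applied to the data frame before fitting.
fit_presub <- lm(diff ~ Age, data = dat[dat$Meno == "PRE", ])

all.equal(coef(fit_logical), coef(fit_index))
all.equal(coef(fit_logical), coef(fit_presub))
```

All three calls select the same ten rows, so the coefficient vectors should agree.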
[R] selecting rows for inclusion in lm
I am having trouble selecting rows of a dataframe that will be included in a regression. I am trying to select those rows for which the variable Meno equals "PRE". I have used the code below:

    difffitPre<-lm(data[,"diff"]~data[,"Age"]+data[,"Race"],data=data[data[,"Meno"]=="PRE",])
    summary(difffitPre)

The output from the summary indicates that more than 76 rows are included in the regression:

    Residual standard error: 2.828 on 76 degrees of freedom

where in fact only 22 rows should be included, as can be seen from the following:

    print(length(data[data[,"Meno"]=="PRE","Meno"]))
    [1] 22

I would appreciate any help in modifying the data= parameter of the lm so that I include only those subjects for which Meno="PRE".

R 2.3.1
Windows XP

Thanks,
John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
[EMAIL PROTECTED]
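One plausible explanation for the 76 degrees of freedom is that terms written as `data[,"diff"]` name the full data frame explicitly, so they are evaluated on all of its rows and the restricted data= argument is never consulted for them. The sketch below reproduces that behaviour on simulated data; the data frame `dat`, the object names `fit_all` and `fit_pre`, and the assumption that Race has two levels (so that 79 rows give 76 residual degrees of freedom) are all hypothetical.

```r
# Hypothetical toy data: 79 rows in total, 22 of them with Meno == "PRE".
set.seed(4)
n <- 79
dat <- data.frame(
  diff = rnorm(n),
  Age  = runif(n, 40, 70),
  Race = factor(rep(c("A", "B"), length.out = n)),
  Meno = rep(c("PRE", "POST"), times = c(22, 57))
)

# The formula names dat itself, so each term is the full 79-row column;
# the subsetted data frame passed as data= has no effect on these terms.
fit_all <- lm(dat[, "diff"] ~ dat[, "Age"] + dat[, "Race"],
              data = dat[dat[, "Meno"] == "PRE", ])

# Bare column names are looked up in data=, so the restriction applies.
fit_pre <- lm(diff ~ Age + Race, data = dat[dat[, "Meno"] == "PRE", ])

df.residual(fit_all)  # 79 rows minus 3 coefficients: 76
df.residual(fit_pre)  # 22 rows minus 3 coefficients: 19
```

Writing the formula with bare column names is what lets either data= or subset= control which rows enter the fit.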