Re: [R] selecting rows for inclusion in lm

2007-01-18 Thread David Barron
Why not use the subset option?  Something like:

lm(diff ~ Age + Race, data=data, subset=data$Meno=="PRE")

should do the trick, and be much easier to read!
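
A minimal sketch of the suggestion above on made-up data. The column names mirror the thread, but the data frame name dat, the values, and the group sizes are assumptions chosen only for illustration:

```r
# Toy data: 15 "PRE" rows followed by 25 "POST" rows (values are invented).
set.seed(1)
dat <- data.frame(
  diff = rnorm(40),
  Age  = rnorm(40, mean = 50, sd = 5),
  Race = rep(c("A", "B"), length.out = 40),
  Meno = rep(c("PRE", "POST"), times = c(15, 25))
)

# Fit on the PRE rows only; the value being compared must be a quoted string.
fit_pre <- lm(diff ~ Age + Race, data = dat, subset = dat$Meno == "PRE")

nobs(fit_pre)         # number of rows actually used in the fit
df.residual(fit_pre)  # rows used minus the number of estimated coefficients
```

Comparing nobs() with the expected group size is a quick way to confirm that the restriction took effect.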

On 18/01/07, John Sorkin [EMAIL PROTECTED] wrote:
 I am having trouble selecting rows of a dataframe that will be included
 in a regression. I am trying to select those rows for which the variable
 Meno equals PRE. I have used the code below:

 difffitPre<-lm(data[,"diff"]~data[,"Age"]+data[,"Race"],data=data[data[,"Meno"]=="PRE",])
 summary(difffitPre)

 The output from the summary indicates that more than 76 rows are
 included in the regression:

 Residual standard error: 2.828 on 76 degrees of freedom

 where in fact only 22 rows should be included as can be seen from the
 following:

 print(length(data[data[,"Meno"]=="PRE","Meno"]))
 [1] 22

 I would appreciate any help in modifying the data= parameter of the lm
 so that I include only those subjects for which Meno=PRE.

 R 2.3.1
 Windows XP

 Thanks,
 John

 John Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 Baltimore VA Medical Center GRECC,
 University of Maryland School of Medicine Claude D. Pepper OAIC,
 University of Maryland Clinical Nutrition Research Unit, and
 Baltimore VA Center Stroke of Excellence

 University of Maryland School of Medicine
 Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524

 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing)
 [EMAIL PROTECTED]

 Confidentiality Statement:
 This email message, including any attachments, is for the so...{{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP



Re: [R] selecting rows for inclusion in lm

2007-01-18 Thread Vladimir Eremeev
 Why not use the subset option?  Something like:
 
 lm(diff ~ Age + Race, data=data, subset=data$Meno=="PRE")

 should do the trick, and be much easier to read!

The data$ prefix can be omitted; simply use
lm(diff ~ Age + Race, data=data, subset=Meno=="PRE")
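
The prefix can be dropped because lm evaluates subset the same way it evaluates the formula variables: first in data, then in the environment of the formula. A small check on made-up data (the name dat and all values are assumptions for illustration) that both spellings give the same fit:

```r
# Toy data with the thread's column names; values are invented.
set.seed(2)
dat <- data.frame(
  diff = rnorm(40),
  Age  = rnorm(40, mean = 50, sd = 5),
  Race = rep(c("A", "B"), length.out = 40),
  Meno = rep(c("PRE", "POST"), times = c(15, 25))
)

# subset written with and without the data frame prefix
fit_prefixed <- lm(diff ~ Age + Race, data = dat, subset = dat$Meno == "PRE")
fit_bare     <- lm(diff ~ Age + Race, data = dat, subset = Meno == "PRE")

# Both calls select the same rows, so the coefficients agree.
all.equal(coef(fit_prefixed), coef(fit_bare))
```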



Re: [R] selecting rows for inclusion in lm

2007-01-18 Thread Michael Dewey
At 08:19 18/01/2007, David Barron wrote:
Why not use the subset option?  Something like:

lm(diff ~ Age + Race, data=data, subset=data$Meno=="PRE")

should do the trick, and be much easier to read!

And indeed the advice in
  library(fortunes)
  fortune("dog")

Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'?
Anyway, it might clash with the function 'matrix'.
-- Barry Rowlingson
   R-help (October 2004)

 
also helps to make life clearer I find
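
The same advice applies to a data frame called data, since data is also the name of a function in the utils package. A small illustration, assuming a fresh session in which no object named data has been created; meno_study is a hypothetical name used only for contrast:

```r
# In a fresh session, `data` already refers to the function utils::data().
is.function(data)

# Indexing it, as one would a data frame that was never created, fails with
# an error about the function object rather than about a missing data frame.
msg_data <- tryCatch(data[, "Age"], error = function(e) conditionMessage(e))
msg_data

# A distinctive (hypothetical) name fails with an error that names the
# missing object, which is easier to diagnose.
msg_named <- tryCatch(meno_study[, "Age"], error = function(e) conditionMessage(e))
msg_named
```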




Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] selecting rows for inclusion in lm

2007-01-18 Thread John Sorkin
I must express thanks to Peter Konings, Gary Collins, David Barron,
Prof. Brian Ripley, Vladimir Eremeev, and Michael Dewey (I hope I did
not leave anyone out), all of whom suggested I use the subset parameter
of lm to restrict the subjects included in my lm. R is a special
programming language and statistics package, both because of the
wonderful features of R (thank you R developers) and, equally
importantly, because of the community of people who willingly give of
their time and knowledge to help other users. Many thanks.
If any R developers are out there, may I suggest that the help page for
lm include more information (perhaps an example) on how one uses the
subset option. The current documentation states:
 

subset    an optional vector specifying a subset of observations to be used
          in the fitting process.
 
Although I read the help page, I could not get subset to work until the
kind people mentioned above sent me examples.
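
As a sketch of what that terse description allows: subset can be given as a logical condition or as a vector of row indices, and subsetting the data frame before the call is an equivalent alternative. The data below are made up and the name dat is an assumption for illustration:

```r
# Toy data with the thread's column names; values are invented.
set.seed(4)
dat <- data.frame(
  diff = rnorm(40),
  Age  = rnorm(40, mean = 50, sd = 5),
  Race = rep(c("A", "B"), length.out = 40),
  Meno = rep(c("PRE", "POST"), times = c(15, 25))
)

# 1. Logical condition, evaluated inside `dat`
fit_logical <- lm(diff ~ Age + Race, data = dat, subset = Meno == "PRE")

# 2. Integer row indices
fit_index <- lm(diff ~ Age + Race, data = dat,
                subset = which(dat$Meno == "PRE"))

# 3. Subset the data frame first and keep bare column names in the formula
fit_presub <- lm(diff ~ Age + Race, data = dat[dat$Meno == "PRE", ])

# All three use the same rows and give the same coefficients.
all.equal(coef(fit_logical), coef(fit_index))
all.equal(coef(fit_logical), coef(fit_presub))
```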
 
Again, many thanks to one and all!
 
John
 
 
 

 Prof Brian Ripley [EMAIL PROTECTED] 1/18/2007 3:38 AM 
On Thu, 18 Jan 2007, David Barron wrote:

 Why not use the subset option?  Something like:

 lm(diff ~ Age + Race, data=data, subset=data$Meno=="PRE")

 should do the trick, and be much easier to read!

And

lm(diff ~ Age + Race, data = data, subset = (Meno=="PRE"))

would be easier still.


 On 18/01/07, John Sorkin [EMAIL PROTECTED] wrote:
 I am having trouble selecting rows of a dataframe that will be included
 in a regression. I am trying to select those rows for which the variable
 Meno equals PRE. I have used the code below:

 difffitPre<-lm(data[,"diff"]~data[,"Age"]+data[,"Race"],data=data[data[,"Meno"]=="PRE",])

You are missing a comma in data = data[..., ]
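
A further point that may explain the 76 residual degrees of freedom: when every term in the formula indexes the full data frame directly, none of the variables is taken from the data= argument, so the row selection passed there has no effect. The sketch below reproduces this on made-up data; the names dat and pre and all values are assumptions for illustration:

```r
# Toy data with the thread's column names; values are invented.
set.seed(3)
dat <- data.frame(
  diff = rnorm(40),
  Age  = rnorm(40, mean = 50, sd = 5),
  Race = rep(c("A", "B"), length.out = 40),
  Meno = rep(c("PRE", "POST"), times = c(15, 25))
)
pre <- dat[dat[, "Meno"] == "PRE", ]

# Terms such as dat[, "diff"] are looked up in the calling environment and
# return all 40 values, so the restricted data frame in data= is never used.
fit_all <- lm(dat[, "diff"] ~ dat[, "Age"] + dat[, "Race"], data = pre)

# Bare column names are looked up in data=, so only the PRE rows are fitted.
fit_pre <- lm(diff ~ Age + Race, data = pre)

nobs(fit_all)  # rows used when the formula indexes `dat` directly
nobs(fit_pre)  # rows used when the formula uses bare column names
```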

-- 
Brian D. Ripley,  [EMAIL PROTECTED] 
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/ 
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[R] selecting rows for inclusion in lm

2007-01-17 Thread John Sorkin
I am having trouble selecting rows of a dataframe that will be included
in a regression. I am trying to select those rows for which the variable
Meno equals PRE. I have used the code below:

difffitPre<-lm(data[,"diff"]~data[,"Age"]+data[,"Race"],data=data[data[,"Meno"]=="PRE",])
summary(difffitPre)

The output from the summary indicates that more than 76 rows are
included in the regression:

Residual standard error: 2.828 on 76 degrees of freedom

where in fact only 22 rows should be included as can be seen from the
following:

print(length(data[data[,"Meno"]=="PRE","Meno"]))
[1] 22

I would appreciate any help in modifying the data= parameter of the lm
so that I include only those subjects for which Meno=PRE.

R 2.3.1
Windows XP

Thanks,
John

