On Tue, 2009-08-04 at 18:37 +0100, Federico Calboli wrote:
> On 4 Aug 2009, at 18:27, David Winsemius wrote:
>
> > Your first posting made me think that you were complaining that the
> > fitted values were less than the raw values. Your second posting makes
> > me think that you may be conflating the English word "less" with the
> > English word "fewer". Many native speakers make the same error, but in
> > this context it may be a critical problem for communicating what you
> > are seeing (or not seeing).
> >
> > Perhaps you could be more expansive about what you see and what you
> > expect with explicit attention to the numbers involved? Even better
> > would be a small *reproducible* example.
>
> Problem solved, I realised there are NAs in the data which I had
> completely forgotten about (serves me right for digging up old data to
> add results to a paper). Without any irony or sarcasm, thanks for the
> grammar correction, it might prove useful in the future.
You can fit your model with the argument na.action = na.exclude, which
puts the missing values back, as NA, in the correct places in the fitted
values. E.g.

set.seed(123)
X <- rnorm(100)
Y <- 0.6 + (X * 0.5) + rnorm(100)
## simulate some missings in X
X[sample(length(X), 5)] <- NA
dat <- data.frame(X = X, Y = Y)
mod1 <- lm(Y ~ X, data = dat)
mod2 <- lm(Y ~ X, data = dat, na.action = na.exclude)
length(fitted(mod1))
length(fitted(mod2))
nrow(dat)
fitted(mod2)

> length(fitted(mod1))
[1] 95
> length(fitted(mod2))
[1] 100
> nrow(dat)
[1] 100

HTH

G

> Best,
>
> Federico
>
>
> >
> > --
> > David
> >
> > On Aug 4, 2009, at 12:51 PM, Federico Calboli wrote:
> >
> >> Actually, I tried doing
> >>
> >> data2 = unique(data)
> >> mod = lm(y ~ x1 + ... + xn, data2)
> >> fitted(mod)
> >>
> >> and I still get less fitted values than observations.
> >>
> >> Federico
> >>
> >>
> >> On 4 Aug 2009, at 12:18, Federico Calboli wrote:
> >>
> >>> Hi All,
> >>>
> >>> I have some data where the dependent variable is a score, low (1:3)
> >>> or high (8:9), and the independent variables are 21 genotypic
> >>> markers. I'm fitting a logistic regression on the whole dataset
> >>> after transforming the score to 0/1 and normal linear regression on
> >>> the high and low subsets.
> >>>
> >>> In all cases I have a number of cases of data 'duplications', i.e.
> >>> different individuals with the same score and the same genotype at
> >>> the 21 markers.
> >>>
> >>> When I do:
> >>>
> >>> mod$fitted.values
> >>>
> >>> I get a number of fitted values corresponding to the number of
> >>> unique lines in the dataset. Is there a way to have the fitted
> >>> values match the observations, even though some are duplicated and
> >>> so have the same fitted value? I could do it by hand but it's
> >>> laborious and I'd venture there is a better way.
> >>>
> >>> Best,
> >>>
> >>> Federico
> >>>
> >
> > David Winsemius, MD
> > Heritage Laboratories
> > West Hartford, CT
> >
>
> --
> Federico C. F. Calboli
> Department of Epidemiology and Public Health
> Imperial College, St. Mary's Campus
> Norfolk Place, London W2 1PG
>
> Tel +44 (0)20 75941602 Fax +44 (0)20 75943193
>
> f.calboli [.a.t] imperial.ac.uk
> f.calboli [.a.t] gmail.com
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
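
A short follow-up sketch on the na.exclude suggestion in this message,
in case it helps. It shows one way the padded fitted values and residuals
might be attached back to the data frame, and that glm() accepts the same
argument for the logistic regression mentioned in the original question.
The data are simulated as in the example in this message; the columns
fit, res and Z and the object gmod are made-up names for illustration,
and the 0/1 response is an assumption, not the poster's real score.

```r
## Toy data, simulated the same way as in the example in this message
set.seed(123)
X <- rnorm(100)
Y <- 0.6 + (X * 0.5) + rnorm(100)
X[sample(length(X), 5)] <- NA
dat <- data.frame(X = X, Y = Y)

## na.exclude drops incomplete rows for fitting but remembers where they were
mod <- lm(Y ~ X, data = dat, na.action = na.exclude)

## The extractor functions pad with NA, so the results line up with dat
dat$fit <- fitted(mod)
dat$res <- residuals(mod)

## Rows with a missing X get NA; every other row gets a fitted value
stopifnot(nrow(dat) == 100,
          sum(is.na(dat$fit)) == 5,
          all(is.na(dat$fit) == is.na(dat$X)))

## The same argument works for glm(), e.g. a logistic regression
dat$Z <- as.integer(dat$Y > 0.6)   # hypothetical 0/1 response
gmod <- glm(Z ~ X, data = dat, family = binomial, na.action = na.exclude)
length(fitted(gmod))               # same length as nrow(dat)
```

Note that the padding is done by the extractor functions fitted() and
residuals(); the component mod$fitted.values is stored unpadded and only
covers the rows actually used in the fit.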

