The problem appears to be that your original data has several tied values:

> table(x)
x
1.8   2 2.2 2.4 2.6 2.8   3 3.2 3.4 3.6   4
  1   2   2   2   5   7   2   3   1   2   1

IIRC the maths and programming behind loess assume unique values for the predictor. One way to get around this is to jitter your data. Note that the new data passed to predict() must use the same variable name as the model formula (x2 here):

> x2 <- jitter(x)
> modj <- loess(y ~ x2, span=.5, degree=1)
> predict(modj, data.frame(x2=X))
 [1] 3.156192 3.141705 3.126918 3.112996 3.101108 3.087696 3.063471 3.038609 3.024639 3.032585 3.059480 3.091774
[13] 3.115763 3.117743 3.092979 3.040798 2.988283 2.957976 2.950648 3.008358 3.070065 3.127379 3.193501 3.149428
[25] 3.082843 3.010998 2.939407 2.888213 2.841487 2.812815 2.801583 2.807181 2.837887 2.899130 2.978165 3.062088
[37] 3.137995 3.204628 3.271813 3.339450 3.407396 3.475510 3.543843 3.612450 3.681267 3.750227 3.819267 3.888321
[49] 3.957324 4.026212

Another way is to summarise your data using table() and aggregate(), and fit a weighted model where the weights are the counts for each unique x-value:

> dtab <- aggregate(data.frame(y=y), by=list(x=x), FUN=mean)
> dtab$x <- as.numeric(as.character(dtab$x))
> dtab$w <- table(x)
> modt <- loess(y ~ x, span=.5, degree=1, weights=w, data=dtab)
> predict(modt, data.frame(x=X))
 [1] 3.186959 3.163133 3.136244 3.110822 3.091396 3.076705 3.047705 3.018362 3.007143 3.032246 3.069599 3.092369
[13] 3.098049 3.084134 3.053633 3.027429 3.012429 3.013908 3.036517 3.060372 3.076116 3.086870 3.095758 3.097287
[25] 3.073824 3.031238 2.976659 2.917402 2.863489 2.821469 2.796398 2.793336 2.823850 2.892363 2.980322 3.068725
[37] 3.140843 3.208920 3.279124 3.351965 3.427952 3.504330 3.577149 3.647119 3.714984 3.781486 3.847369 3.913375
[49] 3.980249 4.048733

There's probably a way to make the aggregate and table calls neater.
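On that last point, one possibly neater way to build the summary table is sketched below. It is only an illustration: the x and y vectors are made-up stand-ins (x reuses the tied values and counts from the table above, y is simulated), and X is assumed to be a grid of 50 points over the range of x. The idea is to build one row per unique x with tapply() and table(), which both order their results by the sorted unique values of x.

```r
# Hypothetical stand-in data: same tied x values as in table(x) above,
# but y is simulated and is NOT the original poster's data.
set.seed(1)
x <- rep(c(1.8, 2, 2.2, 2.4, 2.6, 2.8, 3, 3.2, 3.4, 3.6, 4),
         times = c(1, 2, 2, 2, 5, 7, 2, 3, 1, 2, 1))
y <- 3 + 0.3 * (x - 3)^2 + rnorm(length(x), sd = 0.2)

# One row per unique x: mean response and number of tied observations.
# tapply() and table() both group by the sorted unique values of x,
# so the three columns line up.
dtab <- data.frame(
  x = sort(unique(x)),
  y = as.vector(tapply(y, x, mean)),
  w = as.vector(table(x))
)

# Weighted fit on the de-duplicated data, as in the aggregate() version.
modt <- loess(y ~ x, data = dtab, weights = w, span = 0.5, degree = 1)

# Assumed prediction grid (the original X is not shown in the thread).
X <- seq(min(x), max(x), length.out = 50)
pred <- predict(modt, newdata = data.frame(x = X))
```

This keeps x numeric throughout, so the as.numeric(as.character(...)) conversion is not needed, and stores the counts as a plain integer vector rather than a table object.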
-- 
Hong Ooi
Senior Research Analyst, IAG Limited
388 George St, Sydney NSW 2000
+61 (2) 9292 1566

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Leo Gürtler
Sent: Wednesday, 7 December 2005 8:10 AM
To: r-help@stat.math.ethz.ch
Cc: [EMAIL PROTECTED]
Subject: Re: [R] strange behavior of loess() & predict()

Dear list,

I am very sorry for being inaccurate in my question. But re-reading the predict.loess help page does not provide a solution. As long as predict is used on a new dataset based on this dataset, the strange values remain and can be reproduced. Adding a new element to both vectors (at the beginning, e.g. "1" for each vector) results in plausible values - but not in every case. Even switching x and y is sufficient (i.e. x as predictor and y as dependent variable).

So my question is: Is it normal - or under which conditions does it take place - that predict.loess predicts values that are almost 20000/max(y) ~ 5000 times higher than expected?

best,

leo gürtler

Gavin Simpson wrote:

>On Tue, 2005-12-06 at 18:09 +0100, Leo Gürtler wrote:
>
>>Dear altogether,
>>
><snip>
>
>># here is the difference!!
>>predict(mod, data.frame(x=X), se=TRUE)
>>predict(mod, x=X, se=TRUE)
>>
>><--- end of snip --->
>>
>>I assume this has some reason but I do not understand this reason.
>>Merci,
>
>Not sure if this is the reason, but there is no argument x in
>predict.loess, and:
>
>a <- predict(mod, se = TRUE)
>
>gives you the same results as:
>
>b <- predict(mod, x=X, se=TRUE)
>
>so the x argument appears to be being passed on/in the ... arguments and
>ignored? As such, you have no newdata, so mod$x is used.
>
>Now, when you do:
>
>c <- predict(mod, data.frame(x=X), se=TRUE)
>
>You have used an un-named argument in position 2. R takes this to be
>what you want to use for newdata and so works with this data rather than
>the one in mod$x as in the first case:
>
># now named second argument - gets ignored as in a and b
>d <- predict(mod, x = data.frame(x=X), se=TRUE)
>
>all.equal(a, b) # TRUE
>all.equal(a, c) # FALSE
>all.equal(a, d) # TRUE
>
># this time we assign X to x by using (), the result is used as newdata
>e <- predict(mod, (x=X), se=TRUE)
>
>all.equal(c, e) # TRUE
>
>If in doubt, name your arguments and check the help! ?predict.loess
>would have quickly shown you where the problem lay.
>
>HTH
>
>G
>
>>best regards
>>
>>leo gürtler
>>
>>______________________________________________
>>R-help@stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- 
email: [EMAIL PROTECTED]
www: http://www.anicca-vijja.de/
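
For anyone who wants to reproduce the argument-matching behaviour described in the quoted reply, here is a minimal sketch. The data are made up for illustration (they are not the original poster's x and y), and the object names p_fitted, p_ignored and p_newdata are hypothetical. It relies only on the fact, stated above, that predict.loess has no argument called x, so a named x= falls into the ... arguments.

```r
# Hypothetical data, for illustration only.
set.seed(42)
x <- seq(1, 10, length.out = 40)
y <- sin(x) + rnorm(40, sd = 0.1)
mod <- loess(y ~ x, span = 0.5, degree = 1)

# New predictor values we would like predictions for.
X <- seq(2, 9, length.out = 15)

# No newdata at all: returns the fitted values at the original x.
p_fitted <- predict(mod)

# 'x' is not an argument of predict.loess, so it is passed via '...'
# and ignored; the result is again the fitted values.
p_ignored <- predict(mod, x = X)

# Naming the argument 'newdata' gives predictions at X.
p_newdata <- predict(mod, newdata = data.frame(x = X))

all.equal(p_fitted, p_ignored)  # same values in both cases
length(p_newdata)               # one prediction per element of X
```

Spelling out newdata= (and giving the data frame a column named like the predictor in the formula) avoids relying on argument position.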