Ahh, sorry. I didn't understand that x was supposed to be an index, so I was using the row number as an index for the summation. Yes, my proposal probably won't work without further assumptions. [I.e., you could assume linear growth between observations, but that will bias the estimates in some direction (not sure which).]
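One thing that might be worth trying, offered as a rough sketch rather than a vetted answer: if x is a level index running over 1, 2, 3, ..., then s at level n is b1 * sum(k) + b2 * sum(k^2) + b3 * sum(k^3) with k from 1 to n, which is still linear in b1, b2 and b3. The cumulative sums of k, k^2 and k^3 could then serve as regressors for s directly, without reconstructing y. The column names S1, S2 and S3 below are made up for illustration, and the consecutive-integer levels are an assumption taken from the sample df, not something confirmed for the real data.

```r
# Known cumulative totals at a few levels (sample data from the thread).
d <- data.frame(x = c(1, 4, 9, 12), s = c(109, 1200, 5325, 8216))

# Assumption: levels are the consecutive integers 1, 2, ..., max(x).
k <- seq_len(max(d$x))

# Cumulative sums of k, k^2 and k^3, picked out at the observed levels.
# S1, S2, S3 are hypothetical names introduced for this sketch.
d$S1 <- cumsum(k)[d$x]
d$S2 <- cumsum(k^2)[d$x]
d$S3 <- cumsum(k^3)[d$x]

# s is linear in b1, b2, b3, so an ordinary no-intercept lm() applies.
fit <- lm(s ~ 0 + S1 + S2 + S3, data = d)
coef(fit)
```

For the sample d this should return coefficients of about 100, 10 and -1, matching the lm() fit on the full df. With only four observations and three parameters there is a single residual degree of freedom, so standard errors on real data would deserve little trust.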
I'll ponder it some more and get back to you if I come up with anything.

Michael

On Tue, May 22, 2012 at 12:43 PM, Robbie Edwards <robbie.edwa...@gmail.com> wrote:
> I don't think I can.
>
> For the sample data
>
> d <- data.frame(x=c(1, 4, 9, 12), s=c(109, 1200, 5325, 8216))
>
> when x = 4, s = 1200.  However, that s4 is sum of y1 + y2 + y3 + y4.
> Wouldn't I have to know the y for x = 2 and x = 3 to get the value of y
> for x = 4?
>
> In the previous message, I created two sample data frames.  d is what I'm
> trying to use to create df.  I only know what's in d, df is just used to
> illustrate what I'm trying to get from d.
>
> robbie
>
> On Tue, May 22, 2012 at 12:30 PM, R. Michael Weylandt <michael.weyla...@gmail.com> wrote:
>
>> But if I understand your problem correctly, you can get the y values
>> from the s values. I'm relying on your statement that "s is sum of the
>> current y and all previous y (s3 = y1 + y2 + y3)." E.g.,
>>
>> y <- c(1, 4, 6, 9, 3, 7)
>>
>> s1 = 1
>> s2 = 4 + s1 = 5
>> s3 = 6 + s2 = 11
>>
>> more generally
>>
>> s <- cumsum(y)
>>
>> Then if we only see s, we can get back the y vector by doing
>>
>> c(s[1], diff(s))
>>
>> which is identical to y.
>>
>> So for your data, the underlying y must have been c(109, 1091, 4125,
>> 2891) right?
>>
>> Or have I completely misunderstood your problem?
>>
>> Michael
>>
>> On Tue, May 22, 2012 at 12:25 PM, Robbie Edwards <robbie.edwa...@gmail.com> wrote:
>> > Actually, I can't.  I don't know the y values.  Only the s and only for a
>> > subset of the data.
>> >
>> > Like this.
>> >
>> > d <- data.frame(x=c(1, 4, 9, 12), s=c(109, 1200, 5325, 8216))
>> >
>> > On Tue, May 22, 2012 at 11:57 AM, R. Michael Weylandt <michael.weyla...@gmail.com> wrote:
>> >>
>> >> You can reconstruct the y values by taking first-differences of the s
>> >> vector, no? Then it sounds like you're good to go
>> >>
>> >> Best,
>> >> Michael
>> >>
>> >> On Tue, May 22, 2012 at 11:40 AM, Robbie Edwards <robbie.edwa...@gmail.com> wrote:
>> >> > Hi all,
>> >> >
>> >> > Thanks for the replies, but I realize I've done a bad job explaining my
>> >> > problem.  To help, I've created some sample data to explain the problem.
>> >> >
>> >> > df <- data.frame(x=c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
>> >> >                  y=c(109, 232, 363, 496, 625, 744, 847, 928, 981, 1000, 979, 912),
>> >> >                  s=c(109, 341, 704, 1200, 1825, 2569, 3416, 4344, 5325, 6325, 7304, 8216))
>> >> >
>> >> > In this data frame, y results from y = x * b1 + x^2 * b2 + x^3 * b3 and s
>> >> > is sum of the current y and all previous y (s3 = y1 + y2 + y3).
>> >> >
>> >> > I know I can find b1, b2 and b3 using:
>> >> > lm(y ~ 0 + x + I(x^2) + I(x^3), data=df)
>> >> >
>> >> > yielding...
>> >> > Coefficients:
>> >> >      x  I(x^2)  I(x^3)
>> >> >    100      10      -1
>> >> >
>> >> > However, I need to find b1, b2 and b3 using the s column.  The reason
>> >> > being, I don't actually know the values of y in the actual data set.  And
>> >> > in the actual data, I only have a few of the values.  Imagine this data is
>> >> > being used a reward schedule for like a loyalty points program.  y
>> >> > represents the number of points needed for each level while s is the total
>> >> > number of points to reach that level.  In the real problem, my data looks
>> >> > more like this:
>> >> >
>> >> > d <- data.frame(x=c(1, 4, 9, 12), s=c(109, 1200, 5325, 8216))
>> >> >
>> >> > Where I need to use a few sample points to help define the parameters of
>> >> > the curve.
>> >> >
>> >> > thanks again and hopefully this makes the problem a bit clearer.
>> >> >
>> >> > robbie
>> >> >
>> >> > On Fri, May 18, 2012 at 7:40 PM, David Winsemius <dwinsem...@comcast.net> wrote:
>> >> >
>> >> >> On May 18, 2012, at 1:44 PM, Robbie Edwards wrote:
>> >> >>
>> >> >>> Hi all,
>> >> >>>
>> >> >>> I'm trying to model some data where the y is defined by
>> >> >>>
>> >> >>> y = summation[1 to 50] B1 * x + B2 * x^2 + B3 * x^3
>> >> >>>
>> >> >>> Hopefully that reads clearly for email.
>> >> >>
>> >> >> cumsum( rowSums( cbind(B1 * x, B2 * x^2, B3 * x^3)))
>> >> >>
>> >> >>> Anyway, if it wasn't for the summation, I know I would do it like this
>> >> >>>
>> >> >>> lm(y ~ x + x2 + x3)
>> >> >>>
>> >> >>> Where x2 and x3 are x^2 and x^3.
>> >> >>>
>> >> >>> However, since each value of x is related to the previous values of x, I
>> >> >>> don't know how to do this.  Any help is greatly appreciated.
>> >> >>
>> >> >> David Winsemius, MD
>> >> >> West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.