Re: [R] fitting a sinus curve
Dear all, Thanks a lot for the help. It worked very well in the end. Best regards, Marianne -- View this message in context: http://r.789695.n4.nabble.com/fitting-a-sinus-curve-tp3700833p3827044.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fitting a sinus curve
Dear David and Hans- Werner, Thank you very much for your help. I would like to compare now if a polynomial or the sinus model fits better. How can I see R-squared or the F- Statistic for the sinus regression, so as to be able to compare it with the polynomial model? Thanks a lot and have a nice evening. Best, Mairanne -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hans W Borchers Sent: Friday, July 29, 2011 12:21 PM To: r-h...@stat.math.ethz.ch Subject: Re: [R] fitting a sinus curve David Winsemius dwinsemius at comcast.net writes: On Jul 28, 2011, at 1:07 PM, Hans W Borchers wrote: maaariiianne marianne.zeyringer at ec.europa.eu writes: Dear R community! I am new to R and would be very grateful for any kind of help. I am a PhD student and need to fit a model to an electricity load profile of a household (curve with two peaks). I was thinking of looking if a polynomial of 4th order, a sinus/cosinus combination or a combination of 3 parabels fits the data best. I have problems with the sinus/cosinus regression: time - c( 0.00, 0.15, 0.30, 0.45, 1.00, 1.15, 1.30, 1.45, 2.00, 2.15, 2.30, 2.45, 3.00, 3.15, 3.30, 3.45, 4.00, 4.15, 4.30, 4.45, 5.00, 5.15, 5.30, 5.45, 6.00, 6.15, 6.30, 6.45, 7.00, 7.15, 7.30, 7.45, 8.00, 8.15, 8.30, 8.45, 9.00, 9.15, 9.30, 9.45, 10.00, 10.15, 10.30, 10.45, 11.00, 11.15, 11.30, 11.45, 12.00, 12.15, 12.30, 12.45, 13.00, 13.15, 13.30, 13.45, 14.00, 14.15, 14.30, 14.45, 15.00, 15.15, 15.30, 15.45, 16.00, 16.15, 16.30, 16.45, 17.00, 17.15, 17.30, 17.45, 18.00, 18.15, 18.30, 18.45, 19.00, 19.15, 19.30, 19.45, 20.00, 20.15, 20.30, 20.45, 21.00, 21.15, 21.30, 21.45, 22.00, 22.15, 22.30, 22.45, 23.00, 23.15, 23.30, 23.45) watt - c( 94.1, 70.8, 68.2, 65.9, 63.3, 59.5, 55, 50.5, 46.6, 43.9, 42.3, 41.4, 40.8, 40.3, 39.9, 39.5, 39.1, 38.8, 38.5, 38.3, 38.3, 38.5, 39.1, 40.3, 42.4, 45.6, 49.9, 55.3, 61.6, 68.9, 77.1, 86.1, 95.7, 105.8, 115.8, 124.9, 132.3, 137.6, 141.1, 143.3, 144.8, 146, 147.2, 148.4, 149.8, 151.5, 153.5, 156, 159, 162.4, 165.8, 168.4, 169.8, 169.4, 167.6, 164.8, 161.5, 158.1, 154.9, 151.8, 149, 146.5, 144.4, 142.7, 141.5, 140.9, 141.7, 144.9, 151.5, 161.9, 174.6, 187.4, 198.1, 205.2, 209.1, 211.1, 212.2, 213.2, 213, 210.4, 203.9, 192.9, 179, 164.4, 151.5, 141.9, 135.3, 131, 128.2, 126.1, 124.1, 121.6, 118.2, 113.4, 107.4, 100.8) df-data.frame(time, watt) lmfit - lm(time ~ watt + cos(time) + sin(time), data = df) Your regression formula does not make sense to me. You seem to expect a periodic function within 24 hours, and if not it would still be possible to subtract the trend and then look at a periodic solution. Applying a trigonometric regression results in the following approximations: library(pracma) plot(2*pi*time/24, watt, col=red) ts - seq(0, 2*pi, len = 100) xs6 - trigApprox(ts, watt, 6) xs8 - trigApprox(ts, watt, 8) lines(ts, xs6, col=blue, lwd=2) lines(ts, xs8, col=green, lwd=2) grid() where as examples the trigonometric fits of degree 6 and 8 are used. I would not advise to use higher orders, even if the fit is not perfect. Thank you ! That is a real gem of a worked example. Not only did it introduce me to a useful package I was not familiar with, but there was even a worked example in one of the help pages that might have specifically answered the question about getting a 2nd(?) order trig regression. If I understood the commentary on that page, this method might also be appropriate for an irregular time series, whereas trigApprox and trigPoly would not? That's true. For the moment, the trigPoly() function works correctly only with equidistant data between 0 and 2*pi. This is adapted from the trigPoly help page in Hans Werner's pracma package: The error I made myself was to take the 'time' variable literally, though obviously the numbers after the decimal point were meant as minutes. Thus time - seq(0, 23.75, len = 96) would be a better choice. The rest in your adaptation is absolutely correct. A - cbind(1, cos(pi*time/24), sin(pi*time/24), cos(2*pi*time/24), sin(2*pi*time/24)) (ab - qr.solve(A, watt)) # [1] 127.29131 -26.88824 -10.06134 -36.22793 -38.56219 ts - seq(0, pi, length.out = 100) xs - ab[1] + ab[2]*cos(ts) + ab[3]*sin(ts) + ab[4]*cos(2*ts) + ab[5]*sin(2*ts) plot(pi*time/24, watt, col = red, xlim=c(0, pi), ylim=range(watt), main = Trigonometric Regression) lines(ts, xs, col=blue) Hans: I corrected the spelling of Trigonometric, but other than that I may well have introduced other errors for which I would be happy to be corrected. For instance, I'm unsure of the terminology regarding the ordinality of this model. I'm also not sure if my pi/24 and 2*pi/24 factors were correct in normalizing the time scale, although the prediction seemed sensible. And yes, this curve is the best
Re: [R] fitting a sinus curve
Dear all, Thanks again for your help. I looked at a polynomial regression of 12th order and the sinus regression and to me the polynomial model seems to fit better. What do you think? Thanks a lot, Mariannne time - c(0.15,0.30,0.45,1.00,1.15,1.30,1.45,2.00,2.15,2.30,2.45,3.00,3.15,3.30,3.45,4.00,4.15,4.30,4.45,5.00,5.15,5.30,5.45,6.00,6.15,6.30,6.45,7.00,7.15,7.30,7.45,8.00,8.15,8.30,8.45,9.00,9.15,9.30,9.45,10.00,10.15,10.30,10.45,11.00,11.15,11.30,11.45,12.00,12.15,12.30,12.45,13.00,13.15,13.30,13.45,14.00,14.15,14.30,14.45,15.00,15.15,15.30,15.45,16.00,16.15,16.30,16.45,17.00,17.15,17.30,17.45,18.00,18.15,18.30,18.45,19.00,19.15,19.30,19.45,20.00,20.15,20.30,20.45,21.00,21.15,21.30,21.45,22.00,22.15,22.30,22.45,23.00,23.15,23.30,23.45,0.00) watt - c(70.8,68.2,65.9,63.3,59.5,55,50.5,46.6,43.9,42.3,41.4,40.8,40.3,39.9,39.5,39.1,38.8,38.5,38.3,38.3,38.5,39.1,40.3,42.4,45.6,49.9,55.3,61.6,68.9,77.1,86.1,95.7,105.8,115.8,124.9,132.3,137.6,141.1,143.3,144.8,146,147.2,148.4,149.8,151.5,153.5,156,159,162.4,165.8,168.4,169.8,169.4,167.6,164.8,161.5,158.1,154.9,151.8,149,146.5,144.4,142.7,141.5,140.9,141.7,144.9,151.5,161.9,174.6,187.4,198.1,205.2,209.1,211.1,212.2,213.2,213,210.4,203.9,192.9,179,164.4,151.5,141.9,135.3,131,128.2,126.1,124.1,121.6,118.2,113.4,107.4,100.8,94.1) df - data.frame(conc, vel) df polyfit12 - lm(watt~time+I(time^2)+I(time^3)+I(time^4)+I(time^5)+I(time^6)+I(time^7)+I(time^8)+I(time^9)+I(time^10)+I(time^11)+(time^12)), data=df) coef12-coef(polyfit12) coef12 plot(df$conc, df$vel, xlim=c(0,23.45), ylim=c(0,220)) x - seq(0, 23.34,length=100) y3 - cbind(1,x,x^2,x^3,x^4,x^5,x^6,x^7,x^8,x^9,x^10,x^11,x^12) %*% coef12 lines(x,y3) I am slowly transferring things to a new computer when I found this Snews post from Bill Venables - it comes in handy from time to time. http://www.biostat.wustl.edu/archives/html/s-news/1999-09/msg00059.html sin.cos.fcn - function(k, time) { X - matrix(0, length(time), 2*k) for(i in 1:k) { X[,i] - sin(i*2*pi*time/96) X[,k+i] - cos(i*2*pi*time/96) } X } mod8 - lm(watt ~ sin.cos.fcn(8, time/24)) mod8.pred - predict(mod8) mod24 - lm(watt ~ sin.cos.fcn(24, time/24)) mod24.pred - predict(mod24) so replotting plot(pi*time/24, watt, col = red, xlim=c(0, pi), ylim=range(watt), main = Trigonometric Regression) lines(pi*time/24, mod8.pred, col=cyan) lines(pi*time/24, mod24.pred, col=blue) It depends on how well you want to go close to the data and will need some refinement. Now for some sustenance! Regards Duncan Duncan Mackay Department of Agronomy and Soil Science University of New England ARMIDALE NSW 2351 Email: home mac...@northnet.com.au At 20:21 29/07/2011, you wrote: David Winsemius dwinsemius at comcast.net writes: On Jul 28, 2011, at 1:07 PM, Hans W Borchers wrote: maaariiianne marianne.zeyringer at ec.europa.eu writes: Dear R community! I am new to R and would be very grateful for any kind of help. I am a PhD student and need to fit a model to an electricity load profile of a household (curve with two peaks). I was thinking of looking if a polynomial of 4th order, a sinus/cosinus combination or a combination of 3 parabels fits the data best. I have problems with the sinus/cosinus regression: time - c( 0.00, 0.15, 0.30, 0.45, 1.00, 1.15, 1.30, 1.45, 2.00, 2.15, 2.30, 2.45, 3.00, 3.15, 3.30, 3.45, 4.00, 4.15, 4.30, 4.45, 5.00, 5.15, 5.30, 5.45, 6.00, 6.15, 6.30, 6.45, 7.00, 7.15, 7.30, 7.45, 8.00, 8.15, 8.30, 8.45, 9.00, 9.15, 9.30, 9.45, 10.00, 10.15, 10.30, 10.45, 11.00, 11.15, 11.30, 11.45, 12.00, 12.15, 12.30, 12.45, 13.00, 13.15, 13.30, 13.45, 14.00, 14.15, 14.30, 14.45, 15.00, 15.15, 15.30, 15.45, 16.00, 16.15, 16.30, 16.45, 17.00, 17.15, 17.30, 17.45, 18.00, 18.15, 18.30, 18.45, 19.00, 19.15, 19.30, 19.45, 20.00, 20.15, 20.30, 20.45, 21.00, 21.15, 21.30, 21.45, 22.00, 22.15, 22.30, 22.45, 23.00, 23.15, 23.30, 23.45) watt - c( 94.1, 70.8, 68.2, 65.9, 63.3, 59.5, 55, 50.5, 46.6, 43.9, 42.3, 41.4, 40.8, 40.3, 39.9, 39.5, 39.1, 38.8, 38.5, 38.3, 38.3, 38.5, 39.1, 40.3, 42.4, 45.6, 49.9, 55.3, 61.6, 68.9, 77.1, 86.1, 95.7, 105.8, 115.8, 124.9, 132.3, 137.6, 141.1, 143.3, 144.8, 146, 147.2, 148.4, 149.8, 151.5, 153.5, 156, 159, 162.4, 165.8, 168.4, 169.8, 169.4, 167.6, 164.8, 161.5, 158.1, 154.9, 151.8, 149, 146.5, 144.4, 142.7, 141.5, 140.9, 141.7, 144.9, 151.5, 161.9, 174.6, 187.4, 198.1, 205.2, 209.1, 211.1, 212.2, 213.2, 213, 210.4, 203.9, 192.9, 179, 164.4, 151.5, 141.9, 135.3, 131, 128.2, 126.1, 124.1, 121.6, 118.2, 113.4, 107.4, 100.8) df-data.frame(time, watt) lmfit - lm(time ~ watt + cos(time) + sin(time), data = df) Your regression formula does not make sense to me. You seem to expect a periodic function within 24 hours, and if not it would still be possible to subtract the trend and then look at a periodic solution. Applying a trigonometric regression results in the
Re: [R] fitting a sinus curve
I am slowly transferring things to a new computer when I found this Snews post from Bill Venables - it comes in handy from time to time. http://www.biostat.wustl.edu/archives/html/s-news/1999-09/msg00059.html sin.cos.fcn - function(k, time) { X - matrix(0, length(time), 2*k) for(i in 1:k) { X[,i] - sin(i*2*pi*time/96) X[,k+i] - cos(i*2*pi*time/96) } X } mod8 - lm(watt ~ sin.cos.fcn(8, time/24)) mod8.pred - predict(mod8) mod24 - lm(watt ~ sin.cos.fcn(24, time/24)) mod24.pred - predict(mod24) so replotting plot(pi*time/24, watt, col = red, xlim=c(0, pi), ylim=range(watt), main = Trigonometric Regression) lines(pi*time/24, mod8.pred, col=cyan) lines(pi*time/24, mod24.pred, col=blue) It depends on how well you want to go close to the data and will need some refinement. Now for some sustenance! Regards Duncan Duncan Mackay Department of Agronomy and Soil Science University of New England ARMIDALE NSW 2351 Email: home mac...@northnet.com.au At 20:21 29/07/2011, you wrote: David Winsemius dwinsemius at comcast.net writes: On Jul 28, 2011, at 1:07 PM, Hans W Borchers wrote: maaariiianne marianne.zeyringer at ec.europa.eu writes: Dear R community! I am new to R and would be very grateful for any kind of help. I am a PhD student and need to fit a model to an electricity load profile of a household (curve with two peaks). I was thinking of looking if a polynomial of 4th order, a sinus/cosinus combination or a combination of 3 parabels fits the data best. I have problems with the sinus/cosinus regression: time - c( 0.00, 0.15, 0.30, 0.45, 1.00, 1.15, 1.30, 1.45, 2.00, 2.15, 2.30, 2.45, 3.00, 3.15, 3.30, 3.45, 4.00, 4.15, 4.30, 4.45, 5.00, 5.15, 5.30, 5.45, 6.00, 6.15, 6.30, 6.45, 7.00, 7.15, 7.30, 7.45, 8.00, 8.15, 8.30, 8.45, 9.00, 9.15, 9.30, 9.45, 10.00, 10.15, 10.30, 10.45, 11.00, 11.15, 11.30, 11.45, 12.00, 12.15, 12.30, 12.45, 13.00, 13.15, 13.30, 13.45, 14.00, 14.15, 14.30, 14.45, 15.00, 15.15, 15.30, 15.45, 16.00, 16.15, 16.30, 16.45, 17.00, 17.15, 17.30, 17.45, 18.00, 18.15, 18.30, 18.45, 19.00, 19.15, 19.30, 19.45, 20.00, 20.15, 20.30, 20.45, 21.00, 21.15, 21.30, 21.45, 22.00, 22.15, 22.30, 22.45, 23.00, 23.15, 23.30, 23.45) watt - c( 94.1, 70.8, 68.2, 65.9, 63.3, 59.5, 55, 50.5, 46.6, 43.9, 42.3, 41.4, 40.8, 40.3, 39.9, 39.5, 39.1, 38.8, 38.5, 38.3, 38.3, 38.5, 39.1, 40.3, 42.4, 45.6, 49.9, 55.3, 61.6, 68.9, 77.1, 86.1, 95.7, 105.8, 115.8, 124.9, 132.3, 137.6, 141.1, 143.3, 144.8, 146, 147.2, 148.4, 149.8, 151.5, 153.5, 156, 159, 162.4, 165.8, 168.4, 169.8, 169.4, 167.6, 164.8, 161.5, 158.1, 154.9, 151.8, 149, 146.5, 144.4, 142.7, 141.5, 140.9, 141.7, 144.9, 151.5, 161.9, 174.6, 187.4, 198.1, 205.2, 209.1, 211.1, 212.2, 213.2, 213, 210.4, 203.9, 192.9, 179, 164.4, 151.5, 141.9, 135.3, 131, 128.2, 126.1, 124.1, 121.6, 118.2, 113.4, 107.4, 100.8) df-data.frame(time, watt) lmfit - lm(time ~ watt + cos(time) + sin(time), data = df) Your regression formula does not make sense to me. You seem to expect a periodic function within 24 hours, and if not it would still be possible to subtract the trend and then look at a periodic solution. Applying a trigonometric regression results in the following approximations: library(pracma) plot(2*pi*time/24, watt, col=red) ts - seq(0, 2*pi, len = 100) xs6 - trigApprox(ts, watt, 6) xs8 - trigApprox(ts, watt, 8) lines(ts, xs6, col=blue, lwd=2) lines(ts, xs8, col=green, lwd=2) grid() where as examples the trigonometric fits of degree 6 and 8 are used. I would not advise to use higher orders, even if the fit is not perfect. Thank you ! That is a real gem of a worked example. Not only did it introduce me to a useful package I was not familiar with, but there was even a worked example in one of the help pages that might have specifically answered the question about getting a 2nd(?) order trig regression. If I understood the commentary on that page, this method might also be appropriate for an irregular time series, whereas trigApprox and trigPoly would not? That's true. For the moment, the trigPoly() function works correctly only with equidistant data between 0 and 2*pi. This is adapted from the trigPoly help page in Hans Werner's pracma package: The error I made myself was to take the 'time' variable literally, though obviously the numbers after the decimal point were meant as minutes. Thus time - seq(0, 23.75, len = 96) would be a better choice. The rest in your adaptation is absolutely correct. A - cbind(1, cos(pi*time/24), sin(pi*time/24), cos(2*pi*time/24), sin(2*pi*time/24)) (ab - qr.solve(A, watt)) # [1] 127.29131 -26.88824 -10.06134 -36.22793 -38.56219 ts - seq(0, pi, length.out = 100) xs - ab[1] + ab[2]*cos(ts) + ab[3]*sin(ts) + ab[4]*cos(2*ts) +
Re: [R] fitting a sinus curve
David Winsemius dwinsemius at comcast.net writes: On Jul 28, 2011, at 1:07 PM, Hans W Borchers wrote: maaariiianne marianne.zeyringer at ec.europa.eu writes: Dear R community! I am new to R and would be very grateful for any kind of help. I am a PhD student and need to fit a model to an electricity load profile of a household (curve with two peaks). I was thinking of looking if a polynomial of 4th order, a sinus/cosinus combination or a combination of 3 parabels fits the data best. I have problems with the sinus/cosinus regression: time - c( 0.00, 0.15, 0.30, 0.45, 1.00, 1.15, 1.30, 1.45, 2.00, 2.15, 2.30, 2.45, 3.00, 3.15, 3.30, 3.45, 4.00, 4.15, 4.30, 4.45, 5.00, 5.15, 5.30, 5.45, 6.00, 6.15, 6.30, 6.45, 7.00, 7.15, 7.30, 7.45, 8.00, 8.15, 8.30, 8.45, 9.00, 9.15, 9.30, 9.45, 10.00, 10.15, 10.30, 10.45, 11.00, 11.15, 11.30, 11.45, 12.00, 12.15, 12.30, 12.45, 13.00, 13.15, 13.30, 13.45, 14.00, 14.15, 14.30, 14.45, 15.00, 15.15, 15.30, 15.45, 16.00, 16.15, 16.30, 16.45, 17.00, 17.15, 17.30, 17.45, 18.00, 18.15, 18.30, 18.45, 19.00, 19.15, 19.30, 19.45, 20.00, 20.15, 20.30, 20.45, 21.00, 21.15, 21.30, 21.45, 22.00, 22.15, 22.30, 22.45, 23.00, 23.15, 23.30, 23.45) watt - c( 94.1, 70.8, 68.2, 65.9, 63.3, 59.5, 55, 50.5, 46.6, 43.9, 42.3, 41.4, 40.8, 40.3, 39.9, 39.5, 39.1, 38.8, 38.5, 38.3, 38.3, 38.5, 39.1, 40.3, 42.4, 45.6, 49.9, 55.3, 61.6, 68.9, 77.1, 86.1, 95.7, 105.8, 115.8, 124.9, 132.3, 137.6, 141.1, 143.3, 144.8, 146, 147.2, 148.4, 149.8, 151.5, 153.5, 156, 159, 162.4, 165.8, 168.4, 169.8, 169.4, 167.6, 164.8, 161.5, 158.1, 154.9, 151.8, 149, 146.5, 144.4, 142.7, 141.5, 140.9, 141.7, 144.9, 151.5, 161.9, 174.6, 187.4, 198.1, 205.2, 209.1, 211.1, 212.2, 213.2, 213, 210.4, 203.9, 192.9, 179, 164.4, 151.5, 141.9, 135.3, 131, 128.2, 126.1, 124.1, 121.6, 118.2, 113.4, 107.4, 100.8) df-data.frame(time, watt) lmfit - lm(time ~ watt + cos(time) + sin(time), data = df) Your regression formula does not make sense to me. You seem to expect a periodic function within 24 hours, and if not it would still be possible to subtract the trend and then look at a periodic solution. Applying a trigonometric regression results in the following approximations: library(pracma) plot(2*pi*time/24, watt, col=red) ts - seq(0, 2*pi, len = 100) xs6 - trigApprox(ts, watt, 6) xs8 - trigApprox(ts, watt, 8) lines(ts, xs6, col=blue, lwd=2) lines(ts, xs8, col=green, lwd=2) grid() where as examples the trigonometric fits of degree 6 and 8 are used. I would not advise to use higher orders, even if the fit is not perfect. Thank you ! That is a real gem of a worked example. Not only did it introduce me to a useful package I was not familiar with, but there was even a worked example in one of the help pages that might have specifically answered the question about getting a 2nd(?) order trig regression. If I understood the commentary on that page, this method might also be appropriate for an irregular time series, whereas trigApprox and trigPoly would not? That's true. For the moment, the trigPoly() function works correctly only with equidistant data between 0 and 2*pi. This is adapted from the trigPoly help page in Hans Werner's pracma package: The error I made myself was to take the 'time' variable literally, though obviously the numbers after the decimal point were meant as minutes. Thus time - seq(0, 23.75, len = 96) would be a better choice. The rest in your adaptation is absolutely correct. A - cbind(1, cos(pi*time/24), sin(pi*time/24), cos(2*pi*time/24), sin(2*pi*time/24)) (ab - qr.solve(A, watt)) # [1] 127.29131 -26.88824 -10.06134 -36.22793 -38.56219 ts - seq(0, pi, length.out = 100) xs - ab[1] + ab[2]*cos(ts) + ab[3]*sin(ts) + ab[4]*cos(2*ts) + ab[5]*sin(2*ts) plot(pi*time/24, watt, col = red, xlim=c(0, pi), ylim=range(watt), main = Trigonometric Regression) lines(ts, xs, col=blue) Hans: I corrected the spelling of Trigonometric, but other than that I may well have introduced other errors for which I would be happy to be corrected. For instance, I'm unsure of the terminology regarding the ordinality of this model. I'm also not sure if my pi/24 and 2*pi/24 factors were correct in normalizing the time scale, although the prediction seemed sensible. And yes, this curve is the best trigonometric approximation you can get for this order(?). You will see the same result when you apply and plot xs1 - trigApprox(ts, watt, 1) But I see your problem with the term 'order' I will have a closer look at this and clarify the terminology on the help page. [All this reminds me of an article in the Mathematical Intelligencer some years ago where it was convincingly argued that the universal constant \pi should have the value 2*pi (in today's notation).] Thanks, Hans Werner Hans Werner
Re: [R] fitting a sinus curve
maaariiianne marianne.zeyringer at ec.europa.eu writes: Dear R community! I am new to R and would be very grateful for any kind of help. I am a PhD student and need to fit a model to an electricity load profile of a household (curve with two peaks). I was thinking of looking if a polynomial of 4th order, a sinus/cosinus combination or a combination of 3 parabels fits the data best. I have problems with the sinus/cosinus regression: time - c( 0.00, 0.15, 0.30, 0.45, 1.00, 1.15, 1.30, 1.45, 2.00, 2.15, 2.30, 2.45, 3.00, 3.15, 3.30, 3.45, 4.00, 4.15, 4.30, 4.45, 5.00, 5.15, 5.30, 5.45, 6.00, 6.15, 6.30, 6.45, 7.00, 7.15, 7.30, 7.45, 8.00, 8.15, 8.30, 8.45, 9.00, 9.15, 9.30, 9.45, 10.00, 10.15, 10.30, 10.45, 11.00, 11.15, 11.30, 11.45, 12.00, 12.15, 12.30, 12.45, 13.00, 13.15, 13.30, 13.45, 14.00, 14.15, 14.30, 14.45, 15.00, 15.15, 15.30, 15.45, 16.00, 16.15, 16.30, 16.45, 17.00, 17.15, 17.30, 17.45, 18.00, 18.15, 18.30, 18.45, 19.00, 19.15, 19.30, 19.45, 20.00, 20.15, 20.30, 20.45, 21.00, 21.15, 21.30, 21.45, 22.00, 22.15, 22.30, 22.45, 23.00, 23.15, 23.30, 23.45) watt - c( 94.1, 70.8, 68.2, 65.9, 63.3, 59.5, 55, 50.5, 46.6, 43.9, 42.3, 41.4, 40.8, 40.3, 39.9, 39.5, 39.1, 38.8, 38.5, 38.3, 38.3, 38.5, 39.1, 40.3, 42.4, 45.6, 49.9, 55.3, 61.6, 68.9, 77.1, 86.1, 95.7, 105.8, 115.8, 124.9, 132.3, 137.6, 141.1, 143.3, 144.8, 146, 147.2, 148.4, 149.8, 151.5, 153.5, 156, 159, 162.4, 165.8, 168.4, 169.8, 169.4, 167.6, 164.8, 161.5, 158.1, 154.9, 151.8, 149, 146.5, 144.4, 142.7, 141.5, 140.9, 141.7, 144.9, 151.5, 161.9, 174.6, 187.4, 198.1, 205.2, 209.1, 211.1, 212.2, 213.2, 213, 210.4, 203.9, 192.9, 179, 164.4, 151.5, 141.9, 135.3, 131, 128.2, 126.1, 124.1, 121.6, 118.2, 113.4, 107.4, 100.8) df-data.frame(time, watt) lmfit - lm(time ~ watt + cos(time) + sin(time), data = df) Your regression formula does not make sense to me. You seem to expect a periodic function within 24 hours, and if not it would still be possible to subtract the trend and then look at a periodic solution. Applying a trigonometric regression results in the following approximations: library(pracma) plot(2*pi*time/24, watt, col=red) ts - seq(0, 2*pi, len = 100) xs6 - trigApprox(ts, watt, 6) xs8 - trigApprox(ts, watt, 8) lines(ts, xs6, col=blue, lwd=2) lines(ts, xs8, col=green, lwd=2) grid() where as examples the trigonometric fits of degree 6 and 8 are used. I would not advise to use higher orders, even if the fit is not perfect. Hans Werner Thanks a lot, Marianne __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fitting a sinus curve
On Jul 28, 2011, at 1:07 PM, Hans W Borchers wrote: maaariiianne marianne.zeyringer at ec.europa.eu writes: Dear R community! I am new to R and would be very grateful for any kind of help. I am a PhD student and need to fit a model to an electricity load profile of a household (curve with two peaks). I was thinking of looking if a polynomial of 4th order, a sinus/cosinus combination or a combination of 3 parabels fits the data best. I have problems with the sinus/cosinus regression: time - c( 0.00, 0.15, 0.30, 0.45, 1.00, 1.15, 1.30, 1.45, 2.00, 2.15, 2.30, 2.45, 3.00, 3.15, 3.30, 3.45, 4.00, 4.15, 4.30, 4.45, 5.00, 5.15, 5.30, 5.45, 6.00, 6.15, 6.30, 6.45, 7.00, 7.15, 7.30, 7.45, 8.00, 8.15, 8.30, 8.45, 9.00, 9.15, 9.30, 9.45, 10.00, 10.15, 10.30, 10.45, 11.00, 11.15, 11.30, 11.45, 12.00, 12.15, 12.30, 12.45, 13.00, 13.15, 13.30, 13.45, 14.00, 14.15, 14.30, 14.45, 15.00, 15.15, 15.30, 15.45, 16.00, 16.15, 16.30, 16.45, 17.00, 17.15, 17.30, 17.45, 18.00, 18.15, 18.30, 18.45, 19.00, 19.15, 19.30, 19.45, 20.00, 20.15, 20.30, 20.45, 21.00, 21.15, 21.30, 21.45, 22.00, 22.15, 22.30, 22.45, 23.00, 23.15, 23.30, 23.45) watt - c( 94.1, 70.8, 68.2, 65.9, 63.3, 59.5, 55, 50.5, 46.6, 43.9, 42.3, 41.4, 40.8, 40.3, 39.9, 39.5, 39.1, 38.8, 38.5, 38.3, 38.3, 38.5, 39.1, 40.3, 42.4, 45.6, 49.9, 55.3, 61.6, 68.9, 77.1, 86.1, 95.7, 105.8, 115.8, 124.9, 132.3, 137.6, 141.1, 143.3, 144.8, 146, 147.2, 148.4, 149.8, 151.5, 153.5, 156, 159, 162.4, 165.8, 168.4, 169.8, 169.4, 167.6, 164.8, 161.5, 158.1, 154.9, 151.8, 149, 146.5, 144.4, 142.7, 141.5, 140.9, 141.7, 144.9, 151.5, 161.9, 174.6, 187.4, 198.1, 205.2, 209.1, 211.1, 212.2, 213.2, 213, 210.4, 203.9, 192.9, 179, 164.4, 151.5, 141.9, 135.3, 131, 128.2, 126.1, 124.1, 121.6, 118.2, 113.4, 107.4, 100.8) df-data.frame(time, watt) lmfit - lm(time ~ watt + cos(time) + sin(time), data = df) Your regression formula does not make sense to me. You seem to expect a periodic function within 24 hours, and if not it would still be possible to subtract the trend and then look at a periodic solution. Applying a trigonometric regression results in the following approximations: library(pracma) plot(2*pi*time/24, watt, col=red) ts - seq(0, 2*pi, len = 100) xs6 - trigApprox(ts, watt, 6) xs8 - trigApprox(ts, watt, 8) lines(ts, xs6, col=blue, lwd=2) lines(ts, xs8, col=green, lwd=2) grid() where as examples the trigonometric fits of degree 6 and 8 are used. I would not advise to use higher orders, even if the fit is not perfect. Thank you ! That is a real gem of a worked example. Not only did it introduce me to a useful package I was not familiar with, but there was even a worked example in one of the help pages that might have specifically answered the question about getting a 2nd(?) order trig regression. If I understood the commentary on that page, this method might also be appropriate for an irregular time series, whereas trigApprox and trigPoly would not? This is adapted from the trigPoly help page in Hans Werner's pracma package: A - cbind(1, cos(pi*time/24), sin(pi*time/24), cos(2*pi*time/24), sin(2*pi*time/24)) ab - qr.solve(A, watt) ab # [1] 127.29131 -26.88824 -10.06134 -36.22793 -38.56219 ts - seq(0, pi, length.out = 100) xs - ab[1] + ab[2]*cos(ts) + ab[3]*sin(ts) + ab[4]*cos(2*ts) +ab[5]*sin(2*ts) plot(pi*time/24, watt, col = red, xlim=c(0, pi), ylim=range(watt), main = Trigonometric Regression) lines(ts, xs, col=blue) Hans: I corrected the spelling of Trigonometric, but other than that I may well have introduced other errors for which I would be happy to be corrected. For instance, I'm unsure of the terminology regarding the ordinality of this model. I'm also not sure if my pi/24 and 2*pi/24 factors were correct in normalizing the time scale, although the prediction seemed sensible. Hans Werner Thanks a lot, Marianne -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.