Re: [R] time zones in POSIXt
I think this is a problem with difftime (which gets called when the subtraction in your example is invoked). The first few lines of difftime are: function (time1, time2, tz = , units = c(auto, secs, mins, hours, days, weeks)) { time1 - as.POSIXct(time1, tz = tz) time2 - as.POSIXct(time2, tz = tz) so any POSIXlt timezone gets clobbered with difftime's tz. Vadim Ogranovich vograno at evafunds.com writes: : : Hi, : : I have two data sources. One records time in PST time zone, the other in : GMT. I want to compute the difference between the two, but don't see : how. Here is an example where I compute time difference between : identical times each (meant to be) relative to its time zone. : : as.POSIXlt(2000-05-10 10:15:00, PST) - as.POSIXlt(2000-05-10 : 10:15:00, GMT) : Time difference of 0 secs : : I was expecting to see 8hrs (which is the time difference between London : and San-Francisco). Why is it so and what is the correct way of doing : it? : : I use R-1.8.1 on RH-7.3. : : Thanks, : Vadim : : : [[alternative HTML version deleted]] : : __ : R-help at stat.math.ethz.ch mailing list : https://www.stat.math.ethz.ch/mailman/listinfo/r-help : PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html : : __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] time zones in POSIXt
On Fri, 23 Apr 2004, Vadim Ogranovich wrote: Thank you for the lead, Dirk! Indeed this works on my machine too: as.POSIXct(2000-05-10 10:15:00, tz=PST8PDT) - as.POSIXct(2000-05-10 10:15:00, tz=GMT) Time difference of 7 hours However when I replace POSIXct by POSIXlt it breaks (this looks like a bug to me): as.POSIXlt(2000-05-10 10:15:00, tz=PST8PDT) - as.POSIXlt(2000-05-10 10:15:00, tz=GMT) Time difference of 0 secs No, it's not a bug. POSIXlt times are just numeric representations of broken-down times, and the tzone attribute (which is normally not present) is just a reminder to you. (It has not been checked, so this is avoid nasty surprises if it is wrong.) Maybe it was a bad idea to allow - for POSIXlt times, but then we do expect users to read the documentation (including the code if they are puzzled). Now a couple of new questions: * how could I learn about appropriate names for time zones? For example I was using PST whereas it seems I had to use either PST8 or PST8PDT. Why PST was not good? Is it documented anywhere? Specific to your OS, so I expect it documents it somewhere. * there seems to be no difference betweeen GMT and BST on my machine though PST8 and PST8PDT are treated properly: # PST8 is not identical to PST8PDT ISOdatetime(2003, seq(12), 1, 10, 0, 0, tz=PST8) - ISOdatetime(2003, seq(12), 1, 10, 0, 0, tz=PST8PDT) Time differences of0,0,0,0, 3600, 3600, 3600, 3600, 3600, 3600,0,0 secs # GMT0 is identical to BST ISOdatetime(2003, seq(12), 1, 10, 0, 0, tz=GMT0) - ISOdatetime(2003, seq(12), 1, 10, 0, 0, tz=BST) Time differences of 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 secs Why is such dichotomy? Ask your OS designer - timezones are handled by your OS. I would not consider BST to be a timezone as it is only defined for half the year. Windiows thinks I am on `GMT Daylight Time' although that usage is otherwise unknown in this country. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Fatal Error: INVALID HOMEDRIVE
On Fri, 23 Apr 2004, Brett Melbourne wrote: I also set HOMEPATH in my System control panel as you suggested, then rebooted ... the same error occurs. Windows XP appears to be over-riding the setting because no HOMEPATH appears in the environment even when it's set in the control panel thus. That was my original hypothesis. But R starts fine with C:\Program Files\R\rw1090\bin\Rgui.exe HOMEPATH=\ in the Target field of the icon. Better to set HOME there C:\Program Files\R\rw1090\bin\Rgui.exe HOME=C:\ in case HOMEDRIVE disappears too. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Moving window regressions - how can I improve this code?
1. ?nrow 2. ?all.vars Getting rid of loop was discussed on Mar 22 and there was some debate on whether or not it was a good idea although it turned out there was a bug in the code that only the loop free version brought out. See archives. Ajay Shah ajayshah at mayin.org writes: : : I wrote a function which does moving window regressions. E.g. if : there are 100 observations and the window width is 50, then I first : run the regression for observations 1..50, then for 2..51, and so on. : : I am extremely pleased with R in my experience with writing this, : since I was able to pass the model as an argument into the function : Forgive me if I sound naive, but that's rocket science to me!! : : For a regression with K explanatory variables, I make a matrix with : 2*K+2 columns, where I capture: : K coefficients and K standard errors : the residual sigma : R^2 : : My code is: : :# :movingWindowRegression - function(data, T, width, model, K) { : results = matrix(nrow=T, ncol=2*K+2) : for (i in width:T) { :details - summary.lm(lm(as.formula(model), data[(i-width+1):i,])) :n=1; :for (j in 1:K) { : results[i, n] = details$coefficients[j, 1] : results[i, n+1] = details$coefficients[j, 2] : n = n + 2 :} :results[i, n] = details$sigma :results[i, n+1] = details$r.squared : } : return(results) :} : :# Simulate some data for a linear regression :T = 20 :x = runif(T); y = 2 + 3*x + rnorm(T); :D = data.frame(x, y) : :r = movingWindowRegression(D, T=T, width=10, model=y ~ x, K=2) :print(r) :# : : I would be very happy if you could look at this and tell me how to do : things better. : : I have two specific questions: : : 1. I find it silly that I have to manually pass K and T into the : function. It would be so much nicer to have: : : r = movingWindowRegression(D, width=10, model=y ~ x) : instead of the existing : r = movingWindowRegression(D, T=T, width=10, model=y ~ x, K=2) : : How can the function inspect the data frame D and learn the : number of rows? : : How can the function inspect the model specification string and : learn K, the number of explanatory variables? : : 2. The R way consists of avoiding loops when the code is : vectorisable. I am using a loop to copy out from : details$coefficients into the columns of results[i,]. Is there a : better way? : __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Changing Gui preferences
On Sat, 24 Apr 2004, Murray Jorgensen wrote: I want to modify my Gui preferences to get rid of the MDI toolbar and status bar. [ Under Windows XP and R 1.8.1 and 1.9.0.] When I uncheck the boxes and click apply I am told to save the preferences and the changes will take effect when restarting R. I click save and a box comes up to ask me to save Rconsole (I'm not sure where this should be saved but I navigate my way to where the old Rconsole was and save on top of it. ?Rconsole tells you where the copy in use is saved. Strangely I am not asked if I want to replace the old Rconsole. That's programmable, but from this dialog box one would normally want to overwrite, no? Anyway when I re-start there are no changes and I am still cluttered up by the MDI bars that I want to get rid of! I think this might be another manifestation of that Microsoft bug, since the normal place to look is in the user's home directory. You need to make sure the copy you saved is being used. (I have just tested this, and it does still work.) -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] time zones in POSIXt
Prof Brian Ripley ripley at stats.ox.ac.uk writes: : However when I replace POSIXct by POSIXlt it breaks (this looks like a : bug to me): : : as.POSIXlt(2000-05-10 10:15:00, tz=PST8PDT) - : as.POSIXlt(2000-05-10 10:15:00, tz=GMT) : Time difference of 0 secs : : No, it's not a bug. POSIXlt times are just numeric representations of : broken-down times, and the tzone attribute (which is normally not present) : is just a reminder to you. (It has not been checked, so this is avoid : nasty surprises if it is wrong.) Maybe it was a bad idea to allow - for : POSIXlt times, but then we do expect users to read the documentation : (including the code if they are puzzled). This this is by design then its a bug by virtue of being a design error. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] anova(my.lme.model, type=marginal) in Pinheiro and Bates
Dear All, I was recently discussing the use of anova(my.lme.model, type=marginal) with a colleague. I would like to find where the subject is discussed in Pinheiro and Bates 2000. I like the book because it is clear and accessible even for people with limited formal training in Math and Stats like me, but I failed to find the subject at hand in the index. I realize that reading the book cover to cover would do me a lot of good, but I would find it a bit impractical at the moment... can anyone please point out where in the book the use of marginal anova is discussed (if at all)? Regards, Federico Calboli -- = Federico C. F. Calboli Dipartimento di Biologia Via Selmi 3 40126 Bologna Italy tel (+39) 051 209 4187 fax (+39) 051 209 4286 f.calboli at ucl.ac.uk __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Data import, going from 8 to 550 columns
Hello all! I need to import a NIR dataset into R. It should be quite trivial, but I can´t make it work. (No problems with the text in the beginning, as # is recognised by read.table as the comment sign.) The thing I can´t get around is the CR that ends every line after column eight as the line in R should be 550 columns wide (including the JF-number). Every new line in R should begin with the JF2455 and so on. Naturally it is possible to re-shape the tables in Excel before import, but it is quite tedious and doesn´t feel right...! How do I make read.table to just go on reading on the next line when it comes to CR, and how do I make it use the double CR followed by a blank to begin the next line? The data-file(s) looks like this: #ID=Samples from soil scanning #SAMPLE_NUMBERS_PRESENT=Y #NX_VARIABLES=550 #NY_VARIABLES=0 #FIRST_WAVELENGTH=1300.00 #LAST_WAVELENGTH=2398.00 #WAVELENGTH_INCREMENT=2.00 JF2455 0.4367495 0.4365539 0.4363573 0.4361560 0.4359702 0.4357788 0.4355963 0.4354126 0.4352311 0.4350726 0.4349101 0.4347557 0.4346097 0.4344587 0.4343193 0.4341759 0.4340320 0.4338984 0.4337671 0.4336369 0.4335097 0.4333864 the original table is 8 columns wide, ended with a CR sixty four lines removed here 0.5015950 0.5020472 0.5026294 0.5033303 0.5041344 0.5049909 0.5059010 0.5067372 0.5075415 0.5082389 0.5089509 0.5095288 0.5101137 0.5106306 0.5111954 0.5116805 JF2456 0.3604568 0.3600681 0.3596676 0.3592694 0.3588919 0.3585098 0.3581379 0.3577725 0.3573992 0.3570563 0.3566975 0.3563588 0.3560365 0.3556931 0.3553730 0.3550543 0.3547286 0.3544230 0.3541073 0.3537982 0.3535004 0.3531921 0.3529077 0.3526271 0.3523493 0.3520919 0.3518271 0.3515673 0.3513192 0.3510693 0.3508208 0.3505693 ... and so on Thanks! /CG CG Pettersson, MSci, PhD Stud. Swedish University of Agricultural Sciences Dep. of Ecology and Crop Production. Box 7043 SE-750 07 Uppsala __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Changing Gui preferences
Hi, a quick reply to help troubleshoot. The below (is minimal and) works on both R v1.8.1 and R v1.9.0 on WinXP Pro: From http://www.maths.lth.se/help/R/Rgui/: By default all plot windows (The Figure 2 etc windows) opened in Rgui are placed within the Rgui frame. Some people prefer this others don't. An option is to let the main window and all plot windows live separately. See screenshots to the right. This is done by selecting GUI preferences... in the Edit menu. Then select SDI (instead of MDI) next to Single or multiple windows. The click Save and accept the suggested pathname (Rconsole) by clicking Save in the opened file dialog. The click Cancel (yes, it is indeed ok). Restart R to get effectuate the settings. My R_USER and HOME as seen by R are: Sys.getenv(R_USER) R_USER C:\\Documents and Settings\\hb Sys.getenv(HOME) HOME Maybe it helps you on the way... Henrik Bengtsson -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Murray Jorgensen Sent: den 24 april 2004 11:05 To: Prof Brian Ripley; R-help Subject: Re: [R] Changing Gui preferences Prof Brian Ripley wrote: On Sat, 24 Apr 2004, Murray Jorgensen wrote: I want to modify my Gui preferences to get rid of the MDI toolbar and status bar. [ Under Windows XP and R 1.8.1 and 1.9.0.] When I uncheck the boxes and click apply I am told to save the preferences and the changes will take effect when restarting R. I click save and a box comes up to ask me to save Rconsole (I'm not sure where this should be saved but I navigate my way to where the old Rconsole was and save on top of it. ?Rconsole tells you where the copy in use is saved. What it says is: There are system copies of these files in 'R_HOME\etc'. Users can have personal copies of the files: these are looked for in the location given by the environment variable 'R_USER'. The system files are read only if a corresponding personal file is not found. If the environment variable 'R_USER' is not set, the R system sets it to 'HOME' if that is set (stripping any trailing slash), otherwise to '{HOMEDRIVE}{HOMEPATH}' if 'HOMEDRIVE' is set otherwise to the working directory. Now under Control PanelSystemAdvanced I find a box listing environmental variables. I find no R_HOME or R_USER listed there so presumably they are not set. I'm not sure how to find 'HOME' but I presume that it is the path that appears in the 'Change directory' box when you open this through the gui. In which case for me 'HOME' is C:\apps\R\rw1090 Now I have 3 versions of Rconsole near here. They are in C:\apps\R\rw1090\etc C:\apps\R\rw1090 and C:\apps\R I still don't know how to change my Gui preferences! [...] Murray -- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: [EMAIL PROTECTED]Fax 7 838 4155 Phone +64 7 838 4773 wk+64 7 849 6486 homeMobile 021 1395 862 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailma n/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Data import, going from 8 to 550 columns
You cannot use read.table() to read multi-line records as here. Why? It is not a table? You can use scan() to do this. On Sat, 24 Apr 2004, CG Pettersson wrote: Hello all! I need to import a NIR dataset into R. It should be quite trivial, but I can´t make it work. (No problems with the text in the beginning, as # is recognised by read.table as the comment sign.) The thing I can´t get around is the CR that ends every line after column eight as the line in R should be 550 columns wide (including the JF-number). Every new line in R should begin with the JF2455 and so on. Naturally it is possible to re-shape the tables in Excel before import, but it is quite tedious and doesn´t feel right...! How do I make read.table to just go on reading on the next line when it comes to CR, and how do I make it use the double CR followed by a blank to begin the next line? The data-file(s) looks like this: #ID=Samples from soil scanning #SAMPLE_NUMBERS_PRESENT=Y #NX_VARIABLES=550 #NY_VARIABLES=0 #FIRST_WAVELENGTH=1300.00 #LAST_WAVELENGTH=2398.00 #WAVELENGTH_INCREMENT=2.00 JF2455 0.4367495 0.4365539 0.4363573 0.4361560 0.4359702 0.4357788 0.4355963 0.4354126 0.4352311 0.4350726 0.4349101 0.4347557 0.4346097 0.4344587 0.4343193 0.4341759 0.4340320 0.4338984 0.4337671 0.4336369 0.4335097 0.4333864 the original table is 8 columns wide, ended with a CR sixty four lines removed here 0.5015950 0.5020472 0.5026294 0.5033303 0.5041344 0.5049909 0.5059010 0.5067372 0.5075415 0.5082389 0.5089509 0.5095288 0.5101137 0.5106306 0.5111954 0.5116805 JF2456 0.3604568 0.3600681 0.3596676 0.3592694 0.3588919 0.3585098 0.3581379 0.3577725 0.3573992 0.3570563 0.3566975 0.3563588 0.3560365 0.3556931 0.3553730 0.3550543 0.3547286 0.3544230 0.3541073 0.3537982 0.3535004 0.3531921 0.3529077 0.3526271 0.3523493 0.3520919 0.3518271 0.3515673 0.3513192 0.3510693 0.3508208 0.3505693 ... and so on Thanks! /CG CG Pettersson, MSci, PhD Stud. Swedish University of Agricultural Sciences Dep. of Ecology and Crop Production. Box 7043 SE-750 07 Uppsala __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Changing Gui preferences
At 06:05 24/04/2004, you wrote: I'm not sure how to find 'HOME' but I presume that it is the path that appears in the 'Change directory' box when you open this through the gui. In which case for me 'HOME' is C:\apps\R\rw1090 Now I have 3 versions of Rconsole near here. They are in C:\apps\R\rw1090\etc C:\apps\R\rw1090 and C:\apps\R I still don't know how to change my Gui preferences! Murray, I don´t use win XP, I use win 98 SE. In my system I know de 'HOME' with a simple procedure: 1- Look R shortcut 2- Right Click in R shortcut 3- Choose properties 4- Look satrt in box 5- de directorie in box is the 'HOME' Bernardo Rangel Tura, MD, MSc National Institute of Cardiology Laranjeiras Rio de Janeiro Brazil __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Changing Gui preferences
This *is* in the rw-FAQ. This answer is correct on Win 98, but not on WinXP (when the latter is not broken by a hotfix patch). On Sat, 24 Apr 2004, Bernardo Rangel Tura wrote: At 06:05 24/04/2004, you wrote: I'm not sure how to find 'HOME' but I presume that it is the path that appears in the 'Change directory' box when you open this through the gui. In which case for me 'HOME' is C:\apps\R\rw1090 Now I have 3 versions of Rconsole near here. They are in C:\apps\R\rw1090\etc C:\apps\R\rw1090 and C:\apps\R I still don't know how to change my Gui preferences! Murray, I don´t use win XP, I use win 98 SE. In my system I know de 'HOME' with a simple procedure: 1- Look R shortcut 2- Right Click in R shortcut 3- Choose properties 4- Look satrt in box 5- de directorie in box is the 'HOME' Bernardo Rangel Tura, MD, MSc National Institute of Cardiology Laranjeiras Rio de Janeiro Brazil -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Moving window regressions - how can I improve this code?
Ajay Shah [EMAIL PROTECTED] writes: I wrote a function which does moving window regressions. E.g. if there are 100 observations and the window width is 50, then I first run the regression for observations 1..50, then for 2..51, and so on. I am extremely pleased with R in my experience with writing this, since I was able to pass the model as an argument into the function :-) Forgive me if I sound naive, but that's rocket science to me!! For a regression with K explanatory variables, I make a matrix with 2*K+2 columns, where I capture: K coefficients and K standard errors the residual sigma R^2 My code is: # movingWindowRegression - function(data, T, width, model, K) { results = matrix(nrow=T, ncol=2*K+2) for (i in width:T) { details - summary.lm(lm(as.formula(model), data[(i-width+1):i,])) n=1; for (j in 1:K) { results[i, n] = details$coefficients[j, 1] results[i, n+1] = details$coefficients[j, 2] n = n + 2 } results[i, n] = details$sigma results[i, n+1] = details$r.squared } return(results) } # Simulate some data for a linear regression T = 20 x = runif(T); y = 2 + 3*x + rnorm(T); D = data.frame(x, y) r = movingWindowRegression(D, T=T, width=10, model=y ~ x, K=2) print(r) # I would be very happy if you could look at this and tell me how to do things better. First, thanks for posting the code. It takes courage to send your code to the list like this for public commentary. However, questions like this provide instructive examples. Some comments: As Gabor pointed out, there is a generic function nrow that can be applied to data frames. As a matter of style, it is better to use details - summary(lm(model, data[row range,])) That is, call the generic function summary, not the specific name of the method summary.lm. It is redundant to use summary.lm(lm(...)) and this construction is not guaranteed to continue to be valid. With regard to the looping, I would suggest using a list of summary objects as the basic data structure within your function, then extracting the pieces that you want from that list using lapply or sapply. That is, I would start with movingWindow - function(formula, data, width, ...) { nr = nrow(data) width = as.integer(width)[1] if (width = 0 || width = nr) stop(paste(width must be in the range 1,...,, nr, sep=)) nreg = nr - width base = 0:(width - 1) sumrys - lapply(1:nreg, function(st) { summary(lm(formula, data[base + st,])) }) sumrys } I changed the argument name model to formula and changed the order of the arguments to the standard order used in R modeling functions. You may not want to use this form for very large data sets because the raw summaries could be very large. However this is a place to start. The reason for creating a list is so you can use sapply and lapply to extract results from the list. To get the coefficients and standard errors from a summary, use the coef generic and the select columns from the result. For example, data(women) sumrys = movingWindow(weight ~ height, women, 9) unlist(lapply(sumrys, function(sm) sm$sigma)) # sigma values [1] 0.4714045 0.3248931 0.4481000 0.5613193 0.6042180 0.7627228 sapply(sumrys, [[, sigma) # same thing [1] 0.4714045 0.3248931 0.4481000 0.5613193 0.6042180 0.7627228 sapply(sumrys, function(sm) coef(sm)[,1:2]) [,1] [,2] [,3] [,4][,5] [1,] -59.7778 -67.1278 -73.4222 -81.9722 -91.778 [2,] 3. 3.1167 3.2167 3.3500 3.500 [3,] 3.77647036 2.64466036 3.70537766 4.71400546 5.1522157 [4,] 0.06085806 0.04194352 0.05784947 0.07246601 0.0780042 [,6] [1,] -106.1278 [2,]3.7167 [3,]6.60219186 [4,]0.09846709 The last part shows how to get the coefficients estimates and their standard errors as columns of a matrix. I think I would return the coefficients and standard errors separately, as in movingWindow - function(formula, data, width, ...) { nr = nrow(data) width = as.integer(width)[1] if (width = 0 || width = nr) stop(paste(width must be in the range 1,...,, nr, sep=)) nreg = nr - width base = 0:(width - 1) sumrys - lapply(1:nreg, function(st) { summary(lm(formula, data[base + st,])) }) list(coefficients = sapply(sumrys, function(sm) coef(sm)[,Estimate]), Std.Error = sapply(sumrys, function(sm) coef(sm)[,Std. Error]), sigma = sapply(sumrys, [[, sigma), r.squared =
Re: [R] Moving window regressions - how can I improve this code?
Douglas Bates produced an extremely useful post so I hesitate to add more but here goes anyways. 1. The bug I referred to in my previous post is still in the code. The last data point in women never gets used. I think this goes beyond just making a simple error but gets to the point of how to avoid index arithmetic in the first place due to its error prone nature and excess complexity.Using embed() or running() (the latter is in package gregmisc) would do that although at the expense of more memory. 2. I find using stopifnot convenient for testing assertions. There is also a bug in the assertion code shown (width should be allowed to equal nr) and again I think its due to the introduction of complexity. The problem is that its harder to get it right when you have to invert the assertion condition. I have recently discovered the R stopifnot function which allows one to state assertions without inverting them and now use it all the time. Once you use stopifnot the assertion becomes somewhat clearer since its not muddied by the inversion and in my mind it starts to bring out a wider range of considerations such as whether a width with zero or fewer degrees of freedom should be allowed. Perhaps the condition should be width length(all.vars(formula)) or if formulae such as y~x-1 are allowed then a more complex specification. With these changes (except I've kept width 0), movingWindow becomes movingWindow - function(formula, data, width, ...) { nr - nrow(data) width - as.integer(width)[1] stopifnot( width 0, width = nr ) indices - as.data.frame( t( embed( 1:nr, width ) ) ) lapply(indices, function(st) summary(lm(formula, data[st,])) ) } Similar comments could be applied to movingWindow2. Douglas Bates bates at stat.wisc.edu writes: : : Ajay Shah ajayshah at mayin.org writes: : : I wrote a function which does moving window regressions. E.g. if : there are 100 observations and the window width is 50, then I first : run the regression for observations 1..50, then for 2..51, and so on. : : I am extremely pleased with R in my experience with writing this, : since I was able to pass the model as an argument into the function : Forgive me if I sound naive, but that's rocket science to me!! : : For a regression with K explanatory variables, I make a matrix with : 2*K+2 columns, where I capture: : K coefficients and K standard errors : the residual sigma : R^2 : : My code is: : : # : movingWindowRegression - function(data, T, width, model, K) { : results = matrix(nrow=T, ncol=2*K+2) : for (i in width:T) { : details - summary.lm(lm(as.formula(model), data[(i-width+1):i,])) : n=1; : for (j in 1:K) { : results[i, n] = details$coefficients[j, 1] : results[i, n+1] = details$coefficients[j, 2] : n = n + 2 : } : results[i, n] = details$sigma : results[i, n+1] = details$r.squared : } : return(results) : } : : # Simulate some data for a linear regression : T = 20 : x = runif(T); y = 2 + 3*x + rnorm(T); : D = data.frame(x, y) : : r = movingWindowRegression(D, T=T, width=10, model=y ~ x, K=2) : print(r) : # : : I would be very happy if you could look at this and tell me how to do : things better. : : First, thanks for posting the code. It takes courage to send your : code to the list like this for public commentary. However, questions : like this provide instructive examples. : : Some comments: : : As Gabor pointed out, there is a generic function nrow that can be : applied to data frames. : : As a matter of style, it is better to use : details - summary(lm(model, data[row range,])) : That is, call the generic function summary, not the specific name : of the method summary.lm. It is redundant to use summary.lm(lm(...)) : and this construction is not guaranteed to continue to be valid. : : With regard to the looping, I would suggest using a list of summary : objects as the basic data structure within your function, then : extracting the pieces that you want from that list using lapply or sapply. : : That is, I would start with : : movingWindow - function(formula, data, width, ...) { : nr = nrow(data) : width = as.integer(width)[1] : if (width = 0 || width = nr) : stop(paste(width must be in the range 1,...,, nr, sep=)) : nreg = nr - width : base = 0:(width - 1) : sumrys - lapply(1:nreg, : function(st) { : summary(lm(formula, data[base + st,])) : }) : sumrys : } : : I changed the argument name model to formula and changed the : order of the arguments to the standard order used in R modeling : functions. : : You may not want to use this
[R] Modalwert
Hai - kann mir jemand sagen, wie ich den Modalwert in R berechne?! IRgendwie finde ich den Befehl nicht greetz und herzlichen Dank Sonja __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Modalwert
On 04/24/04 19:03, Sonja Dornieden wrote: Hai - kann mir jemand sagen, wie ich den Modalwert in R berechne?! IRgendwie finde ich den Befehl nicht Veilleicht which.max() -- Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page:http://www.sas.upenn.edu/~baron R page: http://finzi.psych.upenn.edu/ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Moving window regressions - how can I improve this code?
Gabor Grothendieck [EMAIL PROTECTED] writes: Douglas Bates produced an extremely useful post so I hesitate to add more but here goes anyways. 1. The bug I referred to in my previous post is still in the code. The last data point in women never gets used. I think this goes beyond just making a simple error but gets to the point of how to avoid index arithmetic in the first place due to its error prone nature and excess complexity.Using embed() or running() (the latter is in package gregmisc) would do that although at the expense of more memory. You're right. I miscounted and it is easy to do that when manipulating the indices explicitly. I wasn't aware of the embed function but, now that you have pointed it out, I agree that it should be used here. 2. I find using stopifnot convenient for testing assertions. There is also a bug in the assertion code shown (width should be allowed to equal nr) and again I think its due to the introduction of complexity. The problem is that its harder to get it right when you have to invert the assertion condition. I have recently discovered the R stopifnot function which allows one to state assertions without inverting them and now use it all the time. Once you use stopifnot the assertion becomes somewhat clearer since its not muddied by the inversion and in my mind it starts to bring out a wider range of considerations such as whether a width with zero or fewer degrees of freedom should be allowed. Perhaps the condition should be width length(all.vars(formula)) or if formulae such as y~x-1 are allowed then a more complex specification. I agree that stopifnot is the best construction for assertions. Thanks for pointing that out. It is one of the very useful utilities added by Martin Maechler, who also wrote the str function which gets my vote for the best debugging tool in R. If you don't know the total number of coefficients then you would need to assert only width 0. If you do know the number of coefficients then you can assert width = ncoef. With these changes (except I've kept width 0), movingWindow becomes movingWindow - function(formula, data, width, ...) { nr - nrow(data) width - as.integer(width)[1] stopifnot( width 0, width = nr ) indices - as.data.frame( t( embed( 1:nr, width ) ) ) lapply(indices, function(st) summary(lm(formula, data[st,])) ) } Similar comments could be applied to movingWindow2. Adopting your suggestions I would change the movingWindow2 code to movingWindow2 - function(formula, data, width, ...) { mCall = match.call() mCall$width = NULL mCall[[1]] = as.name(lm) mCall$x = mCall$y = TRUE bigfit = eval(mCall, parent.frame()) ncoef = length(coef(bigfit)) nr = nrow(data) width = as.integer(width)[1] stopifnot(width = ncoef, width = nr) y = bigfit$y x = bigfit$x terms = bigfit$terms inds = embed(seq(nr), width)[, rev(seq(width))] sumrys - lapply(seq(nrow(inds)), function(st) { ind = inds[st,] fit = lm.fit(x[ind,], y[ind]) fit$terms = terms class(fit) = lm summary(fit) }) list(coefficients = sapply(sumrys, function(sm) coef(sm)[,Estimate]), Std.Error = sapply(sumrys, function(sm) coef(sm)[,Std. Error]), sigma = sapply(sumrys, [[, sigma), r.squared = sapply(sumrys, [[, r.squared)) } for which the test gives data(women) movingWindow2(weight ~ height, women, 9) $coefficients [,1] [,2] [,3] [,4] [,5][,6] (Intercept) -59.8 -67.127778 -73.42 -81.97222 -91.8 -106.127778 height3.0 3.116667 3.216667 3.35000 3.53.716667 [,7] (Intercept) -122.96 height 3.97 $Std.Error [,1] [,2] [,3] [,4] [,5] [,6] (Intercept) 3.77647036 2.64466036 3.70537766 4.71400546 5.1522157 6.60219186 height 0.06085806 0.04194352 0.05784947 0.07246601 0.0780042 0.09846709 [,7] (Intercept) 7.7792788 height 0.1143188 $sigma [1] 0.4714045 0.3248931 0.4481000 0.5613193 0.6042180 0.7627228 0.8855094 $r.squared [1] 0.9971276 0.9987338 0.9977411 0.9967352 0.9965351 0.9951107 0.9942195 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Colour coding and point types in a plot
R 1.8.1, Windows 2000. I am trying to find the legend for color coding and point types in a plot, which probably are standard for everybody but myself. Thanks for the tip! Mihai Nica Jackson State University No good deed will ever remain unpunished [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] determinant via Lapack
Hi, What Lapack routine(s) should I use to calculate the determinant of a symmetric matrix of real numbers? I would appreciate any guidance on this issue. Thanks, Kosuke - Kosuke Imai Office: Corwin Hall 041 Assistant Professor Phone: 609-258-6601 (Direct) Department of PoliticsFax: 609-258-1110 (Department) Princeton University Email: [EMAIL PROTECTED] Princeton, NJ 08544-1012 http://www.princeton.edu/~kimai __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
dd [Was: Re: [R] Selection of cities sample]
On Friday 23 of April 2004 04:37, Matej Cepl wrote: I have a question, how to most properly select set of cities which would be as similar as possible in some particular variables with the City of Boston (which I use as my base line). Hi, how to weigh variables used in daisy function? After week spent with MASS, Crawley (2002), and Gordon (1999), I finished with this function (which is actually not a real function but just convenient packaging of one complex expression): function(x) { require(cluster) return(hclust(daisy( as.matrix(x), metric=euclidean, stand=TRUE), method=average) ) } When plotting this I got a huge tree (available in PDF on http:// www.ceplovi.cz/matej/tmp/mctree.pdf), which seems to be very helpful, because by selecting particular cluster I get my group of cities to use as a sample. Would anybody be so kind and comment on this code, please? Now, I would love to weigh some variables in a dataframe used for calculation (because I am more concerned with some variables more than with others, which should be included with lower weigh). In help(daisy) I found this: If 'nok' is the number of nonzero weights, the dissimilarity is multiplied by the factor '1/nok' and thus ranges between 0 and 1. Do I understand correctly that this allows weighing of non-interval (non-continuous) variables? If yes, how can weigh variables, which are interval (whole my table is from counts and two percent variables)? Thanks for any reply, Matej Cepl -- Matej Cepl, http://www.ceplovi.cz/matej GPG Finger: 89EF 4BC6 288A BF43 1BAB 25C3 E09F EF25 D964 84AC 138 Highland Ave. #10, Somerville, Ma 02143, (617) 623-1488 Just remember, brothers and sisters--their skins may be white, but their souls are just as black as ours! -- a black preacher __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] RE: [Rd] a simple suggestion for the next version of R windows
Hi, Andy: Thanks! I think it also depends on people's working hobby. What you suggested is a good way around it. But I'm still thinking since the R window already has the big blue R logo to identify itself, the word R console is really redundant and could be replaced by something more informative. Not nesessarly everytime you change to a new diretory, you need to change to a new identifier, but at least the every first .RData file you loaded. I don't know... Something along that line. It will be very helpful when you use Alt+Tab to move between windows, b/c all you see are R consoles... ...Tao Original Message Follows From: Liaw, Andy [EMAIL PROTECTED] To: 'Tao Shi' [EMAIL PROTECTED],[EMAIL PROTECTED] CC: [EMAIL PROTECTED] Subject: RE: [Rd] a simple suggestion for the next version of R windows Date: Sat, 24 Apr 2004 23:18:06 -0400 I don't see how this can work. I frequently run only one R session (also in SDI), but change working directories several times during the session, with or without loading the workspace files in those directories, depending on what I need to do. I don't think the Window title can change every time I do setwd(somewhereElse); load(.RData). One possibility that could probably help you is to change the R command prompt from to the current working directory plus . I believe this can be done with the taskCallbackManager(). Andy From: Tao Shi Is it possible to replace the word R Console on the title bar (is it what it's called? It's the blue area above menu bar) with the name of the work space file you're using or loaded, so people who are runing multple R sessions at same time can identify them immediately. I'm using 1.9.0 in SDI mode. Thanks, ...Tao _ Is your PC infected? Get a FREE online computer virus scan from McAfee(r) Security. http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-devel -- Notice: This e-mail message, together with any attachments,...{{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] determinant via Lapack
This is not an R question, is it? You'd do better by consulting the Lapack manual. Googling around turns up http://www.ma.utexas.edu/documentation/lapack/node134.html, which I think is enough for you. Andy From: Kosuke Imai Hi, What Lapack routine(s) should I use to calculate the determinant of a symmetric matrix of real numbers? I would appreciate any guidance on this issue. Thanks, Kosuke - Kosuke Imai Office: Corwin Hall 041 Assistant Professor Phone: 609-258-6601 (Direct) Department of PoliticsFax: 609-258-1110 (Department) Princeton University Email: [EMAIL PROTECTED] Princeton, NJ 08544-1012 http://www.princeton.edu/~kimai __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Notice: This e-mail message, together with any attachments,...{{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html