date:20040424

Re: [R] time zones in POSIXt

2004-04-24 Thread Gabor Grothendieck


I think this is a problem with difftime (which gets called when the
subtraction in your example is invoked).  The first few lines of 
difftime are:

function (time1, time2, tz = , units = c(auto, secs, mins, 
hours, days, weeks)) 
{
time1 - as.POSIXct(time1, tz = tz)
time2 - as.POSIXct(time2, tz = tz)

so any POSIXlt timezone gets clobbered with difftime's tz.   



Vadim Ogranovich vograno at evafunds.com writes:

: 
: Hi,
: 
: I have two data sources. One records time in PST time zone, the other in
: GMT. I want to compute the difference between the two, but don't see
: how. Here is an example where I compute time difference between
: identical times each (meant to be) relative to its time zone.
: 
:  as.POSIXlt(2000-05-10 10:15:00,  PST) -  as.POSIXlt(2000-05-10
: 10:15:00,  GMT)
: Time difference of 0 secs
: 
: I was expecting to see 8hrs (which is the time difference between London
: and San-Francisco). Why is it so and what is the correct way of doing
: it?
: 
: I use R-1.8.1 on RH-7.3.
: 
: Thanks,
: Vadim
:  
: 
:   [[alternative HTML version deleted]]
: 
: __
: R-help at stat.math.ethz.ch mailing list
: https://www.stat.math.ethz.ch/mailman/listinfo/r-help
: PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
: 
:

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

RE: [R] time zones in POSIXt

2004-04-24 Thread Prof Brian Ripley

On Fri, 23 Apr 2004, Vadim Ogranovich wrote:

 Thank you for the lead, Dirk! Indeed this works on my machine too:
 
  as.POSIXct(2000-05-10 10:15:00,  tz=PST8PDT) -
 as.POSIXct(2000-05-10 10:15:00,  tz=GMT)
 Time difference of 7 hours
 
 
 However when I replace POSIXct by POSIXlt it breaks (this looks like a
 bug to me):
 
  as.POSIXlt(2000-05-10 10:15:00,  tz=PST8PDT) -
 as.POSIXlt(2000-05-10 10:15:00,  tz=GMT)
 Time difference of 0 secs

No, it's not a bug.  POSIXlt times are just numeric representations of
broken-down times, and the tzone attribute (which is normally not present)
is just a reminder to you.  (It has not been checked, so this is avoid
nasty surprises if it is wrong.)  Maybe it was a bad idea to allow - for
POSIXlt times, but then we do expect users to read the documentation
(including the code if they are puzzled).

 Now a couple of new questions:
 * how could I learn about appropriate names for time zones? For example
 I was using PST whereas it seems I had to use either PST8 or
 PST8PDT. Why PST was not good? Is it documented anywhere?

Specific to your OS, so I expect it documents it somewhere.

 * there seems to be no difference betweeen GMT and BST on my machine
 though PST8 and PST8PDT are treated properly:
 
 # PST8 is not identical to PST8PDT
  ISOdatetime(2003, seq(12), 1, 10, 0, 0, tz=PST8) - ISOdatetime(2003,
 seq(12), 1, 10, 0, 0, tz=PST8PDT)
 Time differences of0,0,0,0, 3600, 3600, 3600, 3600,
 3600, 3600,0,0 secs
 
 # GMT0 is identical to BST
  ISOdatetime(2003, seq(12), 1, 10, 0, 0, tz=GMT0) - ISOdatetime(2003,
 seq(12), 1, 10, 0, 0, tz=BST)
 Time differences of 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 secs
 
 Why is such dichotomy?

Ask your OS designer - timezones are handled by your OS.  I would not 
consider BST to be a timezone as it is only defined for half the year.
Windiows thinks I am on `GMT Daylight Time' although that usage is 
otherwise unknown in this country.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Fatal Error: INVALID HOMEDRIVE

2004-04-24 Thread Prof Brian Ripley

On Fri, 23 Apr 2004, Brett Melbourne wrote:

 I also set HOMEPATH in my System control panel as you suggested, then
 rebooted ... the same error occurs. Windows XP appears to be over-riding the
 setting because no HOMEPATH appears in the environment even when it's set in
 the control panel thus.

That was my original hypothesis.

 But R starts fine with C:\Program Files\R\rw1090\bin\Rgui.exe HOMEPATH=\
 in the Target field of the icon.

Better to set HOME there

C:\Program Files\R\rw1090\bin\Rgui.exe HOME=C:\

in case HOMEDRIVE disappears too.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Moving window regressions - how can I improve this code?

2004-04-24 Thread Gabor Grothendieck

1. ?nrow
2. ?all.vars

Getting rid of loop was discussed on Mar 22 and there was some debate
on whether or not it was a good idea although it turned out there was
a bug in the code that only the loop free version brought out.  
See archives.

Ajay Shah ajayshah at mayin.org writes:

: 
: I wrote a function which does moving window regressions. E.g. if
: there are 100 observations and the window width is 50, then I first
: run the regression for observations 1..50, then for 2..51, and so on.
: 
: I am extremely pleased with R in my experience with writing this,
: since I was able to pass the model as an argument into the function
:  Forgive me if I sound naive, but that's rocket science to me!!
: 
: For a regression with K explanatory variables, I make a matrix with
: 2*K+2 columns, where I capture:
: K coefficients and K standard errors
: the residual sigma
: R^2
: 
: My code is:
: 
:#  
:movingWindowRegression - function(data, T, width, model, K) {
:  results = matrix(nrow=T, ncol=2*K+2)
:  for (i in width:T) {
:details - summary.lm(lm(as.formula(model), data[(i-width+1):i,]))
:n=1;
:for (j in 1:K) {
:  results[i, n]   = details$coefficients[j, 1]
:  results[i, n+1] = details$coefficients[j, 2]
:  n = n + 2
:}
:results[i, n] = details$sigma
:results[i, n+1] = details$r.squared
:  }
:  return(results)
:}
: 
:# Simulate some data for a linear regression
:T = 20
:x = runif(T); y = 2 + 3*x + rnorm(T);
:D = data.frame(x, y)
: 
:r = movingWindowRegression(D, T=T, width=10, model=y ~ x, K=2)
:print(r)
:#  
: 
: I would be very happy if you could look at this and tell me how to do
: things better.
: 
: I have two specific questions:
: 
:   1. I find it silly that I have to manually pass K and T into the
:  function. It would be so much nicer to have:
: 
: r = movingWindowRegression(D,  width=10, model=y ~ x)
:  instead of the existing
: r = movingWindowRegression(D, T=T, width=10, model=y ~ x, K=2)
: 
:  How can the function inspect the data frame D and learn the
:  number of rows?
: 
:  How can the function inspect the model specification string and
:  learn K, the number of explanatory variables?
: 
:   2. The R way consists of avoiding loops when the code is
:  vectorisable. I am using a loop to copy out from
:  details$coefficients into the columns of results[i,]. Is there a
:  better way?
:

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Changing Gui preferences

2004-04-24 Thread Prof Brian Ripley

On Sat, 24 Apr 2004, Murray Jorgensen wrote:

 I want to modify my Gui preferences to get rid of the MDI toolbar and 
 status bar. [ Under Windows XP and R 1.8.1 and 1.9.0.]
 
 When I uncheck the boxes and click apply I am told to save the 
 preferences and the changes will take effect when restarting R. I click 
 save and a box comes up to ask me to save Rconsole (I'm not sure where 
 this should be saved but I navigate my way to where the old Rconsole was 
 and save on top of it.

?Rconsole tells you where the copy in use is saved.

  Strangely I am not asked if I want to replace the old Rconsole.

That's programmable, but from this dialog box one would normally want to 
overwrite, no?

 Anyway when I re-start there are no changes and I am still cluttered up 
 by the MDI bars that I want to get rid of!

I think this might be another manifestation of that Microsoft bug, since 
the normal place to look is in the user's home directory.  You need to 
make sure the copy you saved is being used.  (I have just tested this, and 
it does still work.)

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] time zones in POSIXt

2004-04-24 Thread Gabor Grothendieck

Prof Brian Ripley ripley at stats.ox.ac.uk writes:

:  However when I replace POSIXct by POSIXlt it breaks (this looks like a
:  bug to me):
:  
:   as.POSIXlt(2000-05-10 10:15:00,  tz=PST8PDT) -
:  as.POSIXlt(2000-05-10 10:15:00,  tz=GMT)
:  Time difference of 0 secs
: 
: No, it's not a bug.  POSIXlt times are just numeric representations of
: broken-down times, and the tzone attribute (which is normally not present)
: is just a reminder to you.  (It has not been checked, so this is avoid
: nasty surprises if it is wrong.)  Maybe it was a bad idea to allow - for
: POSIXlt times, but then we do expect users to read the documentation
: (including the code if they are puzzled).

This this is by design then its a bug by virtue of being a design error.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] anova(my.lme.model, type=marginal) in Pinheiro and Bates

2004-04-24 Thread Federico Calboli

Dear All,

I was recently discussing the use of 

anova(my.lme.model, type=marginal)

with a colleague. I would like to find where the subject is discussed in
Pinheiro and Bates 2000. I like the book because it is clear and
accessible even for people with limited formal training in Math and
Stats like me, but I failed to find the subject at hand in the index. 

I realize that reading the book cover to cover would do me a lot of
good, but I would find it a bit impractical at the moment... can anyone
please point out where in the book the use of marginal anova is
discussed (if at all)? 

Regards,

Federico Calboli


-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 209 4286

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] Data import, going from 8 to 550 columns

2004-04-24 Thread CG Pettersson

Hello all!

I need to import a NIR dataset into R. It should be quite trivial, but
I 
can´t make it work. (No problems with the text in the beginning, as #
is 
recognised by read.table as the comment sign.)

The thing I can´t get around is the CR that ends every line after
column eight as the line in R should be 550 columns wide (including
the JF-number). 
Every new line in R should begin with the JF2455 and so on.
Naturally it is possible to re-shape the tables in Excel before
import, but it is quite tedious and doesn´t feel right...!

How do I make read.table to just go on reading on the next line when
it comes to CR, and how do I make it use the double CR followed by
a blank to begin the next line?

The data-file(s) looks like this:


#ID=Samples from soil scanning
#SAMPLE_NUMBERS_PRESENT=Y
#NX_VARIABLES=550
#NY_VARIABLES=0
#FIRST_WAVELENGTH=1300.00
#LAST_WAVELENGTH=2398.00
#WAVELENGTH_INCREMENT=2.00
JF2455  0.4367495 0.4365539 0.4363573 0.4361560 0.4359702 0.4357788 
0.4355963 0.4354126 0.4352311 0.4350726 0.4349101 0.4347557 0.4346097
0.4344587 
0.4343193 0.4341759 0.4340320 0.4338984 0.4337671 0.4336369 0.4335097
0.4333864 
  the original table is 8 columns wide, ended with a CR
  sixty four lines removed here

0.5015950 0.5020472 0.5026294 0.5033303 0.5041344 0.5049909 0.5059010
0.5067372 
0.5075415 0.5082389 0.5089509 0.5095288 0.5101137 0.5106306 0.5111954
0.5116805 
 
 JF2456  0.3604568 0.3600681 0.3596676 0.3592694 0.3588919 0.3585098 
0.3581379 0.3577725 0.3573992 0.3570563 0.3566975 0.3563588 0.3560365
0.3556931 
0.3553730 0.3550543 0.3547286 0.3544230 0.3541073 0.3537982 0.3535004
0.3531921 
0.3529077 0.3526271 0.3523493 0.3520919 0.3518271 0.3515673 0.3513192
0.3510693 
0.3508208 0.3505693 ...

and so on

Thanks!
/CG


CG Pettersson, MSci, PhD Stud.
Swedish University of Agricultural Sciences
Dep. of Ecology and Crop Production. Box 7043
SE-750 07 Uppsala

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

RE: [R] Changing Gui preferences

2004-04-24 Thread Henrik Bengtsson

Hi, a quick reply to help troubleshoot. The below (is minimal and)
works on both R v1.8.1 and R v1.9.0 on WinXP Pro:

From http://www.maths.lth.se/help/R/Rgui/:

By default all plot windows (The Figure 2 etc windows) opened in
Rgui are placed within the Rgui frame. Some people prefer this others
don't. An option is to let the main window and all plot windows live
separately. See screenshots to the right. 

This is done by selecting GUI preferences... in the Edit menu.
Then select SDI (instead of MDI) next to Single or multiple
windows. The click Save and accept the suggested pathname
(Rconsole) by clicking Save in the opened file dialog. The click
Cancel (yes, it is indeed ok). Restart R to get effectuate the
settings.

My R_USER and HOME as seen by R are:

 Sys.getenv(R_USER)
  R_USER 
C:\\Documents and Settings\\hb 
 Sys.getenv(HOME)
HOME 
   

Maybe it helps you on the way...

Henrik Bengtsson




 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Murray 
 Jorgensen
 Sent: den 24 april 2004 11:05
 To: Prof Brian Ripley; R-help
 Subject: Re: [R] Changing Gui preferences
 
 
 Prof Brian Ripley wrote:
 
  On Sat, 24 Apr 2004, Murray Jorgensen wrote:
  
  
 I want to modify my Gui preferences to get rid of the MDI 
 toolbar and
 status bar. [ Under Windows XP and R 1.8.1 and 1.9.0.]
 
 When I uncheck the boxes and click apply I am told to save the
 preferences and the changes will take effect when 
 restarting R. I click 
 save and a box comes up to ask me to save Rconsole (I'm 
 not sure where 
 this should be saved but I navigate my way to where the old 
 Rconsole was 
 and save on top of it.
  
  
  ?Rconsole tells you where the copy in use is saved.
 
 What it says is:
 
   There are system copies of these files in 'R_HOME\etc'. 
  Users can
   have personal copies of the files: these are looked for in the
   location given by the environment variable 'R_USER'. The
system
   files are read only if a corresponding personal file is 
 not found.
 
   If the environment variable 'R_USER' is not set, the R 
 system sets
   it to 'HOME' if that is set (stripping any trailing slash),
   otherwise to '{HOMEDRIVE}{HOMEPATH}' if 'HOMEDRIVE' is set
   otherwise to the working directory.
 
 Now under Control PanelSystemAdvanced I find a box listing 
 environmental variables. I find no R_HOME or R_USER listed there so 
 presumably they are not set.
 
 I'm not sure how to find 'HOME' but I presume that it is the 
 path that 
 appears in the 'Change directory' box when you open this through the

 gui. In which case for me 'HOME' is
 
 C:\apps\R\rw1090
 
 Now I have 3 versions of Rconsole near here. They are in
 
 C:\apps\R\rw1090\etc
 C:\apps\R\rw1090   and
 C:\apps\R
 
 I still don't know how to change my Gui preferences!
 
  [...]
 
 Murray
 
 -- 
 Dr Murray Jorgensen
http://www.stats.waikato.ac.nz/Staff/maj.html
 Department of Statistics, University of Waikato, Hamilton, New
Zealand
 Email: [EMAIL PROTECTED]Fax 7 838
4155
 Phone  +64 7 838 4773 wk+64 7 849 6486 homeMobile 021 1395
862
 
 __
 [EMAIL PROTECTED] mailing list 
 https://www.stat.math.ethz.ch/mailma n/listinfo/r-help
 PLEASE 
 do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Data import, going from 8 to 550 columns

2004-04-24 Thread Prof Brian Ripley

You cannot use read.table() to read multi-line records as here.
Why?  It is not a table?

You can use scan() to do this.

On Sat, 24 Apr 2004, CG Pettersson wrote:

 Hello all!
 
 I need to import a NIR dataset into R. It should be quite trivial, but
 I 
 can´t make it work. (No problems with the text in the beginning, as #
 is 
 recognised by read.table as the comment sign.)
 
 The thing I can´t get around is the CR that ends every line after
 column eight as the line in R should be 550 columns wide (including
 the JF-number). 
 Every new line in R should begin with the JF2455 and so on.
 Naturally it is possible to re-shape the tables in Excel before
 import, but it is quite tedious and doesn´t feel right...!
 
 How do I make read.table to just go on reading on the next line when
 it comes to CR, and how do I make it use the double CR followed by
 a blank to begin the next line?
 
 The data-file(s) looks like this:
 
 
 #ID=Samples from soil scanning
 #SAMPLE_NUMBERS_PRESENT=Y
 #NX_VARIABLES=550
 #NY_VARIABLES=0
 #FIRST_WAVELENGTH=1300.00
 #LAST_WAVELENGTH=2398.00
 #WAVELENGTH_INCREMENT=2.00
 JF2455  0.4367495 0.4365539 0.4363573 0.4361560 0.4359702 0.4357788 
 0.4355963 0.4354126 0.4352311 0.4350726 0.4349101 0.4347557 0.4346097
 0.4344587 
 0.4343193 0.4341759 0.4340320 0.4338984 0.4337671 0.4336369 0.4335097
 0.4333864 
   the original table is 8 columns wide, ended with a CR
   sixty four lines removed here
 
 0.5015950 0.5020472 0.5026294 0.5033303 0.5041344 0.5049909 0.5059010
 0.5067372 
 0.5075415 0.5082389 0.5089509 0.5095288 0.5101137 0.5106306 0.5111954
 0.5116805 
  
  JF2456  0.3604568 0.3600681 0.3596676 0.3592694 0.3588919 0.3585098 
 0.3581379 0.3577725 0.3573992 0.3570563 0.3566975 0.3563588 0.3560365
 0.3556931 
 0.3553730 0.3550543 0.3547286 0.3544230 0.3541073 0.3537982 0.3535004
 0.3531921 
 0.3529077 0.3526271 0.3523493 0.3520919 0.3518271 0.3515673 0.3513192
 0.3510693 
 0.3508208 0.3505693 ...
 
 and so on
 
 Thanks!
 /CG
 
 
 CG Pettersson, MSci, PhD Stud.
 Swedish University of Agricultural Sciences
 Dep. of Ecology and Crop Production. Box 7043
 SE-750 07 Uppsala
 
 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 
 

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Changing Gui preferences

2004-04-24 Thread Bernardo Rangel Tura

At 06:05 24/04/2004, you wrote:

I'm not sure how to find 'HOME' but I presume that it is the path that appears in the 
'Change directory' box when you open this through the gui. In which case for me 
'HOME' is

C:\apps\R\rw1090

Now I have 3 versions of Rconsole near here. They are in

C:\apps\R\rw1090\etc
C:\apps\R\rw1090   and
C:\apps\R

I still don't know how to change my Gui preferences!

Murray,

I don´t use win XP, I use win 98 SE.
In my system I know de 'HOME' with a simple procedure:

1- Look R shortcut
2- Right Click in R shortcut
3- Choose properties
4- Look satrt in box
5- de directorie in box is the 'HOME'



Bernardo Rangel Tura, MD, MSc
National Institute of Cardiology Laranjeiras
Rio de Janeiro Brazil  
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Changing Gui preferences

2004-04-24 Thread Prof Brian Ripley

This *is* in the rw-FAQ.  This answer is correct on Win 98, but not on 
WinXP (when the latter is not broken by a hotfix patch).

On Sat, 24 Apr 2004, Bernardo Rangel Tura wrote:

 At 06:05 24/04/2004, you wrote:
 
 I'm not sure how to find 'HOME' but I presume that it is the path that appears in 
 the 'Change directory' box when you open this through the gui. In which case for me 
 'HOME' is
 
 C:\apps\R\rw1090
 
 Now I have 3 versions of Rconsole near here. They are in
 
 C:\apps\R\rw1090\etc
 C:\apps\R\rw1090   and
 C:\apps\R
 
 I still don't know how to change my Gui preferences!
 
 Murray,
 
 I don´t use win XP, I use win 98 SE.
 In my system I know de 'HOME' with a simple procedure:
 
 1- Look R shortcut
 2- Right Click in R shortcut
 3- Choose properties
 4- Look satrt in box
 5- de directorie in box is the 'HOME'
 
 
 
 Bernardo Rangel Tura, MD, MSc
 National Institute of Cardiology Laranjeiras
 Rio de Janeiro Brazil  
 

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Moving window regressions - how can I improve this code?

2004-04-24 Thread Douglas Bates

Ajay Shah [EMAIL PROTECTED] writes:

 I wrote a function which does moving window regressions. E.g. if
 there are 100 observations and the window width is 50, then I first
 run the regression for observations 1..50, then for 2..51, and so on.
 
 I am extremely pleased with R in my experience with writing this,
 since I was able to pass the model as an argument into the function
 :-) Forgive me if I sound naive, but that's rocket science to me!!
 
 For a regression with K explanatory variables, I make a matrix with
 2*K+2 columns, where I capture:
 K coefficients and K standard errors
 the residual sigma
 R^2
 
 My code is:
 
#  
movingWindowRegression - function(data, T, width, model, K) {
  results = matrix(nrow=T, ncol=2*K+2)
  for (i in width:T) {
details - summary.lm(lm(as.formula(model), data[(i-width+1):i,]))
n=1;
for (j in 1:K) {
  results[i, n]   = details$coefficients[j, 1]
  results[i, n+1] = details$coefficients[j, 2]
  n = n + 2
}
results[i, n] = details$sigma
results[i, n+1] = details$r.squared
  }
  return(results)
}

# Simulate some data for a linear regression
T = 20
x = runif(T); y = 2 + 3*x + rnorm(T);
D = data.frame(x, y)

r = movingWindowRegression(D, T=T, width=10, model=y ~ x, K=2)
print(r)
#  
 
 I would be very happy if you could look at this and tell me how to do
 things better.

First, thanks for posting the code.  It takes courage to send your
code to the list like this for public commentary.  However, questions
like this provide instructive examples.

Some comments:

 As Gabor pointed out, there is a generic function nrow that can be
 applied to data frames.

 As a matter of style, it is better to use
details - summary(lm(model, data[row range,]))
 That is, call the generic function summary, not the specific name
 of the method summary.lm.  It is redundant to use summary.lm(lm(...))
 and this construction is not guaranteed to continue to be valid.

 With regard to the looping, I would suggest using a list of summary
 objects as the basic data structure within your function, then
 extracting the pieces that you want from that list using lapply or sapply.

 That is, I would start with 

movingWindow - function(formula, data, width, ...) {
nr = nrow(data)
width = as.integer(width)[1]
if (width = 0 || width = nr)
stop(paste(width must be in the range 1,...,, nr, sep=))
nreg = nr - width
base = 0:(width - 1)
sumrys - lapply(1:nreg,
 function(st) {
 summary(lm(formula, data[base + st,]))
 })
sumrys
}

  I changed the argument name model to formula and changed the
  order of the arguments to the standard order used in R modeling
  functions.
 
  You may not want to use this form for very large data sets because
  the raw summaries could be very large.  However this is a place to
  start.
  
  The reason for creating a list is so you can use sapply and lapply to
  extract results from the list.  To get the coefficients and standard
  errors from a summary, use the coef generic and the select columns from
  the result.  For example,

 data(women)
 sumrys = movingWindow(weight ~ height, women, 9)
 unlist(lapply(sumrys, function(sm) sm$sigma))  # sigma values
[1] 0.4714045 0.3248931 0.4481000 0.5613193 0.6042180 0.7627228
 sapply(sumrys, [[, sigma)  # same thing
[1] 0.4714045 0.3248931 0.4481000 0.5613193 0.6042180 0.7627228
 sapply(sumrys, function(sm) coef(sm)[,1:2])
 [,1] [,2] [,3] [,4][,5]
[1,] -59.7778 -67.1278 -73.4222 -81.9722 -91.778
[2,]   3.   3.1167   3.2167   3.3500   3.500
[3,]   3.77647036   2.64466036   3.70537766   4.71400546   5.1522157
[4,]   0.06085806   0.04194352   0.05784947   0.07246601   0.0780042
  [,6]
[1,] -106.1278
[2,]3.7167
[3,]6.60219186
[4,]0.09846709
 

  The last part shows how to get the coefficients estimates and their
  standard errors as columns of a matrix.  I think I would return the
  coefficients and standard errors separately, as in

movingWindow - function(formula, data, width, ...) {
nr = nrow(data)
width = as.integer(width)[1]
if (width = 0 || width = nr)
stop(paste(width must be in the range 1,...,, nr, sep=))
nreg = nr - width
base = 0:(width - 1)
sumrys - lapply(1:nreg,
 function(st) {
 summary(lm(formula, data[base + st,]))
 })
list(coefficients = sapply(sumrys, function(sm) coef(sm)[,Estimate]),
 Std.Error = sapply(sumrys, function(sm) coef(sm)[,Std. Error]),
 sigma = sapply(sumrys, [[, sigma),
 r.squared =

Re: [R] Moving window regressions - how can I improve this code?

2004-04-24 Thread Gabor Grothendieck


Douglas Bates produced an extremely useful post so I hesitate to 
add more but here goes anyways. 

1. The bug I referred to in my previous post is still in the code.  
The last data point in women never gets used.  I think this
goes beyond just making a simple error but gets to the point of
how to avoid index arithmetic in the first place due to its
error prone nature and excess complexity.Using embed()
or running() (the latter is in package gregmisc) would do that although
at the expense of more memory.

2. I find using stopifnot convenient for testing assertions.  There
is also a bug in the assertion code shown (width should be allowed 
to equal nr) and again I think its due to the introduction of complexity.  
The problem is that its harder to get it right when you have to invert 
the assertion condition.  I have recently discovered the R 
stopifnot function which allows one to state assertions without 
inverting them and now use it all the time.   Once you use stopifnot
the assertion becomes somewhat clearer since its not muddied by the
inversion and in my mind it starts to bring out a wider range of
considerations such as whether a width
with zero or fewer degrees of freedom should be allowed.  Perhaps
the condition should be width  length(all.vars(formula)) or if
formulae such as y~x-1 are allowed then a more complex specification.

With these changes (except I've kept width  0), movingWindow becomes

movingWindow - function(formula, data, width, ...) {
nr - nrow(data)
width - as.integer(width)[1]
stopifnot( width  0, width = nr )
indices - as.data.frame( t( embed( 1:nr, width ) ) )
lapply(indices, function(st) summary(lm(formula, data[st,])) )
}

Similar comments could be applied to movingWindow2.


Douglas Bates bates at stat.wisc.edu writes:

: 
: Ajay Shah ajayshah at mayin.org writes:
: 
:  I wrote a function which does moving window regressions. E.g. if
:  there are 100 observations and the window width is 50, then I first
:  run the regression for observations 1..50, then for 2..51, and so on.
:  
:  I am extremely pleased with R in my experience with writing this,
:  since I was able to pass the model as an argument into the function
:   Forgive me if I sound naive, but that's rocket science to me!!
:  
:  For a regression with K explanatory variables, I make a matrix with
:  2*K+2 columns, where I capture:
:  K coefficients and K standard errors
:  the residual sigma
:  R^2
:  
:  My code is:
:  
: #  
: movingWindowRegression - function(data, T, width, model, K) {
:   results = matrix(nrow=T, ncol=2*K+2)
:   for (i in width:T) {
: details - summary.lm(lm(as.formula(model), data[(i-width+1):i,]))
: n=1;
: for (j in 1:K) {
:   results[i, n]   = details$coefficients[j, 1]
:   results[i, n+1] = details$coefficients[j, 2]
:   n = n + 2
: }
: results[i, n] = details$sigma
: results[i, n+1] = details$r.squared
:   }
:   return(results)
: }
: 
: # Simulate some data for a linear regression
: T = 20
: x = runif(T); y = 2 + 3*x + rnorm(T);
: D = data.frame(x, y)
: 
: r = movingWindowRegression(D, T=T, width=10, model=y ~ x, K=2)
: print(r)
: #  
:  
:  I would be very happy if you could look at this and tell me how to do
:  things better.
: 
: First, thanks for posting the code.  It takes courage to send your
: code to the list like this for public commentary.  However, questions
: like this provide instructive examples.
: 
: Some comments:
: 
:  As Gabor pointed out, there is a generic function nrow that can be
:  applied to data frames.
: 
:  As a matter of style, it is better to use
: details - summary(lm(model, data[row range,]))
:  That is, call the generic function summary, not the specific name
:  of the method summary.lm.  It is redundant to use summary.lm(lm(...))
:  and this construction is not guaranteed to continue to be valid.
: 
:  With regard to the looping, I would suggest using a list of summary
:  objects as the basic data structure within your function, then
:  extracting the pieces that you want from that list using lapply or sapply.
: 
:  That is, I would start with 
: 
: movingWindow - function(formula, data, width, ...) {
: nr = nrow(data)
: width = as.integer(width)[1]
: if (width = 0 || width = nr)
: stop(paste(width must be in the range 1,...,, nr, sep=))
: nreg = nr - width
: base = 0:(width - 1)
: sumrys - lapply(1:nreg,
:  function(st) {
:  summary(lm(formula, data[base + st,]))
:  })
: sumrys
: }
: 
:   I changed the argument name model to formula and changed the
:   order of the arguments to the standard order used in R modeling
:   functions.
: 
:   You may not want to use this

[R] Modalwert

2004-04-24 Thread Sonja Dornieden

Hai -
kann mir jemand sagen, wie ich den Modalwert in R berechne?! IRgendwie finde
ich den Befehl nicht
greetz und herzlichen Dank
Sonja

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Modalwert

2004-04-24 Thread Jonathan Baron

On 04/24/04 19:03, Sonja Dornieden wrote:
Hai -
kann mir jemand sagen, wie ich den Modalwert in R berechne?! IRgendwie finde
ich den Befehl nicht

Veilleicht
which.max()

-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page:http://www.sas.upenn.edu/~baron
R page:   http://finzi.psych.upenn.edu/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Moving window regressions - how can I improve this code?

2004-04-24 Thread Douglas Bates

Gabor Grothendieck [EMAIL PROTECTED] writes:

 Douglas Bates produced an extremely useful post so I hesitate to 
 add more but here goes anyways. 
 
 1. The bug I referred to in my previous post is still in the code.  
 The last data point in women never gets used.  I think this
 goes beyond just making a simple error but gets to the point of
 how to avoid index arithmetic in the first place due to its
 error prone nature and excess complexity.Using embed()
 or running() (the latter is in package gregmisc) would do that although
 at the expense of more memory.

You're right.  I miscounted and it is easy to do that when
manipulating the indices explicitly.  I wasn't aware of the embed
function but, now that you have pointed it out, I agree that it should
be used here.

 2. I find using stopifnot convenient for testing assertions.  There
 is also a bug in the assertion code shown (width should be allowed 
 to equal nr) and again I think its due to the introduction of complexity.  
 The problem is that its harder to get it right when you have to invert 
 the assertion condition.  I have recently discovered the R 
 stopifnot function which allows one to state assertions without 
 inverting them and now use it all the time.   Once you use stopifnot
 the assertion becomes somewhat clearer since its not muddied by the
 inversion and in my mind it starts to bring out a wider range of
 considerations such as whether a width
 with zero or fewer degrees of freedom should be allowed.  Perhaps
 the condition should be width  length(all.vars(formula)) or if
 formulae such as y~x-1 are allowed then a more complex specification.

I agree that stopifnot is the best construction for assertions.
Thanks for pointing that out.  It is one of the very useful utilities
added by Martin Maechler, who also wrote the str function which gets
my vote for the best debugging tool in R.

If you don't know the total number of coefficients then you would need
to assert only  width  0.  If you do know the number of coefficients
then you can assert width = ncoef.

 With these changes (except I've kept width  0), movingWindow becomes
 
 movingWindow - function(formula, data, width, ...) {
 nr - nrow(data)
 width - as.integer(width)[1]
 stopifnot( width  0, width = nr )
 indices - as.data.frame( t( embed( 1:nr, width ) ) )
 lapply(indices, function(st) summary(lm(formula, data[st,])) )
 }
 
 Similar comments could be applied to movingWindow2.

Adopting your suggestions I would change the movingWindow2 code to

movingWindow2 - function(formula, data, width, ...) {
mCall = match.call()
mCall$width = NULL
mCall[[1]] = as.name(lm)
mCall$x = mCall$y = TRUE
bigfit = eval(mCall, parent.frame())
ncoef = length(coef(bigfit))
nr = nrow(data)
width = as.integer(width)[1]
stopifnot(width = ncoef, width = nr)
y = bigfit$y
x = bigfit$x
terms = bigfit$terms
inds = embed(seq(nr), width)[, rev(seq(width))]
sumrys - lapply(seq(nrow(inds)),
 function(st) {
 ind = inds[st,]
 fit = lm.fit(x[ind,], y[ind])
 fit$terms = terms
 class(fit) = lm
 summary(fit)
 })
list(coefficients = sapply(sumrys, function(sm) coef(sm)[,Estimate]),
 Std.Error = sapply(sumrys, function(sm) coef(sm)[,Std. Error]),
 sigma = sapply(sumrys, [[, sigma),
 r.squared = sapply(sumrys, [[, r.squared))
}

for which the test gives

 data(women)
 movingWindow2(weight ~ height, women, 9)
$coefficients
 [,1]   [,2]   [,3]  [,4]  [,5][,6]
(Intercept) -59.8 -67.127778 -73.42 -81.97222 -91.8 -106.127778
height3.0   3.116667   3.216667   3.35000   3.53.716667
   [,7]
(Intercept) -122.96
height 3.97

$Std.Error
  [,1]   [,2]   [,3]   [,4]  [,5]   [,6]
(Intercept) 3.77647036 2.64466036 3.70537766 4.71400546 5.1522157 6.60219186
height  0.06085806 0.04194352 0.05784947 0.07246601 0.0780042 0.09846709
 [,7]
(Intercept) 7.7792788
height  0.1143188

$sigma
[1] 0.4714045 0.3248931 0.4481000 0.5613193 0.6042180 0.7627228 0.8855094

$r.squared
[1] 0.9971276 0.9987338 0.9977411 0.9967352 0.9965351 0.9951107 0.9942195

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] Colour coding and point types in a plot

2004-04-24 Thread Mihai Nica

R 1.8.1, Windows 2000.

I am trying to find the legend for color coding and point types in a plot, which 
probably are standard for everybody but myself. Thanks for the tip!

Mihai Nica
Jackson State University

No good deed will ever remain unpunished
[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] determinant via Lapack

2004-04-24 Thread Kosuke Imai

Hi,
  What Lapack routine(s) should I use to calculate the determinant of a
symmetric matrix of real numbers? I would appreciate any guidance on this
issue.
Thanks,
Kosuke

-
Kosuke Imai   Office: Corwin Hall 041
Assistant Professor   Phone: 609-258-6601 (Direct)
Department of PoliticsFax: 609-258-1110 (Department)
Princeton University  Email: [EMAIL PROTECTED]
Princeton, NJ 08544-1012  http://www.princeton.edu/~kimai

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

dd [Was: Re: [R] Selection of cities sample]

2004-04-24 Thread Matej Cepl

On Friday 23 of April 2004 04:37, Matej Cepl wrote:
 I have a question, how to most properly select set of cities 
 which would be as similar as possible in some particular 
 variables with the City of Boston (which I use as my base
 line).  

Hi,

how to weigh variables used in daisy function? After week spent 
with MASS, Crawley (2002), and Gordon (1999), I finished with 
this function (which is actually not a real function but just 
convenient packaging of one complex expression):

function(x) {
   require(cluster)
   return(hclust(daisy(
  as.matrix(x),
  metric=euclidean,
  stand=TRUE),
  method=average)
   )
}

When plotting this I got a huge tree (available in PDF on http://
www.ceplovi.cz/matej/tmp/mctree.pdf), which seems to be very 
helpful, because by selecting particular cluster I get my group 
of cities to use as a sample. Would anybody be so kind and 
comment on this code, please?

Now, I would love to weigh some variables in a dataframe used for 
calculation (because I am more concerned with some variables 
more than with others, which should be included with lower 
weigh). In help(daisy) I found this:

If 'nok' is the number of nonzero weights, the
dissimilarity is multiplied by the factor '1/nok' and thus
ranges between 0 and 1.

Do I understand correctly that this allows weighing of 
non-interval (non-continuous) variables? If yes, how can weigh 
variables, which are interval (whole my table is from counts and 
two percent variables)?

Thanks for any reply,

Matej Cepl

-- 
Matej Cepl, http://www.ceplovi.cz/matej
GPG Finger: 89EF 4BC6 288A BF43 1BAB  25C3 E09F EF25 D964 84AC
138 Highland Ave. #10, Somerville, Ma 02143, (617) 623-1488
 
Just remember, brothers and sisters--their skins may be white,
but their souls are just as black as ours!
   -- a black preacher

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] RE: [Rd] a simple suggestion for the next version of R windows

2004-04-24 Thread Tao Shi

Hi, Andy:

Thanks!  I think it also depends on people's working hobby.

What you suggested is a good way around it.  But I'm still thinking since 
the R window already has the big blue R logo to identify itself, the word R 
console is really redundant and could be replaced by something more 
informative.  Not nesessarly everytime you change to a new diretory, you 
need to change to a new identifier, but at least the every first .RData file 
you loaded.  I don't know...  Something along that line.  It will be 
very helpful when you use Alt+Tab to move between windows, b/c all you see 
are R consoles...

...Tao

Original Message Follows
From: Liaw, Andy [EMAIL PROTECTED]
To: 'Tao Shi' [EMAIL PROTECTED],[EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
Subject: RE: [Rd] a simple suggestion for the next version of R windows
Date: Sat, 24 Apr 2004 23:18:06 -0400
I don't see how this can work.  I frequently run only one R session (also in
SDI), but change working directories several times during the session, with
or without loading the workspace files in those directories, depending on
what I need to do.  I don't think the Window title can change every time I
do setwd(somewhereElse); load(.RData).
One possibility that could probably help you is to change the R command
prompt from   to the current working directory plus  .  I believe this
can be done with the taskCallbackManager().
Andy

 From: Tao Shi

 Is it possible to replace the word R Console on the title
 bar (is it what
 it's called? It's the blue area above menu bar) with the name
 of the work
 space file you're using or loaded, so people who are runing multple R
 sessions at same time can identify them immediately.  I'm
 using 1.9.0 in SDI
 mode.

 Thanks,

 ...Tao

 _
 Is your PC infected? Get a FREE online computer virus scan
 from McAfee(r)
 Security. http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963

 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


--
Notice:  This e-mail message, together with any attachments,...{{dropped}}
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

RE: [R] determinant via Lapack

2004-04-24 Thread Liaw, Andy

This is not an R question, is it?  You'd do better by consulting the Lapack
manual.

Googling around turns up
http://www.ma.utexas.edu/documentation/lapack/node134.html, which I think is
enough for you.

Andy

 From: Kosuke Imai
 
 Hi,
   What Lapack routine(s) should I use to calculate the 
 determinant of a
 symmetric matrix of real numbers? I would appreciate any 
 guidance on this
 issue.
 Thanks,
 Kosuke
 
 -
 Kosuke Imai   Office: Corwin Hall 041
 Assistant Professor   Phone: 609-258-6601 (Direct)
 Department of PoliticsFax: 609-258-1110 (Department)
 Princeton University  Email: [EMAIL PROTECTED]
 Princeton, NJ 08544-1012  http://www.princeton.edu/~kimai
 
 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 
 


--
Notice:  This e-mail message, together with any attachments,...{{dropped}}

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] time zones in POSIXt

RE: [R] time zones in POSIXt

Re: [R] Fatal Error: INVALID HOMEDRIVE

Re: [R] Moving window regressions - how can I improve this code?

Re: [R] Changing Gui preferences

Re: [R] time zones in POSIXt

[R] anova(my.lme.model, type=marginal) in Pinheiro and Bates

[R] Data import, going from 8 to 550 columns

RE: [R] Changing Gui preferences

Re: [R] Data import, going from 8 to 550 columns

Re: [R] Changing Gui preferences

Re: [R] Changing Gui preferences

Re: [R] Moving window regressions - how can I improve this code?

Re: [R] Moving window regressions - how can I improve this code?

[R] Modalwert

Re: [R] Modalwert

Re: [R] Moving window regressions - how can I improve this code?

[R] Colour coding and point types in a plot

[R] determinant via Lapack

dd [Was: Re: [R] Selection of cities sample]

[R] RE: [Rd] a simple suggestion for the next version of R windows

RE: [R] determinant via Lapack

22 matches

Site Navigation

Mail list logo

Footer information