[R] strange dlply behavior

2009-07-21 Thread Damien Moore
I'm running R 2.9.1 on winXP, using the library plyr.

Can anyone explain to me what is going wrong in this code? (in particular
see lines marked with **) Trying to modify objects in a list
created using dlply seems to corrupt the objects in the list.

 library(plyr)
 d=as.data.frame(cbind(c(1,1,1,2,2,2),c(1,2,3,4,5,6)))
 d
  V1 V2
1  1  1
2  1  2
3  1  3
4  2  4
5  2  5
6  2  6
 c=dlply(d,.(V1))
 c
[[1]]
  V1 V2
1  1  1
2  1  2
3  1  3

[[2]]
  V1 V2
4  2  4
5  2  5
6  2  6

## display an element from the second data frame
 c[[2]][2,2]
[1] 5

## change element in the second data frame
 c[[2]][2,2]=10
 c
[[1]]
    V1 V2
2    1  2  **
2.1  1  2  **  What happened to V2?
2.2  1  2  **

[[2]]
   V1 V2
4   2  4
NA NA NA **
6   2  6

##Try again with first data frame
 c=dlply(d,.(V1))
 c[[1]][2,2]=10 **
 c
[[1]]
NULL * YIKES!


##Try again but copy c into a new list k
 c=dlply(d,.(V1))
 k=list(c[[1]],c[[2]])
 k[[1]]
  V1 V2
1  1  1
2  1  2
3  1  3
 k[[2]][2,2]=10
 k
[[1]]
  V1 V2
1  1  1
2  1  2
3  1  3

[[2]]
  V1 V2
4  2  4
5  2 10 ***
6  2  6
 k[[1]][2,2]=10
 k
[[1]]
  V1 V2
1  1  1
2  1 10 ***
3  1  3

[[2]]
  V1 V2
4  2  4
5  2 10
6  2  6
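
For comparison, the same edit can be made on a plain list built with base R's split() instead of dlply. This is only a sketch of a workaround, not an explanation of the dlply behavior above; the name pieces is made up for the illustration.

```r
# Same data as above, built directly as a data frame
d <- data.frame(V1 = c(1, 1, 1, 2, 2, 2), V2 = c(1, 2, 3, 4, 5, 6))

# split() returns an ordinary named list of data frames ("1" and "2")
pieces <- split(d, d$V1)

# Change row 2, column 2 of the second group only
pieces[["2"]][2, 2] <- 10

pieces[["1"]]  # first group is left untouched
pieces[["2"]]  # V2 is now 4, 10, 6
```

As with the list k above, each element here is an independent data frame, so assigning into one does not disturb the other.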


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plm Issues

2009-07-13 Thread Damien Moore
Duh! Thanks and good advice. I was using 2.7.2 because it was, until
recently, the latest version working with RPy (http://rpy.sourceforge.net/).
Also didn't realize plm was still actively developed.

Interesting that since plm now correctly handles diff and lag operations, it
actually breaks with the behavior of lm:

 a=ts(c(1,2,4))
 lm(a~diff(a))
Error in model.frame.default(formula = a ~ diff(a), drop.unused.levels =
TRUE) :
  variable lengths differ (found for 'diff(a)')

To regress a on its difference, one needs the more laborious:
 a=ts(c(1,2,4))
 adata=as.data.frame(cbind(a,diff(a)))
 colnames(adata)=c("a","diffa")
 lm(a~diffa,data=adata)

Call:
lm(formula = a ~ diffa, data = adata)

Coefficients:
(Intercept)        diffa
          0            2

From the R help for lm (Fitting Linear Models), section "Using time series":

Considerable care is needed when using lm with time series.

Unless na.action = NULL, the time series attributes are stripped from the
variables before the regression is done. (This is necessary as omitting NAs
would invalidate the time series attributes, and if NAs are omitted in the
middle of the series the result would no longer be a regular time series.)

Even if the time series attributes are retained, they are not used to line
up series, so that the time shift of a lagged or differenced regressor would
be ignored. It is good practice to prepare a data argument by
ts.intersect(..., dframe = TRUE), then apply a suitable na.action to that
data frame and call lm with na.action = NULL so that residuals and fitted
values are time series.
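
The practice described in the help text can be sketched as follows for the small series used above. This is a minimal illustration, assuming the goal is the same regression of a on its first difference; the names adata and fit are chosen here only for the example.

```r
a <- ts(c(1, 2, 4))

# ts.intersect() keeps only the time points common to both series
# (t = 2 and t = 3 here) and, with dframe = TRUE, returns a data frame.
adata <- ts.intersect(a, diffa = diff(a), dframe = TRUE)

# na.action = NULL as recommended in the help text
fit <- lm(a ~ diffa, data = adata, na.action = NULL)
coef(fit)
```

With only two aligned observations the fit is exact, so this is useful for checking the alignment rather than as a meaningful regression.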


On Sat, Jul 11, 2009 at 10:53 PM, milton ruser <milton.ru...@gmail.com> wrote:

 The first thing one needs to do with such an old version is update it :-)
 After that, if the problem remains, try to get help from colleagues.

 best

 milton



[R] plm Issues

2009-07-09 Thread Damien Moore
Hi List

I'm having difficulty understanding how plm should work with dynamic
formulas. See the commands and output below on a standard data set. Notice
that the first summary(plm(...)) call returns the same result as the second
(it shouldn't if it actually uses the lagged variable requested). The third
call results in an error (trying to use a diff'ed variable in the regression).
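
One way to check whether a lag was really applied is to build the lagged regressor by hand, within each panel unit, and compare. The sketch below uses only base R and a made-up two-unit panel; the data frame panel and its columns id, year, x and x_lag are hypothetical and are not part of the Gasoline data.

```r
# Toy panel: two units (A, B) observed over three years
panel <- data.frame(
  id   = rep(c("A", "B"), each = 3),
  year = rep(1:3, times = 2),
  x    = c(1, 2, 4, 10, 20, 40)
)

# Lag x by one period within each unit; the first year of every unit
# has no predecessor, so it becomes NA rather than borrowing a value
# from the previous unit.
panel$x_lag <- ave(panel$x, panel$id,
                   FUN = function(v) c(NA, head(v, -1)))

panel
```

If a model fitted on x_lag gives the same estimates as one fitted on x, the lag is not being used.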

Other info: I'm running R 2.7.2 on WinXP

cheers



 library(plm)
 data(Gasoline, package="Ecdat")
 Gasoline_plm <- plm.data(Gasoline, c("country","year"))
 pdim(Gasoline_plm)
Balanced Panel: n=18, T=19, N=342

 summary(plm(lgaspcar~lincomep, data=Gasoline_plm))
Oneway (individual) effect Within Model

Call:
plm(formula = lgaspcar ~ lincomep, data = Gasoline_plm)

Balanced Panel: n=18, T=19, N=342

Residuals :
    Min.  1st Qu.   Median  3rd Qu.     Max.
-0.40100 -0.08410 -0.00858  0.08770  0.73400

Coefficients :
         Estimate Std. Error t-value  Pr(>|t|)
lincomep -0.76183    0.03535 -21.551 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares: 17.061
Residual Sum of Squares: 6.9981
Multiple R-Squared: 0.58981
F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981

 summary(plm(lgaspcar~lag(lincomep), data=Gasoline_plm))
Oneway (individual) effect Within Model

Call:
plm(formula = lgaspcar ~ lag(lincomep), data = Gasoline_plm)

Balanced Panel: n=18, T=19, N=342

Residuals :
    Min.  1st Qu.   Median  3rd Qu.     Max.
-0.40100 -0.08410 -0.00858  0.08770  0.73400

Coefficients :
              Estimate Std. Error t-value  Pr(>|t|)
lag(lincomep) -0.76183    0.03535 -21.551 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares: 17.061
Residual Sum of Squares: 6.9981
Multiple R-Squared: 0.58981
F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981

 summary(plm(lgaspcar~diff(lincomep), data=Gasoline_plm))
Error in model.frame.default(formula = lgaspcar ~ diff(lincomep), data =
mydata,  :
  variable lengths differ (found for 'diff(lincomep)')
