Re: [R] Error message cs_lu(A) failed: near-singular A (or out of memory)

2012-12-25 Thread Arne Henningsen
Dear Rui

If you impose the homogeneity (adding-up) restrictions, your system
becomes singular, because the error terms of the share equations
always sum up to zero. Therefore, you can arbitrarily delete one of
the share equations and use the homogeneity restrictions to obtain
the coefficients that were not directly estimated. Furthermore, you
can impose the homogeneity restriction in each single equation by
normalization with one input price (numéraire).

Finally, I suggest imposing the cross-equation restrictions by the
argument restrict.regMat rather than by the argument restrict.matrix,
because the documentation says: "the advantage of imposing
restrictions on the coefficients by 'restrict.regMat' is that the
matrix, which has to be inverted during the estimation, gets smaller
by this procedure, while it gets larger if the restrictions are
imposed by 'restrict.matrix' and 'restrict.rhs'."

I will send you my lecture notes on econometric production analysis
with R by private mail. Please do not forget to cite R and the R
packages that you use in your publications. If you have further
questions regarding system estimation, microeconomic analysis, or
stochastic frontier (efficiency) analysis with R, you can use a forum
at the R-Forge sites of the systemfit [1,2], micEcon [3,4], or
frontier [5] packages/projects, respectively.
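
The singularity described above can be illustrated with a minimal base-R
sketch. It is not part of the original message: the data are simulated and
the names shares, logP and res are hypothetical. It only shows that the
residuals of share equations fitted on the same regressors add up to zero in
every observation, so their covariance matrix loses one rank until one
equation is dropped.

```r
# Simulated data (assumption): three cost shares that sum to one per observation
set.seed(1)
n <- 50
logP <- rnorm(n)                       # stand-in for a log price ratio
raw <- cbind(exp( 0.2 * logP + rnorm(n, sd = 0.1)),
             exp(-0.1 * logP + rnorm(n, sd = 0.1)),
             exp(rnorm(n, sd = 0.1)))
shares <- raw / rowSums(raw)           # each row now sums to 1

# Equation-by-equation OLS with identical regressors in every share equation
res <- sapply(1:3, function(j) residuals(lm(shares[, j] ~ logP)))

# The residuals add up to (numerically) zero in every row ...
max(abs(rowSums(res)))

# ... so the 3 x 3 residual covariance matrix is rank deficient,
# while the covariance of any two equations has full rank
rank_all     <- qr(cov(res))$rank
rank_dropped <- qr(cov(res[, -3]))$rank
c(all = rank_all, one_dropped = rank_dropped)
```

Dropping the third equation here is arbitrary; any one of the share
equations could be left out instead.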

[1] http://www.systemfit.org/
[2] http://r-forge.r-project.org/projects/systemfit/
[3] http://www.micEcon.org/
[4] http://r-forge.r-project.org/projects/micecon/
[5] http://r-forge.r-project.org/projects/frontier/

Best (holiday) wishes from Copenhagen,
Arne

On 9 December 2012 23:31, Rui Neiva ruiqne...@gmail.com wrote:
 Hi there everyone,

 I have the following model (this is naturally a simplified version just for
 showing my problem, in case you're wondering this is a translog cost
 function with the associated cost share equations):

 C ~ α + β1 log X + β2 log Y + γ1 log Z + γ2 log XX
 C1 ~ β1 + β2 log YY + γ1 log ZZ

 Then I have some restrictions on the coefficients, namely that the sum of
 the β equals 1 and the sum of the γ equals zero.
 So, I've added the following equations to the model

 C2 ~ 0 +  β1.Y1 + β2.Y2
 C3 ~ 0 +  γ1.Y3 + γ2.Y4

 I've created columns in my data frame with values of 0 for variable C3 and
 values of 1 for Y1, Y2, Y3, Y4 and C2

 I'm using the systemfit package to solve a multiple equation system using
 the SURE method, and using a matrix to impose the restrictions on the
 coefficients (i.e., that the β1 in all equations is the same value, and the
 same for all the other coefficients).

 When I try to run the model without the restricting equations (C2, C3) it
 runs just fine, but when I add these two equations I get the error:

 Error in solve(xtOmegaInv %*% xMat2, xtOmegaInv %*% yVec, tol = solvetol) :
   cs_lu(A) failed: near-singular A (or out of memory)

 Any ideas on what the problem might be?

 All the best,
 Rui Neiva

 P.S.: I've also posted this question on the Matrix help forum, but since I
 do not know how active that forum is I've decided to see if anyone in the
 mailing list would be able to help.

 [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Arne Henningsen
http://www.arne-henningsen.name



Re: [R] apply with missing values

2012-12-25 Thread Jessica Streicher
A bit of data might be useful. Make a small example and post the data with 
dput().

On 24.12.2012, at 20:21, jenna wrote:

 I am trying to get the means of each participant's avg saccade amplitude as a
 function of the group they were in (designated by shape_), but there are
 missing values in the dataset. This is what I tried...
 
 with(data055,tapply(AVERAGE_SACCADE_AMPLITUDE,shape_,mean))
 I get an error saying the argument is not numerical or logical
 
 next i try...
 
 055  - tapply(data055$AVERAGE_SACCADE_AMPLITUDE,
 + data055$shape_,mean, na.rm =TRUE)
 
 still nothing
 
 help? thanks!
 
 
 


[R] as.Date to as.POSIXct

2012-12-25 Thread Antonio Rodriges
Hello,

I have converted some UNIX time stamps with as.Date as follows

dates_unix <- seq(
 as.Date(convertTimeToString(timeStart)),
 length = length(data),
 by = "1 mon")

and now I would like to convert dates_unix from type Date to type POSIXct

How can I do that?


Thanks



Re: [R] R and Matlab

2012-12-25 Thread Suzen, Mehmet
The simplest way is to call a system command, using R CMD.
See: http://stackoverflow.com/questions/6695105/call-r-scripts-in-matlab

More complicated solutions have also been proposed:

http://www.mathworks.co.uk/matlabcentral/fileexchange/5051
This one uses R-(D)-COM.

In my opinion, the most robust integration can be achieved via Java code that
runs R, which you can then easily use in Matlab.

On 24 December 2012 20:42, Amirehsan Ranginkaman
aeranginka...@gmail.com wrote:
 Hi,

 How can I call R functions or Package in MatLab?

 Is there any way?


 Thanks
 Regards
 Ranginkaman



[R] Sampling data without having infinite numbers after diong a transformation

2012-12-25 Thread Agnes Ayang
Hello R-helpers..

I want to ask how I can sample data sets without getting infinite 
values. For example,

set.seed(1234)

a <- rnorm(15, 0, 1)
b <- rnorm(15, 0, 1)
c <- rnorm(15, 0, 1)
d <- rnorm(15, 0, 36)

After drawing the samples, I need to apply a transformation (Hoaglin, 
1985) to each data set, because I need to measure the skewness and kurtosis. 
After the transformation, there are 'Inf' values in my data sets and I 
cannot proceed with the next step, where I need to compute the trimmed 
mean and the sum of squared deviations.

I would be grateful if anyone could help me obtain better data sets so that 
my programme will work. Thank you.

Best regards,
Hyo Min
UPM Malaysia



[R] aggregate / collapse big data frame efficiently

2012-12-25 Thread Martin Batholdy
Hi,


I need to aggregate rows of a data.frame by computing the mean for rows with 
the same factor-level on one factor-variable;

here is the sample code:


x <- data.frame(rep(letters,2), rnorm(52), rnorm(52), rnorm(52))

aggregate(x, list(x[,1]), mean)


Now my problem is, that the actual data-set is much bigger (120 rows and 
approximately 100.000 columns) – and it takes very very long (actually at some 
point I just stopped it).

Is there anything that can be done to make the aggregate routine more efficient?
Or is there a different approach that would work faster?


Thanks for any suggestions!



Re: [R] as.Date to as.POSIXct

2012-12-25 Thread arun
Hi,

You should have provided a reproducible example.
To convert in general, ?as.POSIXct()
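
A minimal sketch of such a conversion, for illustration only: the start date
below is a hypothetical stand-in for convertTimeToString(timeStart), which
was not shown in the question. as.POSIXct() applied to a Date gives midnight
UTC of that day.

```r
# Hypothetical start date standing in for convertTimeToString(timeStart)
dates_unix <- seq(as.Date("2012-01-01"), length.out = 3, by = "1 month")

# Date -> POSIXct: each element becomes 00:00:00 UTC of that date
dates_ct <- as.POSIXct(dates_unix)

class(dates_ct)
format(dates_ct, "%Y-%m-%d %H:%M", tz = "UTC")

# Under the hood: days since the epoch times 86400 seconds
all(as.numeric(dates_ct) == as.numeric(dates_unix) * 86400)
```

If midnight in a local time zone is wanted instead, one option is to go via
the character representation, e.g. as.POSIXct(format(dates_unix), tz = "UTC")
with the desired zone name in place of "UTC".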

A.K.






Re: [R] for loop not working

2012-12-25 Thread eliza botto

Dear Arun,
thank-you very much. i m extremely sorry for spoiling your Christmas.
eliza 

 Date: Tue, 25 Dec 2012 08:51:05 -0800
 From: smartpink...@yahoo.com
 Subject: Re: [R] for loop not working
 To: eliza_bo...@hotmail.com
 CC: r-help@r-project.org; kri...@ymail.com
 
 HI Eliza,
 
 Try this:
 set.seed(15)
 mat1 <- matrix(sample(1:2000, 1776, replace = TRUE), ncol = 444)
 colnames(mat1) <- paste("Col", 1:444, sep = "")
 res1 <- lapply(1:37, function(i) mat1[, seq(i, 444, 37)])
 res2 <- lapply(1:37, function(i) {a <- mat1[, i:444]; a[, c(TRUE, rep(FALSE, 36))]}) #your code
 identical(res1, res2)
 #[1] TRUE
 
 
 
 
 
 
 
 
 From: eliza botto eliza_bo...@hotmail.com
 To: smartpink...@yahoo.com smartpink...@yahoo.com 
 Cc: r-help@r-project.org r-help@r-project.org; kri...@ymail.com 
 Sent: Tuesday, December 25, 2012 1:57 AM
 Subject: RE: [R] for loop not working
 
 
 
 Dear Arun,
 as usual you were spot on. I tried the following
 
 lapply(seq_len(ncol(e)), function(i) {
 
 a <- e[,(e[i]:444)]
 
 a[,c(TRUE, rep(FALSE,36))]
 
 }) 
 
 but it never worked. 
 thanks for your kind help.
 lots of love
 
 elisa
 
 
  Date: Mon, 24 Dec 2012 22:40:08 -0800
  From: smartpink...@yahoo.com
  Subject: Re: [R] for loop not working
  To: eliza_bo...@hotmail.com
  CC: r-help@r-project.org; kri...@ymail.com
  
  HI Eliza,
  
  You could try this:
  set.seed(15)
  mat1 <- matrix(sample(1:2000, 1776, replace = TRUE), ncol = 444)
  colnames(mat1) <- paste("Col", 1:444, sep = "")
  res <- lapply(seq_len(ncol(mat1)), function(i) mat1[, seq(i, 444, 37)])
  
  # If you want only this from 1:37, then
  res1 <- lapply(1:37, function(i) mat1[, seq(i, 444, 37)])
  
  
  A.K.
  
  
  
  - Original Message -
  From: eliza botto eliza_bo...@hotmail.com
  To: r-help@r-project.org r-help@r-project.org
  Cc: 
  Sent: Tuesday, December 25, 2012 12:03 AM
  Subject: [R] for loop not working
  
  
  Dear R family, I have a matrix of 444 columns. What I want to do is the 
  following.
  1. Starting from column 1, I want to select every 37th column on the way. 
  More precisely, I want to select columns 1, 38, 75, 112, 149 and so on.
  2. Starting from column 2, I again want to select every 37th column, which 
  means 2, 39, 76, 113, 150 and so on.
  Similarly for starting columns 3 through 37.
  I have tried the following loop command, which is not working. Can anyone 
  please see what's wrong with it?
  for (i in 1:37)
  {
  a <- e[,e[i]:444]
  }
  
  lapply(seq_len(1),
  function(i) {
  a[,c(TRUE, rep(FALSE,1))]
  })
  Extremely sorry for bothering you once again.
  eliza   


Re: [R] Sampling data without having infinite numbers after diong a transformation

2012-12-25 Thread Jeff Newmiller
Perhaps you should read the help file for rnorm more carefully.

?rnorm

Keep in mind that the normal probability distribution is a density function, so 
the smaller the standard deviation is, the greater the magnitude of the density 
function is. 
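
As a small illustration of what ?rnorm documents (a sketch, not part of the
original reply): the third argument of rnorm() is the standard deviation, not
the variance. The vector z below is a hypothetical stand-in for transformed
values, since the transformation itself was not posted; it only shows how
non-finite entries can be counted and excluded with is.finite().

```r
set.seed(1234)

# rnorm(n, mean, sd): the third argument is a standard deviation
x_sd36  <- rnorm(1e4, mean = 0, sd = 36)        # what rnorm(n, 0, 36) draws
x_var36 <- rnorm(1e4, mean = 0, sd = sqrt(36))  # if a variance of 36 was meant

c(sd(x_sd36), sd(x_var36))   # close to 36 and 6 respectively

# Hypothetical transformed values containing non-finite entries
z <- c(1.5, Inf, -Inf, NaN, 2)
n_bad  <- sum(!is.finite(z))        # how many values are Inf, -Inf, NaN or NA
z_mean <- mean(z[is.finite(z)])     # summary over the finite values only
```

Whether dropping non-finite values is appropriate depends on why the
transformation produces them in the first place.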
---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.



Re: [R] aggregate / collapse big data frame efficiently

2012-12-25 Thread Jeff Newmiller
You might consider using the sqldf package.
---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.



Re: [R] aggregate / collapse big data frame efficiently

2012-12-25 Thread Patrick Burns

I'd suggest the 'data.table' package.  That is
one of the prime uses it was created for.

Pat



--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')



Re: [R] aggregate / collapse big data frame efficiently

2012-12-25 Thread jim holtman
According to the way that you have used 'aggregate', you are taking
the column means.  A couple of suggestions for faster processing:


1. use matrices instead of data.frames (I converted your example just
before using it)
2. use 'colMeans'

I created a 120 x 10 matrix with 10 levels and it does the
computation in less than 2 seconds:


> n <- 10
> nLevels <- 10
> nRows <- 120
> Cols <- list(rep(list(sample(nRows)), n))
> df <- data.frame(levels = sample(nLevels, nRows, TRUE), Cols)
> colnames(df)[-1] <- paste0('col', 1:n)
>
> # convert to matrix for faster processing
> df.m <- as.matrix(df[, -1])  # remove levels column
> str(df.m)
 int [1:120, 1:10] 111 13 106 61 16 39 25 94 53 38 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:10] "col1" "col2" "col3" "col4" ...
> system.time({
+ # split the indices of rows for each level
+ x <- split(seq(nrow(df)), df$levels)
+ result <- sapply(x, function(a) colMeans(df.m[a, ]))
+ })
   user  system elapsed
   1.33    0.00    1.35
> str(result)
 num [1:10, 1:10] 57 57 57 57 57 57 57 57 57 57 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:10] "col1" "col2" "col3" "col4" ...
  ..$ : chr [1:10] "1" "2" "3" "4" ...
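
For comparison, here is a minimal sketch of the same split/colMeans idea
applied to a small data frame shaped like the one in the question, next to a
base-R alternative that was not mentioned in the thread: rowsum() followed by
a division by the group sizes. The column names g, a, b, c and all object
names are hypothetical, and no timing claims are made.

```r
set.seed(42)
x <- data.frame(g = rep(letters, 2), a = rnorm(52), b = rnorm(52), c = rnorm(52))
m <- as.matrix(x[, -1])                 # numeric part as a matrix

# Approach from the message above: split row indices, then colMeans per group
idx <- split(seq_len(nrow(x)), x$g)
means_split <- t(sapply(idx, function(rows) colMeans(m[rows, , drop = FALSE])))

# Alternative (assumption, not from the thread): group sums via rowsum(),
# divided by the number of rows in each group
sums   <- rowsum(m, group = x$g)
counts <- as.vector(table(x$g)[rownames(sums)])
means_rowsum <- sums / counts

# Reference result from aggregate() on the numeric columns only
ref <- aggregate(x[, -1], by = list(g = x$g), FUN = mean)

all.equal(means_split, means_rowsum)
```

Both results are matrices with one row per group and one column per variable,
so they can be compared directly with the aggregate() output.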






-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] apply with missing values

2012-12-25 Thread David Winsemius


On Dec 24, 2012, at 11:21 AM, jenna wrote:

 I am trying to get the means of each participant's avg saccade amplitude as a
 function of the group they were in (designated by shape_), but there are
 missing values in the dataset. This is what I tried...

 with(data055,tapply(AVERAGE_SACCADE_AMPLITUDE,shape_,mean))
 I get an error saying the argument is not numerical or logical


The error message appears informative (and would NOT be expected to be  
solved by adding na.rm=TRUE). Your construction of the data055  
dataframe was flawed in some way. You have not shown how it was done  
nor given dput(head(data055)) which would have provided a basis for  
further assessment. Based on past experience my guess is that there  
was a character value in the column of data you thought was numeric  
during a read.* operation and is now a factor.
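
The situation guessed at above can be reproduced with a minimal sketch. The
data frame below is a hypothetical stand-in for data055 (the real data were
not posted), with one stray "." entry that turns the amplitude column into a
factor; the conversion shown is one common way to recover the numbers.

```r
# Hypothetical stand-in for data055: a stray "." makes the column non-numeric
data055 <- data.frame(
  shape_ = c("circle", "circle", "square", "square"),
  AVERAGE_SACCADE_AMPLITUDE = factor(c("3.1", ".", "4.2", "5.0"))
)
str(data055)   # the amplitude column shows up as a Factor, not num

# mean() of a factor is NA (with a warning), whatever na.rm is set to
bad <- suppressWarnings(mean(data055$AVERAGE_SACCADE_AMPLITUDE, na.rm = TRUE))

# Convert via character; entries that are not numbers become NA
amp <- suppressWarnings(
  as.numeric(as.character(data055$AVERAGE_SACCADE_AMPLITUDE))
)

# Now na.rm = TRUE does what was intended
res <- tapply(amp, data055$shape_, mean, na.rm = TRUE)
res
```

Checking str() right after reading the file is usually the quickest way to
spot such a column.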





 next i try...

 055  - tapply(data055$AVERAGE_SACCADE_AMPLITUDE,
 + data055$shape_,mean, na.rm =TRUE)

 still nothing

--

David Winsemius, MD
Alameda, CA, USA



Re: [R] pscore.dist problem when running optmatch

2012-12-25 Thread Pascal Oettli

Hello,

Did you contact the package maintainer?
Mark M. Fredrickson mark.m.fredrick...@gmail.com

There is also a webpage:
https://github.com/markmfredrickson/optmatch

Regards,
Pascal

On 18/12/2012 21:37, MA wrote:

Hello

My optmatch package is loaded and otherwise running fine.
I get an error after lcds successfully completes logistic regression and
I'm trying to obtain a propensity score:


pdist <- pscore.dist(lcds)

Error: could not find function "pscore.dist"

I searched the help files, other online sources, could find no answer for
this.
Any advice would be greatly appreciated!

Thank you
Michael Adolph



Re: [R] aggregate / collapse big data frame efficiently

2012-12-25 Thread Wensui Liu
aggregate() is not efficient. try by().
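
A minimal sketch of what a by() call could look like for data shaped like the
example in the question (the column and object names are hypothetical).
Whether it is actually faster than aggregate() will depend on the data, so
this only illustrates the usage.

```r
set.seed(1)
x <- data.frame(g = rep(letters, 2), a = rnorm(52), b = rnorm(52), c = rnorm(52))

# by() hands each group's sub-data-frame to FUN; colMeans gives one row of means
by_means <- by(x[, -1], x$g, colMeans)

# Bind the per-group results into a 26 x 3 matrix
res_by <- do.call(rbind, by_means)

# Reference: the same means via aggregate() on the numeric columns
ref <- aggregate(x[, -1], by = list(g = x$g), FUN = mean)
dim(res_by)
```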






-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==



Re: [R] for loop not working

2012-12-25 Thread arun
HI Eliza,

Try this:
set.seed(15)
mat1 <- matrix(sample(1:2000, 1776, replace = TRUE), ncol = 444)
colnames(mat1) <- paste("Col", 1:444, sep = "")
res1 <- lapply(1:37, function(i) mat1[, seq(i, 444, 37)])
res2 <- lapply(1:37, function(i) {a <- mat1[, i:444]; a[, c(TRUE, rep(FALSE, 36))]}) #your code
identical(res1, res2)
#[1] TRUE
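
The same selection can also be written without a loop over start columns, by
grouping the column indices according to their position within each block of
37. This is only a sketch of an alternative, not something from the thread;
the 4 x 444 matrix and the names groups, res_split and res_seq are
hypothetical.

```r
nc   <- 444
step <- 37
mat1 <- matrix(seq_len(4 * nc), ncol = nc)     # hypothetical 4 x 444 matrix
colnames(mat1) <- paste0("Col", seq_len(nc))

# Column i belongs to group ((i - 1) %% 37) + 1, i.e. groups 1..37
groups <- split(seq_len(nc), (seq_len(nc) - 1) %% step + 1)
groups[[1]][1:5]                               # start of the first group

# One sub-matrix per group of columns
res_split <- lapply(groups, function(idx) mat1[, idx])

# The seq() version from the message above, for comparison
res_seq <- lapply(seq_len(step), function(i) mat1[, seq(i, nc, by = step)])
identical(unname(res_split), res_seq)
```

Since 444 = 12 * 37, every group holds exactly 12 columns.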










Re: [R] aggregate / collapse big data frame efficiently

2012-12-25 Thread arun
Hi,

Jim's method was found to be faster than data.table()

n <- 1
nLevels <- 10
nRows <- 120
Cols <- list(rep(list(sample(nRows)), n))
df <- data.frame(levels = sample(nLevels, nRows, TRUE), Cols)
colnames(df)[-1] <- paste0('col', 1:n)
# convert to matrix for faster processing
df.m <- as.matrix(df[, -1])  # remove levels column
system.time({
# split the indices of rows for each level
x <- split(seq(nrow(df)), df$levels)
result <- sapply(x, function(a) colMeans(df.m[a, ]))
})
#  user  system elapsed
# 0.056   0.000   0.056


library(data.table)
df.dt <- data.table(df)
setkey(df.dt, levels)
system.time({result1 <- df.dt[, lapply(.SD, mean), by = levels]})
#  user  system elapsed
# 7.756   0.000   7.771
system.time({result2 <- df.dt[, list(Mean = colMeans(.SD)), by = levels]})
#  user  system elapsed
# 2.188   0.000   2.193


A.K.





Re: [R] aggregate / collapse big data frame efficiently

2012-12-25 Thread arun
Hi,
You could use library(data.table) 
x <- data.frame(A=rep(letters,2), B=rnorm(52), C=rnorm(52), D=rnorm(52))
res <- with(x,aggregate(cbind(B,C,D),by=list(A),mean))
colnames(res)[1] <- "A"

x1 <- data.table(x)
res2 <- x1[,list(B=mean(B),C=mean(C),D=mean(D)),by=A]
identical(res,data.frame(res2))
#[1] TRUE

Just for comparison:
set.seed(25)
xnew <- data.frame(A=rep(letters,1500),B=rnorm(39000),C=rnorm(39000),D=rnorm(39000))
system.time(resnew <- with(xnew,aggregate(cbind(B,C,D),by=list(A),mean)))
#  user  system elapsed
# 0.152   0.000   0.152

xnew1 <- data.table(xnew)
system.time(resnew1 <- xnew1[,list(B=mean(B),C=mean(C),D=mean(D)),by=A])
#  user  system elapsed
# 0.004   0.000   0.005



A.K.




- Original Message -
From: Martin Batholdy batho...@googlemail.com
To: r-help@r-project.org r-help@r-project.org
Cc: 
Sent: Tuesday, December 25, 2012 11:34 AM
Subject: [R] aggregate / collapse big data frame efficiently

Hi,


I need to aggregate rows of a data.frame by computing the mean for rows with 
the same factor-level on one factor-variable;

here is the sample code:


x <- data.frame(rep(letters,2), rnorm(52), rnorm(52), rnorm(52))

aggregate(x, list(x[,1]), mean)


Now my problem is that the actual data set is much bigger (120 rows and 
approximately 100,000 columns), and it takes very long (at some 
point I just stopped it).

Is there anything that can be done to make the aggregate routine more efficient?
Or is there a different approach that would work faster?


Thanks for any suggestions!



[R] splitting a long dataframe

2012-12-25 Thread Swagath

Dear all...Merry Christmas

I would like to split a long dataframe. The dataframe looks like this

x <- c('0:00:00', '0:30:00', '1:00:00', '1:30:00', '2:00:00', '2:30:00', 
'3:00:00', '0:00:00', '0:30:00', '1:00:00', '1:30:00', '2:00:00', 
'2:30:00', '3:00:00', '3:30:00', '4:00:00','0:00:00', '0:30:00', 
'1:00:00', '1:30:00', '2:00:00', '2:30:00', '3:00:00', '0:00:00', 
'0:30:00', '1:00:00', '1:30:00', '2:00:00', '2:30:00', '3:00:00' , 
'3:30:00', '4:00:00')


y <- 1:32

data1 <- data.frame(x, y)

I want to split it in such a way that the output looks like this:

0:00:00  1  8 17 24
0:30:00  2  9 18 25
1:00:00  3 10 19 26
1:30:00  4 11 20 27
2:00:00  5 12 21 28
2:30:00  6 13 22 29
3:00:00  7 14 23 30
3:30:00 NA 15 NA 31
4:00:00 NA 16 NA 32

Any ideas, or functions that I could look into for doing this?
Thanks a lot for your help and time.

Cheers,
Swagath



[R] path analysis

2012-12-25 Thread Ali Mahmoudi
Which function can I use to do path analysis in R?
Please help me. Thanks.


Re: [R] path analysis

2012-12-25 Thread Pascal Oettli

First, hello,

Second, http://r.789695.n4.nabble.com/path-analysis-td2528558.html#a2530207

Last, Regards

On 26/12/2012 04:11, Ali Mahmoudi wrote:

Which function can I use to do path analysis in R?
Please help me. Thanks.


Re: [R] splitting a long dataframe

2012-12-25 Thread David Winsemius


On Dec 25, 2012, at 9:52 AM, Swagath wrote:


Dear all...Merry Christmas

I would like to split a long dataframe. The dataframe looks like this

x <- c('0:00:00', '0:30:00', '1:00:00', '1:30:00', '2:00:00',  
'2:30:00', '3:00:00', '0:00:00', '0:30:00', '1:00:00', '1:30:00',  
'2:00:00', '2:30:00', '3:00:00', '3:30:00', '4:00:00','0:00:00',  
'0:30:00', '1:00:00', '1:30:00', '2:00:00', '2:30:00', '3:00:00',  
'0:00:00', '0:30:00', '1:00:00', '1:30:00', '2:00:00', '2:30:00',  
'3:00:00' , '3:30:00', '4:00:00')


y <- 1:32

data1 <- data.frame(x, y)

I want to split it in such a way that the output looks like this:

0:00:00  1  8 17 24
0:30:00  2  9 18 25
1:00:00  3 10 19 26
1:30:00  4 11 20 27
2:00:00  5 12 21 28
2:30:00  6 13 22 29
3:00:00  7 14 23 30
3:30:00 NA 15 NA 31
4:00:00 NA 16 NA 32

Any ideas, or functions that I could look into for doing this?


You already have 3 distinct solutions on StackOverflow.
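
For readers of this archive, one possible base R approach is sketched below. It is not necessarily any of the StackOverflow answers. It assumes that every block of rows starts at '0:00:00', and the names times9, block, times and out are made up for illustration.

```r
# Same data as in the question, built compactly: blocks of 7, 9, 7 and 9 rows
times9 <- c("0:00:00", "0:30:00", "1:00:00", "1:30:00", "2:00:00",
            "2:30:00", "3:00:00", "3:30:00", "4:00:00")
x <- c(times9[1:7], times9, times9[1:7], times9)
y <- 1:32

# Assumption: each block starts at "0:00:00", so counting those numbers the blocks
block <- cumsum(x == "0:00:00")
times <- unique(x)

# Matrix of NAs with one row per time stamp and one column per block
out <- matrix(NA_integer_, nrow = length(times), ncol = max(block),
              dimnames = list(times, NULL))
# Fill each (time, block) cell with its y value; cells never filled stay NA
out[cbind(match(x, times), block)] <- y
out
```

The result is a matrix with the time stamps as row names; data.frame(time = times, out) would turn it into a data frame if that is preferred.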





David Winsemius, MD
Alameda, CA, USA
