On Fri, 14 Nov 2003, Petr Pikal wrote:

> Dear all
>
> I need to find a length of true sequences in logical vector (see example 1).
> I found a possible solution which is good but if I use it on a larger data
> set I experience a substantial decrease in performance (example 2).
>
> Example 1
> set.seed(111)
> x <- sample(c(T,F),50, replace=T)
> system.time(cetnost <- as.numeric(table(which(x)-cumsum(x[which(x)]))))
> [1] 0.00 0.00 0.03 NA NA
> cetnost
> [1] 1 3 2 5 1 4 1 1 1 3 1 1 2

Have you looked at rle()?

> rlex <- rle(x)
> str(rlex)
List of 2
 $ lengths: int [1:27] 2 1 1 3 1 2 2 5 1 1 ...
 $ values : logi [1:27] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE ...
 - attr(*, "class")= chr "rle"
> rlex$lengths[rlex$values]
 [1] 1 3 2 5 1 4 1 1 1 3 1 1 2
> cetnost
 [1] 1 3 2 5 1 4 1 1 1 3 1 1 2

rle() is interpreted too, like your solution, so I'm not sure how it will
scale.

>
> Example 2
> x<-sample(c(T,F),40321*51, replace=T)
> dd<-matrix(x,40321,51)
> system.time(cetnost <- lapply(dd,function(x) as.numeric(table(which(x)-cumsum(x[which(x)])))))
> Timing stopped at: 750.63 1 775.6 NA NA
>
> Please give me any hint how to improve performance or advice a different (but
> more effective) solution.
>
> R 1.8.0, W2000, 512M memory, Pentium4
>
> Thank you in advance.
>
> Petr Pikal
> [EMAIL PROTECTED]
>
> ______________________________________________
> [EMAIL PROTECTED] mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: [EMAIL PROTECTED]
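
Following up on the rle() suggestion, here is a minimal sketch of how it might be applied to Example 2, assuming the goal is one vector of TRUE-run lengths per column of dd. The helper name true_runs and the reduced matrix size are illustrative assumptions, not part of the original posts. One thing worth checking in Example 2: lapply() applied directly to a matrix visits every element separately rather than every column, which may account for much of the slowdown. The sketch loops over column indices instead.

```r
# Hypothetical helper (not from the thread): lengths of the runs of TRUE
# in a logical vector, using rle() as suggested above.
true_runs <- function(v) {
  r <- rle(v)
  r$lengths[r$values]
}

# Example 1 data, as in the original post.
set.seed(111)
x <- sample(c(TRUE, FALSE), 50, replace = TRUE)
runs_x <- true_runs(x)

# The original table()-based approach, kept for comparison.
cetnost <- as.numeric(table(which(x) - cumsum(x[which(x)])))

# A smaller matrix than in Example 2 (assumed size), so the sketch runs quickly.
dd <- matrix(sample(c(TRUE, FALSE), 1000 * 51, replace = TRUE), 1000, 51)

# One result per column: iterate over column indices, not over the matrix
# itself, because lapply(dd, ...) would call the function once per element.
runs_by_col <- lapply(seq_len(ncol(dd)), function(j) true_runs(dd[, j]))
```

A list is used for the per-column results because columns will generally contain different numbers of TRUE runs. Whether this is fast enough on the full 40321 x 51 matrix would need to be timed with system.time() on the actual data.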
