Re: [R] Split a vector by NA's - is there a better solution then a loop ?
Maybe this:

    foo <- function( x ){
       idx <- 1 + cumsum( is.na( x ) )
       not.na <- ! is.na( x )
       split( x[not.na], idx[not.na] )
    }
    foo( x )

    $`1`
    [1] 2 1 2

    $`2`
    [1] 1 1 2

    $`3`
    [1] 4 5 2 3

Romain

On 29/04/10 09:42, Tal Galili wrote:
> Hi all,
>
> I would like to have a function like this:
>
>     split.vec.by.NA <- function(x)
>
> that takes a vector like this:
>
>     x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)
>
> and returns a list of length 3, where each element of the list is the
> relevant segment of the vector, like this:
>
>     $`1`
>     [1] 2 1 2
>
>     $`2`
>     [1] 1 1 2
>
>     $`3`
>     [1] 4 5 2 3
>
> I found how to do it with a loop, but wondered if there is some smarter
> (vectorized) way of doing it.
>
> Here is the code I used:
>
>     x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)
>
>     split.vec.by.NA <- function(x)
>     {
>         # assumes NAs are separating groups of numbers
>         # TODO: add code to check for it
>
>         number.of.groups <- sum(is.na(x)) + 1
>         # all the positions of the NAs, plus one position past the end of the vector
>         groups.end.point.locations <- c(which(is.na(x)), length(x)+1)
>
>         group.start <- 1
>         group.end <- NA
>         # the elements of each group will be replaced by the group ID, except
>         # for the NAs, which will later be replaced by 0
>         new.groups.split.id <- x
>
>         for(i in seq_len(number.of.groups))
>         {
>             group.end <- groups.end.point.locations[i]-1
>             new.groups.split.id[group.start:group.end] <- i
>             # move the group start forward for the next iteration
>             # (at the final iteration it won't matter)
>             group.start <- groups.end.point.locations[i]+1
>         }
>
>         new.groups.split.id[is.na(x)] <- 0
>
>         return(split(x, new.groups.split.id)[-1])
>     }
>
>     split.vec.by.NA(x)
>
> Thanks,
> Tal

--
Romain Francois
Professional R Enthusiast
http://romainfrancois.blog.free.fr

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
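For readers following along, the intermediate vectors in Romain's foo() can be printed step by step. This is only a sketch on Tal's example vector; the names `na` and `groups` are introduced here for illustration and do not appear in the original posts.

```r
x <- c(2, 1, 2, NA, 1, 1, 2, NA, 4, 5, 2, 3)

na     <- is.na(x)          # TRUE at the two separator positions
idx    <- 1 + cumsum(na)    # group counter; it steps up at each NA
not.na <- !na

# Each NA carries the id of the group that follows it, which is why the
# NA positions are dropped from both the data and the ids before split().
idx
# [1] 1 1 1 2 2 2 2 3 3 3 3 3

groups <- split(x[not.na], idx[not.na])
names(groups)
# [1] "1" "2" "3"
```

Because the NA positions are removed before split() is called, no placeholder group has to be discarded afterwards.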
Re: [R] Split a vector by NA's - is there a better solution then a loop ?
Definitely smarter, thanks!

Tal

On Thu, Apr 29, 2010 at 10:56 AM, Romain Francois wrote:
> Maybe this:
>
>     foo <- function( x ){
>        idx <- 1 + cumsum( is.na( x ) )
>        not.na <- ! is.na( x )
>        split( x[not.na], idx[not.na] )
>     }
Re: [R] Split a vector by NA's - is there a better solution then a loop ?
Another option could be:

    split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]

On Thu, Apr 29, 2010 at 4:42 AM, Tal Galili wrote:
> I found how to do it with a loop, but wondered if there is some smarter
> (vectorized) way of doing it.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
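It may help to look at the grouping vector that Henrique's one-liner builds, and at what the trailing [-1] does when the input has no NA at all. The sketch below uses Tal's example vector; `g`, `y` and the helper `split_by_na` are hypothetical names added here, not part of the thread.

```r
x <- c(2, 1, 2, NA, 1, 1, 2, NA, 4, 5, 2, 3)

# NA positions get the sentinel id -1, which sorts before every real id,
# so the NA group comes first in the result and [-1] removes it.
g <- replace(cumsum(is.na(x)), is.na(x), -1)
g
# [1]  0  0  0 -1  1  1  1 -1  2  2  2  2

names(split(x, g))
# [1] "-1" "0"  "1"  "2"

# Edge case: with no NA there is no "-1" group, so [-1] drops real data.
y <- c(1, 2, 3)
length(split(y, replace(cumsum(is.na(y)), is.na(y), -1))[-1])
# [1] 0

# Hypothetical variant: remove the NA positions before splitting, so
# nothing has to be discarded afterwards.
split_by_na <- function(v) {
  keep <- !is.na(v)
  split(v[keep], cumsum(is.na(v))[keep])
}
length(split_by_na(y))
# [1] 1
```

If the input is known to always contain at least one NA, the original one-liner behaves as posted.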
Re: [R] Split a vector by NA's - is there a better solution then a loop ?
On Thu, Apr 29, 2010 at 1:27 PM, Henrique Dallazuanna wrote:
> Another option could be:
>
>     split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]

One thing none of the solutions so far do (except I haven't tried Tal's original code) is insert an empty group between adjacent NA values, for example in:

    x = c(1,2,3,NA,NA,4,5,6)
    split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]

    $`0`
    [1] 1 2 3

    $`2`
    [1] 4 5 6

Maybe this never happens in Tal's case, or it's not what he wanted anyway, but I thought I'd point it out!

Barry
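The same gap shows up with the subset-then-split approach posted earlier in the thread. A small sketch on Barry's vector, assuming every NA is meant to act as a separator; `keep`, `res` and `expected.groups` are illustrative names only.

```r
x <- c(1, 2, 3, NA, NA, 4, 5, 6)

keep <- !is.na(x)
idx  <- 1 + cumsum(is.na(x))   # 1 1 1 2 3 3 3 3

# Id 2 is only ever assigned to an NA position, so it disappears once the
# NA positions are dropped, and split() never sees it.
res <- split(x[keep], idx[keep])
names(res)
# [1] "1" "3"

# If every NA counts as a separator, three groups would be expected.
expected.groups <- sum(is.na(x)) + 1
expected.groups
# [1] 3
length(res)
# [1] 2
```

Whether the missing middle group matters depends on whether adjacent NAs can occur in the data at all.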
Re: [R] Split a vector by NA's - is there a better solution then a loop ?
On Thu, 29 Apr 2010, Barry Rowlingson wrote:
> One thing none of the solutions so far do (except I haven't tried Tal's
> original code) is insert an empty group between adjacent NA values, for
> example in:
>
>     x = c(1,2,3,NA,NA,4,5,6)
>
> Maybe this never happens in Tal's case, or it's not what he wanted
> anyway, but I thought I'd point it out!

The ever useful rle() helps:

    y <- rle(!is.na(x))
    split(x, rep(cumsum(y$values) * y$values, y$lengths))[-1]

    $`1`
    [1] 1 2 3

    $`2`
    [1] 4 5 6

Chuck

Charles C. Berry
Dept of Family/Preventive Medicine
UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/
La Jolla, San Diego 92093-0901
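To see what the rle() one-liner is doing, the pieces can be printed one at a time. This is a sketch on Barry's example vector; `run.id` and `g` are names added here for illustration.

```r
x <- c(1, 2, 3, NA, NA, 4, 5, 6)

# Run-length encode the "is this a value?" indicator.
y <- rle(!is.na(x))
y$lengths
# [1] 3 2 3
y$values
# [1]  TRUE FALSE  TRUE

# cumsum() numbers the non-NA runs 1, 2, ...; multiplying by the logical
# values zeroes out the id of every NA run.
run.id <- cumsum(y$values) * y$values
run.id
# [1] 1 0 2

# Expand the per-run ids back to one id per element of x.
g <- rep(run.id, y$lengths)
g
# [1] 1 1 1 0 0 2 2 2
```

A run of NAs of any length collapses into a single group 0, so the groups that remain after [-1] are numbered consecutively, and no empty group is produced between adjacent NAs.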
Re: [R] Split a vector by NA's - is there a better solution then a loop ?
Or, you can modify Romain's function to account for sequential NAs.

    x <- c(1,2,NA,1,1,2,NA,NA,4,5,2,3)
    foo <- function( x ){
       idx <- 1 + cumsum( is.na( x ) )
       not.na <- ! is.na( x )
       f <- factor(idx[not.na], levels=1:max(idx))
       split( x[not.na], f )
    }
    foo( x )

    $`1`
    [1] 1 2

    $`2`
    [1] 1 1 2

    $`3`
    numeric(0)

    $`4`
    [1] 4 5 2 3

-tgs

On Thu, Apr 29, 2010 at 4:00 AM, Tal Galili wrote:
> Definitely Smarter, Thanks!
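Because the factor levels run from 1 to max(idx), the same idea should also yield an empty group for a leading or trailing NA. A quick check, with the function body copied from the message above under the hypothetical name `split_keep_empty`:

```r
# Same body as the modified foo() above, under a hypothetical name.
split_keep_empty <- function(x) {
  idx    <- 1 + cumsum(is.na(x))
  not.na <- !is.na(x)
  # Declaring every level up front keeps ids that have no members.
  f <- factor(idx[not.na], levels = 1:max(idx))
  split(x[not.na], f)
}

# Number of elements in each group.
lengths(split_keep_empty(c(NA, 7, 8)))   # leading NA: first group is empty
lengths(split_keep_empty(c(7, 8, NA)))   # trailing NA: last group is empty
```

Note that a zero-length input is not handled by this sketch: max() of an empty vector returns -Inf with a warning, and 1:max(idx) then fails.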