Re: [R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Romain Francois


Maybe this :

> foo <- function( x ){
+   idx <- 1 + cumsum( is.na( x ) )
+   not.na <- ! is.na( x )
+   split( x[not.na], idx[not.na] )
+ }
> foo( x )
$`1`
[1] 2 1 2

$`2`
[1] 1 1 2

$`3`
[1] 4 5 2 3
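
Printing the intermediate values may make the cumsum() trick easier to follow. This is only an illustrative sketch using the x from the original question; the names is.sep and keep are chosen here for the illustration and are not part of the posted function.

```r
x <- c(2, 1, 2, NA, 1, 1, 2, NA, 4, 5, 2, 3)

is.sep <- is.na(x)          # TRUE at each separator
idx <- 1 + cumsum(is.sep)   # group counter: increases by one at every NA
keep <- !is.sep             # positions that hold real values

print(idx)                  # values: 1 1 1 2 2 2 2 3 3 3 3 3

# Dropping the NA positions from both vectors leaves one group id per value
res <- split(x[keep], idx[keep])
print(names(res))           # "1" "2" "3"
```

Each NA is counted into the group that follows it, which is why it has to be removed from both x and idx before calling split().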

Romain

On 29/04/10 09:42, Tal Galili wrote:


Hi all,

I would like to have a function like this:
split.vec.by.NA <- function(x)

That takes a vector like this:
x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)

And returns a list of length 3, where each element is the corresponding
segment of the vector, like this:

$`1`
[1] 2 1 2
$`2`
[1] 1 1 2
$`3`
[1] 4 5 2 3


I found how to do it with a loop, but wondered if there is some smarter
(vectorized) way of doing it.



Here is the code I used:

x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)


split.vec.by.NA <- function(x)
{
  # assumes NAs are separating groups of numbers
  # TODO: add code to check for it

  number.of.groups <- sum(is.na(x)) + 1
  # all the positions holding an NA, plus one position past the end of the vector
  groups.end.point.locations <- c(which(is.na(x)), length(x) + 1)
  group.start <- 1
  group.end <- NA
  # every position in a group gets replaced by its group ID, except for
  # the NAs, which are later replaced by 0
  new.groups.split.id <- x
  for (i in seq_len(number.of.groups))
  {
    group.end <- groups.end.point.locations[i] - 1
    new.groups.split.id[group.start:group.end] <- i
    # move the start past this NA for the next iteration
    # (after the final iteration it no longer matters)
    group.start <- groups.end.point.locations[i] + 1
  }
  new.groups.split.id[is.na(x)] <- 0
  return(split(x, new.groups.split.id)[-1])
}

split.vec.by.NA(x)




Thanks,
Tal


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9aKDM9 : embed images in Rd documents
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Tal Galili

Definitely Smarter,
Thanks!

Tal

Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--





Re: [R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Henrique Dallazuanna

Another option could be:

split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]
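
To see why the trailing [-1] removes exactly the NA values, the one-liner can be unpacked step by step. This is a hedged sketch on the x from the original question; grp is a name introduced here only for the illustration.

```r
x <- c(2, 1, 2, NA, 1, 1, 2, NA, 4, 5, 2, 3)

grp <- cumsum(is.na(x))             # 0 0 0 1 1 1 1 2 2 2 2 2
grp <- replace(grp, is.na(x), -1)   # 0 0 0 -1 1 1 1 -1 2 2 2 2

# split() orders the groups by sorted id, so "-1" (the NAs) comes first
# and [-1] drops it. If x held no NA at all, [-1] would instead drop the
# only real group.
res <- split(x, grp)[-1]
print(names(res))                   # "0" "1" "2"
```

Note that the groups come out named 0, 1, 2 rather than 1, 2, 3 as in the original request; the contents are the same.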


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


Re: [R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Barry Rowlingson

On Thu, Apr 29, 2010 at 1:27 PM, Henrique Dallazuanna www...@gmail.com wrote:
> Another option could be:
>
> split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]


One thing none of the solutions so far do (except I haven't tried
Tal's original code) is insert an empty group between adjacent NA
values, for example in:

 x = c(1,2,3,NA,NA,4,5,6)

  split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]
$`0`
[1] 1 2 3

$`2`
[1] 4 5 6

Maybe this never happens in Tal's case, or it's not what he wanted
anyway, but I thought I'd point it out!

Barry


Re: [R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Charles C. Berry



The ever useful rle() helps


y <- rle(!is.na(x))
split(x, rep( cumsum(y$values)*y$values, y$lengths ) )[-1]

$`1`
[1] 1 2 3

$`2`
[1] 4 5 6
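
The pieces of the rle() approach can be inspected one at a time. The sketch below uses the x from Barry's example; run.id and grp are names chosen here for illustration only.

```r
x <- c(1, 2, 3, NA, NA, 4, 5, 6)

y <- rle(!is.na(x))
print(y$values)    # TRUE FALSE TRUE: a run of values, a run of NAs, a run of values
print(y$lengths)   # 3 2 3

# Number the TRUE runs 1, 2, ... and give every FALSE (NA) run the id 0
run.id <- cumsum(y$values) * y$values   # 1 0 2
grp <- rep(run.id, y$lengths)           # 1 1 1 0 0 2 2 2

res <- split(x, grp)[-1]                # drop group "0", which holds the NAs
print(names(res))                       # "1" "2"
```

Because a whole run of adjacent NAs collapses into a single FALSE run, the resulting groups are numbered consecutively however many NAs sit between them.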


Chuck






Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901


Re: [R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Thomas Stewart

Or, you can modify Romain's function to account for sequential NAs.

x <- c(1,2,NA,1,1,2,NA,NA,4,5,2,3)
foo <- function( x ){
   idx <- 1 + cumsum( is.na( x ) )
   not.na <- ! is.na( x )

   f <- factor( idx[not.na], levels = 1:max(idx) )

   split( x[not.na], f )
 }
foo( x )

$`1`
[1] 1 2

$`2`
[1] 1 1 2

$`3`
numeric(0)

$`4`
[1] 4 5 2 3
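
If both behaviours are wanted, the two variants could be folded into one function with a switch. The following is only a sketch: split_by_na() and its keep.empty argument are hypothetical names invented here, not something posted in this thread, and the levels are derived from the NA count rather than max(idx) so that an empty x does not trip up max().

```r
# split_by_na() and keep.empty are hypothetical names used for this sketch
split_by_na <- function(x, keep.empty = FALSE) {
  idx <- 1 + cumsum(is.na(x))
  not.na <- !is.na(x)
  if (keep.empty) {
    # Fixing the factor levels keeps groups that contain no values,
    # such as the empty group between two adjacent NAs
    n.groups <- sum(is.na(x)) + 1
    f <- factor(idx[not.na], levels = seq_len(n.groups))
    split(x[not.na], f)
  } else {
    split(x[not.na], idx[not.na])
  }
}

x <- c(1, 2, NA, 1, 1, 2, NA, NA, 4, 5, 2, 3)
print(length(split_by_na(x)))                      # 3
print(length(split_by_na(x, keep.empty = TRUE)))   # 4
```

With keep.empty = TRUE the third element is numeric(0), matching the output shown above.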

-tgs


