[R] How to find series of small numbers in a big vector?

2007-01-30 Thread Ed Holdgate

Hello:

I have a vector with 120,000 reals
between 0.0 and 0.

They are not sorted but the vector index is the 
time-order of my measurements, and therefore
cannot be lost.

How do I use R to find the starting and ending
index of ANY and ALL the series or sequences
in that vector wherever there are 5 or more
members in a row between 0.021 and 0.029?

For example:

search_range <- c(0.021, 0.029) # inclusive searching
search_length <- 5   # find ALL series of 5 members within search_range
my_data <- c(0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.028, 0.024, 0.027, 0.023,
             0.022, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.024, 0.029, 0.023,
             0.025, 0.026, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.022,
             0.023, 0.025, 0.333, 0.027, 0.028,
             0.900, 0.900, 0.900, 0.900, 0.900)

I seek the R program to report: 
start_index of 12 and an end_index of 16
-- and also --
start_index of 23 and an end_index of 27
because that is where there happen to be
search_length numbers within my search_range.

It should _not_ report the series at start_index 40
because that 0.333 in there violates the search_range.

I could brute-force hard-code an R program, but
perhaps an expert can give me a tip for an
easy, elegant existing function or a tactic 
to approach?

Execution speed or algorithm performance is not, 
for me in this case, important.  Rather, I
seek an easy R solution to find the time windows 
(starting & ending indices) where 5 or more 
small numbers in my search_range were measured 
all in a row.

Advice welcome and many thanks in advance.

Ed Holdgate

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to find series of small numbers in a big vector?

2007-01-30 Thread Vladimir Eremeev

I would try using na.contiguous from package stats.

R.utils has seqToIntervals.default,
which "Gets all contiguous intervals of a vector of indices".

(I haven't used the latter; help.search("contiguous") gave me that name.)
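
A small sketch of the first suggestion, with a caveat: as documented, na.contiguous() keeps only the longest stretch of non-missing values (the first one in a tie), so on its own it locates one run, not all of them. Masking the out-of-range values as NA and wrapping the vector in ts() are assumptions made here to fit the function to this problem; the name masked is illustrative.

```r
search_range <- c(0.021, 0.029)
my_data <- c(rep(0.900, 11), 0.028, 0.024, 0.027, 0.023, 0.022,
             rep(0.900, 6),  0.024, 0.029, 0.023, 0.025, 0.026,
             rep(0.900, 12), 0.022, 0.023, 0.025, 0.333, 0.027, 0.028,
             rep(0.900, 5))

# Assumption: treat every value outside the (inclusive) range as missing
masked <- ifelse(my_data >= search_range[1] & my_data <= search_range[2],
                 my_data, NA)

# na.contiguous() keeps the longest NA-free stretch; the tsp attribute
# of the result records where that stretch starts and ends
longest <- na.contiguous(ts(masked))
tsp(longest)[1:2]
```

Because only a single stretch comes back, finding every qualifying run this way would need repeated calls on the remainder of the vector.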



Re: [R] How to find series of small numbers in a big vector?

2007-01-30 Thread Jonne Zutt
I suggest the following approach.

This gives TRUE for all data within the search_range:
A1 = my_data > search_range[1] & my_data < search_range[2]

which() gives us the indices:
A2 = which(A1)

and diff() gives the gaps between those indices:
A3 = diff(A2)

Hence, if A3 > search_length, we have enough consecutive numbers within
the search range.

Is this, finally, what you wanted to know?

A2[ which(A3 > search_length) ]
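
One caveat on the diff() idea: diff(A2) measures the gaps between in-range indices, not the lengths of the runs, so a single in-range value followed by a long gap would pass that test as well. A possible variant, sketched here, uses diff() only to locate where each run of adjacent indices begins and ends, and then filters by run length. The names in_range, idx, brk, starts and ends are illustrative, and the comparisons are written as inclusive to match the question.

```r
search_range  <- c(0.021, 0.029)
search_length <- 5
my_data <- c(rep(0.900, 11), 0.028, 0.024, 0.027, 0.023, 0.022,
             rep(0.900, 6),  0.024, 0.029, 0.023, 0.025, 0.026,
             rep(0.900, 12), 0.022, 0.023, 0.025, 0.333, 0.027, 0.028,
             rep(0.900, 5))

in_range <- my_data >= search_range[1] & my_data <= search_range[2]
idx <- which(in_range)

# TRUE wherever two neighbouring in-range indices are not adjacent
brk <- diff(idx) != 1

starts <- idx[c(TRUE, brk)]   # first index of each run
ends   <- idx[c(brk, TRUE)]   # last index of each run

# keep only the runs that are long enough
keep   <- (ends - starts + 1) >= search_length
result <- cbind(start = starts[keep], end = ends[keep])
result
```

On the example data this keeps the runs at 12-16 and 23-27 and drops the shorter ones around index 40.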




Re: [R] How to find series of small numbers in a big vector?

2007-01-30 Thread Michael Dewey
At 01:49 30/01/2007, Ed Holdgate wrote:

> How do I use R to find the starting and ending
> index of ANY and ALL the series or sequences
> in that vector wherever there are 5 or more
> members in a row between 0.021 and 0.029 ?

You could look at rle(), which encodes a vector into runs.
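
As a quick illustration of what rle() returns, here is a toy logical vector (made up for this note, not taken from the data in the question). rle() reports the length of each run and the value it repeats; cumsum() of the lengths gives the index at which each run ends.

```r
flags <- c(FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)

runs <- rle(flags)
runs$lengths   # 2 3 1 1
runs$values    # FALSE TRUE FALSE TRUE

# positions of each run in the original vector
run_ends   <- cumsum(runs$lengths)           # 2 5 6 7
run_starts <- run_ends - runs$lengths + 1    # 1 3 6 7
```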



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] How to find series of small numbers in a big vector?

2007-01-30 Thread jim holtman
You can use 'rle'

> search_range <- c(0.021, 0.029) # inclusive searching
> search_length <- 5   # find ALL series of 5 members within search_range
> my_data <- c(0.900, 0.900, 0.900, 0.900, 0.900,
+ 0.900, 0.900, 0.900, 0.900, 0.900,
+ 0.900, 0.028, 0.024, 0.027, 0.023,
+ 0.022, 0.900, 0.900, 0.900, 0.900,
+ 0.900, 0.900, 0.024, 0.029, 0.023,
+ 0.025, 0.026, 0.900, 0.900, 0.900,
+ 0.900, 0.900, 0.900, 0.900, 0.900,
+ 0.900, 0.900, 0.900, 0.900, 0.022,
+ 0.023, 0.025, 0.333, 0.027, 0.028,
+ 0.900, 0.900, 0.900, 0.900, 0.900)
> # create vector of values within range
> series <- (my_data >= search_range[1]) & (my_data <= search_range[2])
> # determine the 'runs'
> runs <- rle(series)
> # find runs that meet criteria
> long_runs <- which((runs$lengths >= search_length) & (runs$values))
> # create dataframe of indices
> series <- data.frame(start=cumsum(runs$lengths)[long_runs] -
+     runs$lengths[long_runs] + 1,
+     end=cumsum(runs$lengths)[long_runs])
> series
  start end
1    12  16
2    23  27
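
If the search needs to be repeated with other ranges or lengths, the same rle() steps could be wrapped in a small function. This is only a sketch: the name find_runs and its arguments are hypothetical, and treating NA as out of range is an assumption the original post does not address.

```r
# Hypothetical helper: start/end indices of every run of at least
# min_len consecutive values of x lying in [lo, hi] (inclusive).
find_runs <- function(x, lo, hi, min_len) {
  # Assumption: a missing value breaks a run
  in_range <- !is.na(x) & x >= lo & x <= hi
  runs   <- rle(in_range)
  ends   <- cumsum(runs$lengths)        # last index of each run
  starts <- ends - runs$lengths + 1     # first index of each run
  keep   <- runs$values & runs$lengths >= min_len
  data.frame(start = starts[keep], end = ends[keep])
}

# The NA at position 3 splits the values into a run of 2 and a run of 3
res <- find_runs(c(0.025, 0.022, NA, 0.024, 0.026, 0.028),
                 lo = 0.021, hi = 0.029, min_len = 3)
res

# With nothing in range the result is a data frame with zero rows
none <- find_runs(rep(0.900, 10), lo = 0.021, hi = 0.029, min_len = 5)
nrow(none)
```

Returning a data frame even when nothing qualifies keeps the result easy to test with nrow() before using it.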






-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
