[R] How to find series of small numbers in a big vector?
Hello:

I have a vector with 120,000 reals between 0.0 and 0. They are not sorted, but the vector index is the time order of my measurements and therefore cannot be lost.

How do I use R to find the starting and ending index of ANY and ALL the series or sequences in that vector wherever there are 5 or more members in a row between 0.021 and 0.029?

For example:

search_range <- c(0.021, 0.029)  # inclusive searching
search_length <- 5               # find ALL series of 5 members within search_range
my_data <- c(0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.028, 0.024, 0.027, 0.023,
             0.022, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.024, 0.029, 0.023,
             0.025, 0.026, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.022,
             0.023, 0.025, 0.333, 0.027, 0.028,
             0.900, 0.900, 0.900, 0.900, 0.900)

I seek the R program to report: start_index of 12 and an end_index of 16 -- and also -- start_index of 23 and an end_index of 27, because that is where there happen to be search_length numbers within my search_range. It should _not_ report the series at start_index 40, because the 0.333 in there violates the search_range.

I could brute-force hard-code an R program, but perhaps an expert can give me a tip for an easy, elegant existing function or a tactic to approach this? Execution speed or algorithm performance is not important for me in this case. Rather, I seek an easy R solution to find the time windows (starting and ending indices) where 5 or more small numbers in my search_range were measured all in a row.

Advice welcome and many thanks in advance.

Ed Holdgate

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to find series of small numbers in a big vector?
I would try using na.contiguous from package stats. R.utils has seqToIntervals.default, which gets all contiguous intervals of a vector of indices. (I didn't use the latter; help.search("contiguous") gave me that name.)
Re: [R] How to find series of small numbers in a big vector?
I suggest the following approach.

This gives TRUE for all data within the search_range:

A1 = my_data > search_range[1] & my_data < search_range[2]

which() gives us the indices:

A2 = which(A1)

and diff() the gaps between those indices:

A3 = diff(A2)

Hence, if A3 > search_length, we have enough consecutive numbers within the search range. Finally, is this what you wanted to know?

A2[ which(A3 > search_length) ]

On Mon, 2007-01-29 at 17:49 -0800, Ed Holdgate wrote:
> How do I use R to find the starting and ending index of ANY and ALL the series or sequences in that vector wherever there are 5 or more members in a row between 0.021 and 0.029?
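The which()/diff() idea in this reply can be carried through to start and end indices. What follows is only a minimal sketch, assuming the inclusive bounds the original question asks for. Consecutive in-range positions differ by exactly 1, so a new run begins wherever diff() is not 1. The names in_range_idx, run_id and runs are illustrative and do not come from the thread.

```r
search_range <- c(0.021, 0.029)
search_length <- 5
# Same 50 values as in the original question, written compactly with rep()
my_data <- c(rep(0.900, 11), 0.028, 0.024, 0.027, 0.023, 0.022,
             rep(0.900, 6), 0.024, 0.029, 0.023, 0.025, 0.026,
             rep(0.900, 12), 0.022, 0.023, 0.025, 0.333, 0.027, 0.028,
             rep(0.900, 5))

# Positions whose value lies inside the (inclusive) search range.
# Assumption: at least one value is in range; otherwise tapply() below fails.
in_range_idx <- which(my_data >= search_range[1] & my_data <= search_range[2])

# A gap other than 1 between neighbouring positions starts a new run
run_id <- cumsum(c(1, diff(in_range_idx) != 1))

# First and last position of every run, then keep only the long ones
runs <- data.frame(start = as.vector(tapply(in_range_idx, run_id, min)),
                   end   = as.vector(tapply(in_range_idx, run_id, max)))
runs <- runs[runs$end - runs$start + 1 >= search_length, ]
print(runs)
```

On the example data this keeps the runs at 12 to 16 and 23 to 27, and drops the short runs on either side of the 0.333.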
Re: [R] How to find series of small numbers in a big vector?
At 01:49 30/01/2007, Ed Holdgate wrote:
> How do I use R to find the starting and ending index of ANY and ALL the series or sequences in that vector wherever there are 5 or more members in a row between 0.021 and 0.029?

You could look at rle, which codes a vector into runs.

Michael Dewey
http://www.aghmed.fsnet.co.uk
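For readers who have not met rle() before, here is a small illustration of what it returns for a logical vector. The toy vector in_range is made up for this example and is not the data from the thread.

```r
# TRUE marks a value inside the search range, FALSE a value outside it
in_range <- c(FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)

runs <- rle(in_range)
runs$lengths  # 2 3 1 1
runs$values   # FALSE TRUE FALSE TRUE

# Each run ends at the cumulative sum of the run lengths,
# and starts (length - 1) positions before its end
run_end <- cumsum(runs$lengths)          # 2 5 6 7
run_start <- run_end - runs$lengths + 1  # 1 3 6 7
```

The run of three TRUE values therefore occupies positions 3 to 5; filtering on runs$values and runs$lengths picks out the runs of interest.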
Re: [R] How to find series of small numbers in a big vector?
You can use 'rle':

search_range <- c(0.021, 0.029)  # inclusive searching
search_length <- 5               # find ALL series of 5 members within search_range
my_data <- c(0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.028, 0.024, 0.027, 0.023,
             0.022, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.024, 0.029, 0.023,
             0.025, 0.026, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.022,
             0.023, 0.025, 0.333, 0.027, 0.028,
             0.900, 0.900, 0.900, 0.900, 0.900)
# create vector of values within range
series <- (my_data >= search_range[1]) & (my_data <= search_range[2])
# determine the 'runs'
runs <- rle(series)
# find runs that meet criteria
long_runs <- which((runs$lengths >= search_length) & (runs$values))
# create dataframe of indices
series <- data.frame(start=cumsum(runs$lengths)[long_runs] - runs$lengths[long_runs] + 1,
                     end=cumsum(runs$lengths)[long_runs])
series
#   start end
# 1    12  16
# 2    23  27

On 1/30/07, Jonne Zutt wrote:
> I suggest the following approach [...]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
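The rle() steps in the reply above could be wrapped in a small reusable function. This is only a sketch: the function name find_runs and its arguments are hypothetical, not part of base R or of the thread, and treating NA values as out of range is an assumption made here for illustration.

```r
# Hypothetical helper: report every run of at least min_length consecutive
# values of x that lie in the inclusive range [lower, upper].
find_runs <- function(x, lower, upper, min_length) {
  # Assumption: NA counts as "out of range" and so breaks a run
  in_range <- !is.na(x) & x >= lower & x <= upper
  runs <- rle(in_range)
  run_end <- cumsum(runs$lengths)
  run_start <- run_end - runs$lengths + 1
  # Keep only in-range runs that are long enough
  keep <- runs$values & runs$lengths >= min_length
  data.frame(start = run_start[keep],
             end = run_end[keep],
             length = runs$lengths[keep])
}

# Toy usage: one run of two values (too short) and one run of three
x <- c(0.9, 0.025, 0.022, 0.9, 0.021, 0.029, 0.024, 0.9)
result <- find_runs(x, lower = 0.021, upper = 0.029, min_length = 3)
print(result)
```

Returning the run length alongside the indices costs nothing and makes it easy to sort or filter the windows afterwards; calling find_runs(my_data, 0.021, 0.029, 5) on the thread's data should give the same two windows as the code above.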