[R] grouping by consecutive integers
Hello R-helpers!

I have a question concerning extracting sequence information from a vector. I have a vector (representing the bins of a time series where the frequency of occurrences is greater than some threshold) from which I would like to extract the min, median and max of each group of consecutive numbers.

For example:

tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)

I would like to have the max, min and median of the following groups:

24,25
29
35,36,37,38,39,40,41,42,43,44,45,46,47
68,69,70,71

I would like to be able to perform this for many time series, so an automated process would be nice. I am hoping to use this as a peak detection protocol.

Any advice would be greatly appreciated,
Kevin

--
Kevin J Emerson
Center for Ecology and Evolutionary Biology
1210 University of Oregon
Eugene, OR 97403 USA
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grouping by consecutive integers
Look at index vectors in the R intro.

Best,
Niels

On Mon, 24 Jul 2006, Kevin J Emerson wrote:
> Hello R-helpers! I have a question concerning extracting sequence information from a vector. [...]
Re: [R] grouping by consecutive integers
Dear Kevin,

Try something like

groups <- cut(tmp, c(-Inf, tmp[which(diff(tmp) > 1)] + 0.5, Inf))

Sincerely,

Carlos J. Gil Bellosta
http://www.datanalytics.com
http://www.data-mining-blog.com
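A minimal sketch of how the cut() idea could be carried through to the requested statistics. It assumes the break points are placed half a unit above the last value of each run, so they are taken from the values of tmp rather than from their positions. The names gap_after, breaks and stats are illustrative and do not come from the thread.

```r
tmp <- c(24, 25, 29, 35:47, 68:71)

# Positions whose successor is more than one step away (a run ends here).
gap_after <- which(diff(tmp) > 1)

# Breaks sit half a unit above the last value of each run.
breaks <- c(-Inf, tmp[gap_after] + 0.5, Inf)
groups <- cut(tmp, breaks)

# One column per run, with rows min / median / max.
stats <- sapply(split(tmp, groups), function(v) {
  c(min = min(v), median = median(v), max = max(v))
})
t(stats)
```

The factor levels produced by cut() are interval labels such as (25.5,29.5], which is convenient for inspection but less so as group identifiers, so relabelling with as.integer(groups) may be preferable.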
Re: [R] grouping by consecutive integers
Let me clarify one thing that I don't think I made clear in my posting. I am looking for the max, min and median of the indices, not of the time series frequency counts. I want to find the max, min and median time of peaks in a time series, so my question is mostly how to extract the max, min and median of sequential numbers in a vector. I will reword my original posting below.

Hello R-helpers!

I have a question concerning extracting sequence information from a vector. I have a vector (representing the bins of a time series where the frequency of occurrences is greater than some threshold) from which I would like to extract the min, median and max of each group of consecutive numbers in the index vector.

For example:

tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)

I would like to have the max, min and median of the following groups:

24,25: max = 25, min = 24, median = 24.5
29: max = min = median = 29
35,36,37,38,39,40,41,42,43,44,45,46,47: max = 47, min = 35, etc.
68,69,70,71

I would like to be able to perform this for many time series, so an automated process would be nice. I am hoping to use this as a peak detection protocol.

Any advice would be greatly appreciated,
Kevin
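To make the distinction between indices and counts concrete, here is a small sketch. The counts vector and the threshold are made up for illustration, since the thread does not show the real series. The grouping uses a different trick from the ones suggested in the replies: for strictly increasing integers, the difference between each value and its position is constant within a run of consecutive values.

```r
# Hypothetical per-bin counts and cutoff; neither appears in the thread.
counts <- c(0, 1, 5, 7, 2, 0, 0, 6, 1, 4, 8, 9, 3, 0)
threshold <- 3

# Indices of the bins whose count exceeds the threshold.
bins <- which(counts > threshold)

# value - position is constant inside a run of consecutive integers.
runs <- split(bins, bins - seq_along(bins))

# Summaries are taken over the bin indices, not over the counts.
lapply(runs, function(v) c(min = min(v), median = median(v), max = max(v)))
```

Here bins plays the role of tmp in the posting above.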
Re: [R] grouping by consecutive integers
As you do not seem to have received what you consider to be a satisfactory reply, here is a function that I **think** does what you want:

sequences <- function(x, incr = 1) {
  ix <- which(abs(diff(c(FALSE, diff(x) == 1))) == incr)
  if (length(ix) %% 2) c(ix, length(x)) else ix
}

This function gives successive pairs of first and last positions of sequences of increasing values within x that differ by incr. You can then process these pairs however you like, to summarize statistics on the indices and/or the values of the sequences.

Examples:

sequences(c(1:5,50,3:7))
[1] 1 5 7 11

sequences(c(10,1:5,50,3:7))
[1] 2 6 8 12

sequences(c(1:5,50,3:7,10))
[1] 1 5 7 11

sequences(c(10,1:5,50,3:7,10))
[1] 2 6 8 12

Cheers,

--
Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Kevin J Emerson
Sent: Monday, July 24, 2006 9:20 AM
To: Niels Vestergaard Jensen
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] grouping by consecutive integers

> Let me clarify one thing that I don't think I made clear in my posting. [...]
Re: [R] grouping by consecutive integers
This might work:

tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)
# generate breaks
group <- c(0, cumsum(diff(tmp) != 1))
tapply(tmp, group, summary)

$`0`
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  24.00   24.25   24.50   24.50   24.75   25.00

$`1`
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     29      29      29      29      29      29

$`2`
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     35      38      41      41      44      47

$`3`
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  68.00   68.75   69.50   69.50   70.25   71.00

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
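Since the original question asks for something that can be run over many time series, the cumsum(diff(...)) grouping above could be wrapped in a small function. This is only a sketch: run_stats and series_list are hypothetical names, the second series is made up, and the function assumes a non-empty, strictly increasing vector of integer-valued bins.

```r
# Hypothetical helper: one row of min / median / max per run of
# consecutive values in x.
run_stats <- function(x) {
  # Assumption: x is non-empty and strictly increasing.
  stopifnot(length(x) > 0, all(diff(x) > 0))
  # A new group starts wherever the step from the previous value is not 1.
  group <- cumsum(c(TRUE, diff(x) != 1))
  data.frame(
    min    = as.vector(tapply(x, group, min)),
    median = as.vector(tapply(x, group, median)),
    max    = as.vector(tapply(x, group, max))
  )
}

tmp <- c(24, 25, 29, 35:47, 68:71)
res <- run_stats(tmp)

# Applying the same helper to several series (second series is made up).
series_list <- list(a = tmp, b = c(3, 4, 8, 10, 11, 12))
all_stats <- lapply(series_list, run_stats)
```

Returning a data frame rather than a list of summary() objects makes it straightforward to pick out the median index of each peak afterwards, for example res$median.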
Re: [R] grouping by consecutive integers: Correction
Sorry, all. My previous post was mixed up. Here's the corrected version:

sequences <- function(x, incr = 1) {
  ix <- which(abs(diff(c(FALSE, diff(x) == incr))) == 1)
  if (length(ix) %% 2) c(ix, length(x)) else ix
}

--
Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box
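A sketch of how the start/end positions returned by the corrected sequences() might be turned into the requested statistics for the tmp vector from the original question. The function is reproduced so the example is self-contained; run_bounds and run_summary are illustrative names that do not come from the thread.

```r
# Corrected function from the message above.
sequences <- function(x, incr = 1) {
  ix <- which(abs(diff(c(FALSE, diff(x) == incr))) == 1)
  if (length(ix) %% 2) c(ix, length(x)) else ix
}

tmp <- c(24, 25, 29, 35:47, 68:71)

# Reshape the flat vector of positions into one (first, last) row per run.
run_bounds <- matrix(sequences(tmp), ncol = 2, byrow = TRUE,
                     dimnames = list(NULL, c("first", "last")))

# Summarise the values of tmp that fall inside each run.
run_summary <- t(apply(run_bounds, 1, function(b) {
  v <- tmp[b["first"]:b["last"]]
  c(min = min(v), median = median(v), max = max(v))
}))
```

Worked through by hand, sequences(tmp) yields the positions 1 2 4 16 17 20, that is, three runs. The isolated value 29 at position 3 is not reported, because the function only detects runs of at least two elements. If single-element groups matter for peak detection, as they do in the original question, a grouping based on cumsum(diff(tmp) != 1) keeps them.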