Re: [R] Change points in R

2014-01-31 Thread Achim Zeileis

On Fri, 31 Jan 2014, Benjamin Ward (ENV) wrote:


Hi R helpers,

I have a set of data best shown in the graph below.

Each coloured line represents a statistic calculated across pairs of DNA 
sequences. For each coloured line, I would like to identify breakpoints, 
that is, the chunks where the values are high. For example, in the light 
blue line there is a large high segment just after x=2e+05. While googling 
how to find such points, I read about change-point analysis, which is used 
with time series data, and I wondered whether it or a variant of it in R 
might be of use here. The data are a series of % values (double), all a 
single measurement: for each line, a 'scanner' passed over two sequences 
and recorded the % value at each step. Can change-point analysis help me 
here, and if so, which package or method will allow me to do this while 
making as few assumptions about my data as possible?


The graph didn't make it through but from what you describe it seems that 
the tilingArray package on Bioconductor would be helpful for you. See 
also Huber et al. (2006, Bioinformatics, 22(16), 1963-1970).


Other useful packages on CRAN include bcp, changepoint, cpm, segmented, 
and strucchange (among others).
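
As a rough illustration of what one of these packages does, here is a 
minimal sketch using strucchange to fit a piecewise-constant mean to a 
single series. The data are simulated stand-ins for one coloured line 
(the real values were not posted), and the variable names and the 
minimal segment size h = 0.1 are assumptions made for the example, not 
recommendations for the actual data.

```r
library(strucchange)

# Hypothetical stand-in for one coloured line: % values along the sequence,
# with a "high" chunk between positions 101 and 150.
set.seed(1)
y <- c(rnorm(100, mean = 10, sd = 2),
       rnorm(50,  mean = 30, sd = 2),
       rnorm(100, mean = 10, sd = 2))

# Fit a constant mean per segment. h is the minimal segment size, given
# here as a fraction of the number of observations (an assumed value).
bp <- breakpoints(y ~ 1, h = 0.1)

# Extract the breaks for the number of segments with minimal BIC.
est_breaks <- breakpoints(bp)$breakpoints

# Mean level within each estimated segment; high chunks show up as
# segments with a large mean.
segment_means <- tapply(y, breakfactor(bp), mean)

print(est_breaks)
print(segment_means)
```

Calling plot(bp) shows how the residual sum of squares and BIC change 
with the number of breaks, which can help in judging whether the 
selected number of segments is reasonable.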



Thanks in advance,

Ben W.

[X]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





[R] Change points in R

2014-01-30 Thread Benjamin Ward (ENV)
Hi R helpers,

I have a set of data best shown in the graph below.

Each coloured line represents a statistic calculated across pairs of DNA 
sequences. For each coloured line, I would like to identify breakpoints, that 
is, the chunks where the values are high. For example, in the light blue line 
there is a large high segment just after x=2e+05. While googling how to find 
such points, I read about change-point analysis, which is used with time 
series data, and I wondered whether it or a variant of it in R might be of use 
here. The data are a series of % values (double), all a single measurement: 
for each line, a 'scanner' passed over two sequences and recorded the % value 
at each step. Can change-point analysis help me here, and if so, which package 
or method will allow me to do this while making as few assumptions about my 
data as possible?

Thanks in advance,

Ben W.

 [X]

[[alternative HTML version deleted]]
