On Fri, 31 Jan 2014, Benjamin Ward (ENV) wrote:
Hi R helpers,
I have a set of data best shown in the graph below.
Each coloured line represents a statistic calculated across pairs of DNA
sequences. For each coloured line, I would like to identify breakpoints,
i.e. the chunks where the values are high. For example, in the light
blue line there is a large high segment just after x=2e+05. While
googling for ways to find such points, I read about change-point
analysis, which is used with time series data, and I wondered whether it
or a variant of it in R might be of use here. The data are a series of
% values (double), all from a single measurement: for each line, a
'scanner' passed over two sequences and recorded the % value at each
step. Can change-point analysis help me here, and if so, what package or
method will allow me to do this while making as few assumptions about my
data as possible?
The graph didn't make it through but from what you describe it seems that
the tilingArray package on Bioconductor would be helpful for you. See
also Huber et al. (2006, Bioinformatics, 22(16), 1963-1970).
Other useful packages on CRAN include bcp, changepoint, cpm, segmented
and strucchange (among others).
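
As a rough illustration of what such an analysis could look like, here is a minimal sketch using the changepoint package from the list above. The data are simulated stand-ins for one coloured line, not the poster's data, and the cutoff of 25 for calling a segment "high" is an arbitrary value chosen for the example. Note that cpt.mean() with its default test statistic assumes roughly normal noise within each segment, so it may not be the least-assumption option.

```r
# Assumes the CRAN package 'changepoint' is installed:
# install.packages("changepoint")
library(changepoint)

set.seed(1)
# Simulated stand-in for one coloured line: % values with one elevated stretch
y <- c(rnorm(200, mean = 10), rnorm(100, mean = 40), rnorm(200, mean = 10))

# Detect shifts in the mean using the PELT search with the MBIC penalty.
# The default test statistic assumes approximately normal noise per segment.
fit <- cpt.mean(y, method = "PELT", penalty = "MBIC")

# Indices of the last point of each segment (the end of the series is excluded)
cps <- cpts(fit)

# Turn the change points into segments and flag the ones with a high mean
starts <- c(1, cps + 1)
ends <- c(cps, length(y))
seg_mean <- mapply(function(s, e) mean(y[s:e]), starts, ends)
segments <- data.frame(start = starts, end = ends, mean = seg_mean,
                       high = seg_mean > 25)  # 25 is an illustrative cutoff
print(segments)
```

With real data, each coloured line would be passed through the same steps separately, and plot(fit) can be used to check the fitted segments visually.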
Thanks in advance,
Ben W.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.