P.,
The error message from aggregate isn't very informative; I'll clean it up.

The aggregate function threw an error for the cov.y object because the ranges in allPeaks reference indices outside the bounds of cov.y: cov.y is an Rle of length 11, whereas allPeaks includes the interval [17, 19]. If you know the length of the underlying sequence, you can pass it to the width argument of the coverage function. For example, if the underlying sequence is of length 19, then the coverage from the y ranges would be calculated as shown below. (I also added code for a more efficient summation within the specified ranges.)

> cov.y <- coverage(y, width = 19)
> cov.y
'integer' Rle of length 19 with 5 runs
 Lengths:  3 2 4 2 8
 Values :  0 3 0 3 0
> y.counts <- aggregate(cov.y, allPeaks, sum)
> y.counts
[1] 6 0
> y.counts.efficient <- viewSums(Views(cov.y, allPeaks))
> y.counts.efficient
[1] 6 0
> sessionInfo()
R version 2.10.1 Patched (2009-12-14 r50738)
i386-apple-darwin9.8.0

locale:
[1] C/en_US.UTF-8/C/C/C/C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] IRanges_1.4.9

loaded via a namespace (and not attached):
[1] tools_2.10.1
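
If the true sequence length is not at hand, one option for a toy example like this is to reuse the length of cov.x as the width for cov.y, so both coverage vectors span the same positions. The following is only a sketch under that assumption: it presumes no y range extends past the last x range (otherwise use the real sequence length), and the name peakRanges and the bounds check are illustrative additions, not part of the handout.

```r
library(IRanges)

# Toy ranges from the original message
x <- IRanges(start = c(1L, 9L, 4L, 1L, 5L, 10L, 15L, 17L, 17L),
             width = c(5L, 6L, 3L, 4L, 3L, 3L, 5L, 3L, 3L))
y <- IRanges(start = c(4L, 4L, 4L, 10L, 10L, 10L),
             width = c(2L, 2L, 2L, 2L, 2L, 2L))

cov.x <- coverage(x)
# Assumption: cov.x already covers the whole region of interest,
# so its length is reused as the width for cov.y
cov.y <- coverage(y, width = length(cov.x))

allPeaks <- slice(cov.x, lower = 3)
# Plain ranges extracted from the views (peakRanges is an illustrative name)
peakRanges <- IRanges(start = start(allPeaks), end = end(allPeaks))

# Guard against the out-of-bounds condition described above
stopifnot(max(end(peakRanges)) <= length(cov.y))

x.counts <- viewSums(Views(cov.x, peakRanges))
y.counts <- viewSums(Views(cov.y, peakRanges))
```

With the toy data, x.counts and y.counts should match the values shown earlier in this thread (6 9 and 6 0).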


Cheers,
Patrick


[email protected] wrote:
Dear bioc-sig-sequencing,

I am working with a toy example to learn the material covered in part 3 
(Differential expression, pages 10-11) of 'A ChIP-Seq Data Analysis' handout 
for a 11/19/09 session at the 'High throughput sequence analysis tools and 
approaches with Bioconductor' workshop in Seattle.

I generated an error message in the following output.  Can you comment?
(I note that when I use the sample data & code from the handout, ctcf.rda & 
gfp.rda, no errors are generated)

x <- IRanges(start=c(1L, 9L, 4L, 1L, 5L, 10L, 15L, 17L, 17L),
+                     width=c(5L, 6L, 3L, 4L, 3L, 3L, 5L, 3L, 3L))

y <- IRanges(start=c(4L, 4L, 4L, 10L, 10L, 10L),
+                     width=c(2L, 2L, 2L, 2L, 2L, 2L))

cov.x <- coverage(x)
cov.y <- coverage(y)
allPeaks <- slice(cov.x, lower = 3)
allPeaks
Views on a 19-length Rle subject

views:
    start end width
[1]     4   5     2 [3 3]
[2]    17  19     3 [3 3 3]
x.counts <- aggregate(cov.x, allPeaks, sum)
x.counts
[1] 6 9
y.counts <- aggregate(cov.y, allPeaks, sum)
Error in findIntervalAndStartFromWidth(start, runLength(x)) :
  'x' must be less than 'sum(width)'

sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ChIPseqTutorial_0.0.1              BSgenome.Mmusculus.UCSC.mm9_1.3.16
[3] chipseq_0.2.0                      ShortRead_1.4.0
[5] lattice_0.17-26                    BSgenome_1.14.0
[7] Biostrings_2.14.1                  IRanges_1.4.2

loaded via a namespace (and not attached):
[1] Biobase_2.6.0 grid_2.10.1   hwriter_1.1

Thanks,
P. Terry
huskers.unl.edu

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

