Thanks for the responses. On Mon, Oct 19, 2009, Deepayan wrote: > for most downstream operations the default choice of 'width' > seemed OK.Could you explain why that is not sufficient in your case?
I didn't know there was a default choice. I thought that without a 'width' argument, coverage is only calculated between 0 and the right-most read (i.e. highest integer position). For instance, I create a GenomeData object from a set of aligned reads: > str(aln.gd$chr2L) List of 2 $ -: int [1:538743] 8223524 17606696 [...] 2021681 21292267 ... $ +: int [1:542167] 22057110 2479626 [...] 17195586 21698792 ... handing this to extendReads(), will extend those positions, but no where is it known that this is drosophila data or that chr2L actually has 23,011,544 bases. So handing the extended reads to coverage(), would only calculate coverage between 0 and the highest extended read (or so I thought). I want to extend the reads to the ends of the chromosomes, because what I actually want to do is to calculate coverage for an IP sample, and a Whole Cell Extract sample, and then divide the two (perhaps after adding 1 to each position to avoid dividing by zero) to create a ratio - and then export to WIG. e.g. where ypd is a GenomeDataList object: ip1 <- extendReads(ypd$ip$chrI, seqLen=200) wce1 <- extendReads(ypd$wce$chrI, seqLen=200) ip1.cov <- coverage(ip1, width=230208) wce1.cov <- coverage(wce1, width=230208) ip1.cov <- ip1.cov + 1 wce1.cov <- wce1.cov + 1 rat <- log2(ip1.cov/wce1.cov) export(rat, "ip1_wce1_rat.wig") (not perfect, but goes a long way to exploring the data.) The ratios represent signal over background, and one can investigate normalization, smoothing, peaks, etc. I've figured out how to do all these things with single chromosomes, but as Michael has pointed out, it's much more natural (and elegant) to do this for all the chromosomes at once. -Chris Deepayan wrote: > On Mon, Oct 19, 2009 at 10:25 AM, Chris Seidel <seidel at phaget4.org> wrote: > > In using gdapply to get coverage over all chromosomes of a > > GenomeDataList object: > > > > cov <- gdapply(extendedReads, coverage) > > > > how do you supply the "width" argument to coverage? When I try handing > > it a vector of chromosome lengths, it complains that it needs a single > > value. > > You cannot. That's a limitation of gdapply, potentially fixable by > making the chromosome lengths available as phenodata associated with > the "GenomeDataList" object. > > One reason for not addressing this limitation yet is that for most > downstream operations the default choice of 'width' seemed OK. Could > you explain why that is not sufficient in your case? > > -Deepayan _______________________________________________ Bioc-sig-sequencing mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
