On Tue, Jun 8, 2010 at 5:10 PM, Janet Young <[email protected]> wrote:
> Hi, > > This is probably a simple question, but I can't quite figure it out for > myself. I've been using RangedData (etc) quite a lot and have often found > myself looping through the spaces when I probably don't need to. I'm trying > not to do that now... > > I have some scores on a bunch of genomic intervals I'm interested in > (they're conservation scores in 4kb promoter regions, e.g. phastCons > scores). I've stored these as a SimpleRleList object. > > I also have some smaller subregions (matches to transcription factor > motifs) and I want to compare the distribution of scores inside those > subregions to scores outside those subregions. I have the regions as > RangedData objects, over multiple spaces. I want to do this on individual > base pair scores, rather than by taking subregion means and comparing those > (so I don't want to use viewSums). > > I've managed to get the scores inside and outside the regions using the > RleViewsList function (following the example in this thread: > http://thread.gmane.org/gmane.comp.lang.r.sequencing/403/focus=408 ). The > scores are now as SimpleRleViewsList. > > I'm currently looping through the scores (the SimpleRleViewsList) space by > space, converting to simple vector and concatenating the scores, but I > imagine there's a way to do it without looping. Any ideas? > > Below is a simple example (extended from the older thread). My real scores > are not integers and there's many more of them. > > ######################### > library(IRanges) > > ### generate some arbitrary scores > track <- RangedData(RangesList(chrA = IRanges(start = c(1, 4, 6), > width=c(3, 2, 4)),chrB = IRanges(start = c(1, 3, 6), width=c(3, 3, 4))) ) > trackCoverage <- coverage(track, weight=list(chrA=c(2,7,3),chrB=c(1,1,1)) ) > > ### define subregions > exons <- RangesList(chrA = IRanges(start = c(2, 4), width = c(2, 2)),chrB = > IRanges(start = 3, width = 5)) > > Views are not necessary in your case. Just seqselect() off of the RleList with a RangesList. This all works as one would expect. seqselect(trackCoverage, exons) That gives you an RleList. That can be unlist()'d to an Rle. In general, it does not make sense to unlist a list of views, like RleViewsList, since a Views can contain only a single subject. Michael > ###get exon and intron scores > exonView <- RleViewsList(rleList=trackCoverage, rangesList=exons) > intronView <- RleViewsList(rleList=trackCoverage, > rangesList=gaps(ranges(RangedData(exons)),start=1,end=9)) > > ### now convert those to simple integer objects. Loop through chromosomes. > Can I do this without looping? > ### (I tried using unlist, but it does something weird to the scores, as it > assumes they're all from the same space) > exonScores <- integer() > intronScores <- integer() > for (chr in names(exonView) ) { > thischrexonscores <- as.integer(seqselect(trackCoverage[[chr]], > start=start(exonView[[chr]]),end=end(exonView[[chr]]) )) > exonScores <- c(exonScores,thischrexonscores) > > thischrintronscores <- as.integer(seqselect(trackCoverage[[chr]], > start=start(intronView[[chr]]),end=end(intronView[[chr]]) )) > intronScores <- c(intronScores,thischrintronscores) > > } > > ###and do something with the scores to compare distributions > wilcox.test(exonScores,intronScores) > > ######################### > > thanks, > > Janet > > ------------------------------------------------------------------- > > Dr. Janet Young (Trask lab) > > Fred Hutchinson Cancer Research Center > 1100 Fairview Avenue N., C3-168, > P.O. Box 19024, Seattle, WA 98109-1024, USA. > > tel: (206) 667 1471 fax: (206) 667 6524 > email: jayoung ...at... fhcrc.org > > http://www.fhcrc.org/labs/trask/ > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing > [[alternative HTML version deleted]] _______________________________________________ Bioc-sig-sequencing mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
