On Tue, Jun 8, 2010 at 5:10 PM, Janet Young <[email protected]> wrote:

> Hi,
>
> This is probably a simple question, but I can't quite figure it out for
> myself.  I've been using RangedData (etc) quite a lot and have often found
> myself looping through the spaces when I probably don't need to. I'm trying
> not to do that now...
>
> I have some scores on a bunch of genomic intervals I'm interested in
> (they're conservation scores in 4kb promoter regions, e.g. phastCons
> scores).  I've stored these as a SimpleRleList object.
>
> I also have some smaller subregions (matches to transcription factor
> motifs) and I want to compare the distribution of scores inside those
> subregions to scores outside those subregions.  I have the regions as
> RangedData objects, over multiple spaces. I want to do this on individual
> base pair scores, rather than by taking subregion means and comparing those
> (so I don't want to use viewSums).
>
> I've managed to get the scores inside and outside the regions using the
> RleViewsList function (following the example in this thread:
> http://thread.gmane.org/gmane.comp.lang.r.sequencing/403/focus=408 ).  The
> scores are now as SimpleRleViewsList.
>
> I'm currently looping through the scores (the SimpleRleViewsList) space by
> space, converting to simple vector and concatenating the scores, but I
> imagine there's a way to do it without looping. Any ideas?
>
> Below is a simple example (extended from the older thread).  My real scores
> are not integers and there's many more of them.
>
> #########################
> library(IRanges)
>
> ### generate some arbitrary scores
> track <- RangedData(RangesList(chrA = IRanges(start = c(1, 4, 6),
> width=c(3, 2, 4)),chrB = IRanges(start = c(1, 3, 6), width=c(3, 3, 4))) )
> trackCoverage <- coverage(track, weight=list(chrA=c(2,7,3),chrB=c(1,1,1)) )
>
> ### define subregions
> exons <- RangesList(chrA = IRanges(start = c(2, 4), width = c(2, 2)),chrB =
> IRanges(start = 3, width = 5))
>
>
Views are not necessary in your case. Just seqselect() off of the RleList
with a RangesList. This all works as one would expect.

seqselect(trackCoverage, exons)

That gives you an RleList. That can be unlist()'d to an Rle.

In general, it does not make sense to unlist a list of views, like
RleViewsList, since a Views can contain only a single subject.

Michael


> ###get exon and intron scores
> exonView <- RleViewsList(rleList=trackCoverage, rangesList=exons)
> intronView <- RleViewsList(rleList=trackCoverage,
> rangesList=gaps(ranges(RangedData(exons)),start=1,end=9))
>
>
### now convert those to simple integer objects.  Loop through chromosomes.
> Can I do this without looping?
> ### (I tried using unlist, but it does something weird to the scores, as it
> assumes they're all from the same space)
> exonScores <- integer()
> intronScores <- integer()
> for (chr in names(exonView) ) {
>     thischrexonscores <- as.integer(seqselect(trackCoverage[[chr]],
> start=start(exonView[[chr]]),end=end(exonView[[chr]]) ))
>     exonScores <- c(exonScores,thischrexonscores)
>
>     thischrintronscores <- as.integer(seqselect(trackCoverage[[chr]],
> start=start(intronView[[chr]]),end=end(intronView[[chr]]) ))
>     intronScores <- c(intronScores,thischrintronscores)
>
> }
>
> ###and do something with the scores to compare distributions
> wilcox.test(exonScores,intronScores)
>
> #########################
>
> thanks,
>
> Janet
>
> -------------------------------------------------------------------
>
> Dr. Janet Young (Trask lab)
>
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Avenue N., C3-168,
> P.O. Box 19024, Seattle, WA 98109-1024, USA.
>
> tel: (206) 667 1471 fax: (206) 667 6524
> email: jayoung  ...at...  fhcrc.org
>
> http://www.fhcrc.org/labs/trask/
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>

        [[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Reply via email to