On Fri, Jul 24, 2009 at 8:28 AM, Joern Toedling <[email protected]> wrote:

> Hello,
>
> I guess the mentioned functions certainly qualify as 'proper' overlap
> functions. What you probably want is a relatively straightforward
> post-processing of the result. Below is a function that I have written to
> restrict overlapping pairs to those pairs that overlap by at least a
> specified fraction of the smaller interval's length. Setting this fraction
> to 1.0 will only give you pairs in which one of the intervals is contained
> in the other one. This function uses genomeIntervals, but I am sure that
> post-processing the IRanges is equally straightforward.
>
> Hope this helps,
> Joern
>
>
> fracOverlap <- function(I1, I2, min.frac=1.0){
>  require("genomeIntervals")
>  # both arguments must be Genome_intervals objects
>  stopifnot(inherits(I1,"Genome_intervals"),
>            inherits(I2,"Genome_intervals"))
>  ov <- interval_overlap(I1,I2)
>  # get base pair overlap
>  lens <- sapply(ov, length)
>  overlap1 <- rep(seq_along(ov), lens)
>  overlap2 <- unlist(ov, use.names=FALSE)
>  left <- pmax(I1[overlap1,1], I2[overlap2,1])
>  right <- pmin(I1[overlap1,2], I2[overlap2,2])
>  stopifnot(all(right >= left))
>  bases <- right-left+1
>  min.len <- pmin(I1[overlap1,2]- I1[overlap1,1]+1,
>                  I2[overlap2,2]- I2[overlap2,1]+1)
>  frac <- bases/min.len
>  res <- data.frame("Index1"=overlap1, "Index2"=overlap2,
>                    "n"=bases, "fraction"=round(frac, digits=2))
>  # filter on the unrounded fraction, so that e.g. 0.996 does not
>  # pass min.frac=1.0 after rounding
>  res <- res[frac >= min.frac, ]
>  return(res)
> }# fracOverlap
>

Approximate IRanges equivalent:
ol <- overlap(a, b)
as.matrix(ol)[width(ranges(ol, b, a)) >= min.frac*width(b),]
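
For anyone who wants to check the logic without either package, here is a
minimal base-R sketch of the same idea: overlap measured as a fraction of the
shorter interval, with closed integer coordinates. The function name
frac_overlap_base and the example coordinates are made up for illustration and
are not part of IRanges or genomeIntervals. It compares all pairs, so it is
only meant for small inputs.

```r
# Hypothetical helper, not part of IRanges or genomeIntervals.
# Intervals are closed: [start, end] covers end - start + 1 bases.
frac_overlap_base <- function(start1, end1, start2, end2, min.frac = 1.0) {
  stopifnot(length(start1) == length(end1),
            length(start2) == length(end2))
  # enumerate every pair (Index1 from set 1, Index2 from set 2)
  idx <- expand.grid(Index1 = seq_along(start1),
                     Index2 = seq_along(start2))
  left  <- pmax(start1[idx$Index1], start2[idx$Index2])
  right <- pmin(end1[idx$Index1],   end2[idx$Index2])
  # number of shared bases; 0 when the pair does not overlap at all
  bases <- pmax(right - left + 1, 0)
  min.len <- pmin(end1[idx$Index1] - start1[idx$Index1] + 1,
                  end2[idx$Index2] - start2[idx$Index2] + 1)
  idx$n <- bases
  idx$fraction <- bases / min.len
  # keep overlapping pairs that reach the requested fraction
  res <- idx[idx$n > 0 & idx$fraction >= min.frac, , drop = FALSE]
  rownames(res) <- NULL
  res
}

# Illustrative coordinates for the two cases in the original question:
# Range 1 = [1, 23]; [6, 22] lies inside it, [18, 34] only partly overlaps.
contained   <- frac_overlap_base(1, 23, c(6, 18), c(22, 34), min.frac = 1.0)
any_overlap <- frac_overlap_base(1, 23, c(6, 18), c(22, 34), min.frac = 0)
```

With min.frac = 1.0 only the contained pair is kept, while min.frac = 0
returns every pair that shares at least one base.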


>
> On Fri, 24 Jul 2009 16:47:04 +0200, Johannes Waage wrote
> > Hi all,
> >
> > In assigning RNA-seq data to exon-models, I'm looking for a proper
> > overlap function. Both IRanges and genomeIntervals have overlap
> > functions, but as far as I can see, these don't have options for
> > contained overlaps, for example:
> >
> > |-------Range 1-------]
> >      [----Range 2----]
> >
> > IRanges, genomeIntervals: TRUE
> > Wanted: TRUE
> >
> > |-------Range 1-------]
> >                  [----Range 2----]
> >
> > IRanges, genomeIntervals: TRUE
> > Wanted: FALSE
> >
> > Any suggestions are appreciated!
> >
> > Regards,
> > Johannes Waage,
> > Uni. of Copenhagen
> >
> > _______________________________________________
> > Bioc-sig-sequencing mailing list
> > [email protected]
> > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>
> ---
> Joern Toedling
> Institut Curie -- U900
> 26 rue d'Ulm, 75005 Paris, FRANCE
> Tel. +33 (0)156246926
>