Thank you both, that elegantly solved my problem!

Regards,
Johannes

On Fri, Jul 24, 2009 at 5:50 PM, Michael Lawrence <[email protected]> wrote:

>
>
> On Fri, Jul 24, 2009 at 8:28 AM, Joern Toedling <[email protected]> wrote:
>
>> Hello,
>>
>> I guess the mentioned functions certainly qualify as 'proper' overlap
>> functions. What you probably want is a relatively straightforward
>> post-processing of the result. Below is a function that I have written to
>> restrict overlapping pairs to those pairs that overlap by at least a
>> specified fraction of the smaller interval's length. Setting this fraction
>> to 1.0 will only give you pairs in which one of the intervals is contained
>> in the other one. This function uses genomeIntervals, but I am sure that
>> post-processing the IRanges is equally straightforward.
>>
>> Hope this helps,
>> Joern
>>
>>
>> fracOverlap <- function(I1, I2, min.frac=1.0){
>>  require("genomeIntervals")
>>  stopifnot(inherits(I1,"Genome_intervals"),
>>            inherits(I2,"Genome_intervals"))
>>  ov <- interval_overlap(I1,I2)
>>  # get base pair overlap
>>  lens <- sapply(ov, length)
>>  overlap1 <- rep(1:length(ov), lens)
>>  overlap2 <- unlist(ov, use.names=FALSE)
>>  left <- pmax(I1[overlap1,1], I2[overlap2,1])
>>  right <- pmin(I1[overlap1,2], I2[overlap2,2])
>>  stopifnot(all(right >= left))
>>  bases <- right-left+1
>>  min.len <- pmin(I1[overlap1,2]- I1[overlap1,1]+1,
>>                  I2[overlap2,2]- I2[overlap2,1]+1)
>>  frac <- round(bases/min.len, digits=2)
>>  res <- data.frame("Index1"=overlap1, "Index2"=overlap2,
>>                    "n"=bases, "fraction"=frac)
>>  res <- subset(res, fraction >= min.frac)
>>  return(res)
>> }# fracOverlap
>>
>
> Approximate IRanges equivalent:
> ol <- overlap(a, b)
> as.matrix(ol)[width(ranges(ol, b, a)) >= min.frac*width(b),]
>
>
>>
>> On Fri, 24 Jul 2009 16:47:04 +0200, Johannes Waage wrote
>> > Hi all,
>> >
>> > In assigning RNA-seq data to exon-models, I'm looking for a proper
>> > overlap function. Both IRanges and genomeIntervals have overlap functions,
>> > but as far as I can see, these don't have options for contained
>> > overlaps, for example:
>> >
>> > [-------Range 1-------]
>> >      [----Range 2----]
>> >
>> > IRanges, genomeIntervals: TRUE
>> > Wanted: TRUE
>> >
>> > [-------Range 1-------]
>> >                  [----Range 2----]
>> >
>> > IRanges, genomeIntervals: TRUE
>> > Wanted: FALSE
>> >
>> > Any suggestions are appreciated!
>> >
>> > Regards,
>> > Johannes Waage,
>> > Uni. of Copenhagen
>> >
>> >       [[alternative HTML version deleted]]
>> >
>> > _______________________________________________
>> > Bioc-sig-sequencing mailing list
>> > [email protected]
>> > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>>
>> ---
>> Joern Toedling
>> Institut Curie -- U900
>> 26 rue d'Ulm, 75005 Paris, FRANCE
>> Tel. +33 (0)156246926
>>
>
>

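The post-processing idea discussed in the thread above can be tried out without either package. What follows is only a minimal base-R sketch, assuming closed integer coordinates given as plain start/end vectors; the name frac_overlap_base and its arguments are made up for illustration and are not part of genomeIntervals or IRanges. It compares all pairs, so it is only meant for small inputs, and unlike the fracOverlap function quoted above it does not round the fraction.

```r
# Hypothetical helper (not from genomeIntervals or IRanges): keep only pairs
# whose overlap covers at least `min.frac` of the shorter interval.
# Coordinates are assumed to be closed integer intervals [start, end].
frac_overlap_base <- function(start1, end1, start2, end2, min.frac = 1.0) {
  stopifnot(length(start1) == length(end1), length(start2) == length(end2),
            all(end1 >= start1), all(end2 >= start2))
  # Enumerate every (Index1, Index2) pair; quadratic, fine for small inputs
  idx <- expand.grid(Index1 = seq_along(start1), Index2 = seq_along(start2))
  left  <- pmax(start1[idx$Index1], start2[idx$Index2])
  right <- pmin(end1[idx$Index1],   end2[idx$Index2])
  bases <- right - left + 1          # values <= 0 mean the pair does not overlap
  min.len <- pmin(end1[idx$Index1] - start1[idx$Index1] + 1,
                  end2[idx$Index2] - start2[idx$Index2] + 1)
  res <- data.frame(idx, n = bases, fraction = bases / min.len)
  res <- res[res$n > 0 & res$fraction >= min.frac, , drop = FALSE]
  rownames(res) <- NULL
  res
}

# Range 1 = [1, 20]; first Range 2 = [5, 20] (contained),
# second Range 2 = [15, 30] (partial overlap only)
contained <- frac_overlap_base(1, 20, c(5, 15), c(20, 30), min.frac = 1.0)
partial   <- frac_overlap_base(1, 20, c(5, 15), c(20, 30), min.frac = 0.25)
```

With min.frac = 1.0 only the contained pair is kept, which matches the "Wanted: TRUE / Wanted: FALSE" cases in the original question; lowering min.frac lets the partial overlap through as well. Current IRanges releases also offer findOverlaps(query, subject, type = "within"), which may cover the fully contained case directly.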