Hi there, may be I am missing something obvious in my code. I extracted information from a bam file on 11 regions. when I calculate the coverage on one of these regions, I get a vector which is much longer than the number of bases of the region itself.
I tried several times to replicate this problem on a small IRanges object but - of course - everything was ok. Anyone has an idea? that's what I've done: > regions chr start end 1 1 195915909 197867662 2 2 1199920 2234892 [...] scanned my bam within those regions which=GRanges(regions$chr,IRanges(regions$start,regions$end)) ggf<-scanBam("s_7_1_sequence.txt.novo.rmdup.bam_sorted.bam", param=ScanBamParam(which=which, what = c("pos", "qwidth"), flag =scanBamFlag(isUnmappedQuery =FALSE))) created a summary function, as suggested in the IRanges vignette summaryFunction<-function(seqname,bamfile,...){ x<-bamfile[[seqname]] coverage(IRanges(x[["pos"]],width=x[["qwidth"]])) } applied to get a list of coverage for each region seqnames<-names(ggf) cvg<-lapply(seqnames,summaryFunction,ggf) names(cvg)=seqnames then I extracted one of those regions, and computed the vector coverage areacov<-cvg[["14:20404091-35890861"]] areacov.vect<-as.vector(areacov) however positions=seq(from=20404091,to=35890861,by=1) > areacov 'integer' Rle of length 35890857 with 445958 runs Lengths: 20404015 1 1 1 ... 25 121 76 Values : 0 2 5 6 ... 1 0 1 > length(positions) [1] 15486771 as you can see from here, the Rle has a length of 35890857 even considering that my area is a bit larger, 'cause it gets the reads positions > min(ggf[["14:20404091-35890861"]][["pos"]]) [1] 20404016 > max(ggf[["14:20404091-35890861"]][["pos"]]) [1] 35890782 I still have a coverage vector much longer then expected. I am surely doing something wrong, but the procedure seemed straigthforward. Any clue to help? cheers, Francesco -------------------------------------- Francesco Lescai, PhD Research Associate in Genome Analysis Faculty of Biomedical Sciences, Division of Research Strategy c/o UCL Cancer Institute, Paul O'Gorman Building University College London Gower Street WC1E 6BT London, UK visiting address: 72, Huntley Street WC1E 6DE London email: f.les...@ucl.ac.uk phone: +44.(0)20.7679.6812 [ext: 46812] -------------------------------------- _______________________________________________ Bioc-sig-sequencing mailing list Bioc-sig-sequencing@r-project.org https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing