Here is an example where the order changes between the GRanges object and the countBam data.frame
> aa GRanges with 5 ranges and 2 elementMetadata values seqnames ranges strand | tx_id <Rle> <IRanges> <Rle> | <integer> ENSMUSG00000000001.1 chr3 [107910199, 107949007] - | 9523 ENSMUSG00000000003.2 chrX [ 75083240, 75098865] - | 29303 ENSMUSG00000000028.4 chr16 [ 18780546, 18811724] - | 43866 ENSMUSG00000000028.6 chr16 [ 18780546, 18812065] - | 43865 ENSMUSG00000000028.7 chr16 [ 18807449, 18812080] - | 43868 tx_name <character> ENSMUSG00000000001.1 ENSMUST00000000001 ENSMUSG00000000003.2 ENSMUST00000000003 ENSMUSG00000000028.4 ENSMUST00000115586 ENSMUSG00000000028.6 ENSMUST00000000028 ENSMUSG00000000028.7 ENSMUST00000115585 > countBam("bowtie_aligned.prefix.bam",param=ScanBamParam(which=aa)) space start end width file records nucleotides 1 chr3 107910199 107949007 38809 bowtie_aligned.prefix.bam 171 8550 2 chr16 18780546 18811724 31179 bowtie_aligned.prefix.bam 16 800 3 chr16 18780546 18812065 31520 bowtie_aligned.prefix.bam 17 850 4 chr16 18807449 18812080 4632 bowtie_aligned.prefix.bam 2 100 5 chrX 75083240 75098865 15626 bowtie_aligned.prefix.bam 0 0 It seems that all that is happening is GRanges is sorted by the seqnames column, but as I don't know the internals of the function I don't really want to just assume that. Having the output of countBam be a GRanges instance seems to make more sense to me, but perhaps that would not be the case for other use cases. On 20/04/2010, at 11:18 AM, Vincent Carey wrote: > The leeBamViews package might serve as a moderately rich source of > use case details for clarifying concerns about current > infrastructure. I can't get a clear example of your "problem" in > which the order of countBam output does not agree with ranges > input. Specifically, with > > library(leeBamViews) > example(tabulateReads) # written before countBam existed > bs2 = bs1 > br2 = bamRanges(bs1)[c(2,1,3,4)] > bamRanges(bs2) = br2 > countBam(bs2)[[1]] > > the order of the output ranges agrees with that of br2 (AFAICT). It > is true that additional elementMetadata items that might be present > in the bamRanges component of the input (are you using that?) are > not propagated, and perhaps that should be remedied -- but it could > be a nuisance to propagate anything that is found there. > > In fact, I wonder whether the outputs of countBam should be a > data.frames, or GRanges instances? > > > On Mon, Apr 19, 2010 at 8:30 PM, Matthew Young <myo...@wehi.edu.au> > wrote: > Hi, > > I'm trying to use Rsamtools to bin reads into genes using the > countBam function. I have a GRanges object which defines the range > of the annotation and has additional information in the > "elementMetadata" field, for example like this: > > > aa > GRanges with 3 ranges and 2 elementMetadata values > seqnames ranges strand | tx_id tx_name > <Rle> <IRanges> <Rle> | <integer> <character> > [1] chr16 [18780546, 18811724] - | 43866 > ENSMUST00000115586 > [2] chr16 [18780546, 18812065] - | 43865 > ENSMUST00000000028 > [3] chr16 [18807449, 18812080] - | 43868 > ENSMUST00000115585 > > I want to use the countBam function to count the number of reads > that occur within each annotation range, which I do with the > following: > > > countBam("bowtie_aligned.prefix.bam",param=ScanBamParam(which=aa)) > space start end width file records > nucleotides > 1 chr16 18780546 18811724 31179 bowtie_aligned.prefix.bam > 16 800 > 2 chr16 18780546 18812065 31520 bowtie_aligned.prefix.bam > 17 850 > 3 chr16 18807449 18812080 4632 bowtie_aligned.prefix.bam > 2 100 > > My question is, is there a way to get the "elementMetadata" to come > along for the ride in the countBam output? So there'd be another > two columns in the countBam output, "tx_id" and "tx_name". The > problem is that the ranges do not always appear in the same order in > the GRanges input and the countBam output, so to recover this > information it is necessary to use match and pull information out of > the GRanges object, which while doable, is cumbersome. Is there > some built in way to do this that I'm overlooking? > > Thanks, > > Matt > > > ______________________________________________________________________ > The information in this email is confidential and inte...{{dropped:18}} _______________________________________________ Bioc-sig-sequencing mailing list Bioc-sig-sequencing@r-project.org https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing