On Wed, Mar 2, 2011 at 9:58 AM, Martin Morgan <[email protected]> wrote:
> On 03/01/2011 04:44 AM, Michael Lawrence wrote:
>> Hi guys,
>>
>> What are the plans for the BamViews class. It looks like a useful
>> foundation. One thing that would be good to have in R is a way to calculate
>> "pileups" or base tallies for positions of interest. These counts could be
>> broken down by sample (bamfile), cycle (position in the read), etc. Results
>> returned as a DataFrame (in a format like that returned by as.data.frame on
>> a table) that could be aggregated() up as desired. Rles would save memory.
>> So there could be something like a alphabetFrequency() method for BamViews.
>> This is related to Steve's recent work with counting over XStringSets.
>
> Hi Michael -- BamViews is definitely open for more development. The
> methods currently implemented (minimal!) basically dispatch to
> single-bam variants. And I guess there is no single-bam variant of what
> you're looking for.
>
> Another possibility is to expose more of samtools, e.g., pileup /
> mpileup, which might be returned more or less directly for manipulation
> in R, or summarized. I'll work on this in the 3 week time frame (sorry)

Exposing pileup/mpileup is what occurred to me as well. I hope it is not
premature to raise a concern about the downstream container for their
output. We have a pileup-output parser that delivers a GRanges, and that
is probably adequate, although decoding the pileup string might be a
useful addition. mpileup delivers VCF/BCF, and while we can scan these
files, some of the structures returned can only be interpreted by
checking the file specification. It would be good to have some downstream
data modeling, based on use cases, that the mpileup interface could
target. Such developments could be important for the ISMB tutorial, so I
will be thinking more about this in the coming weeks.

>
> Maybe Herve will weigh in on Steve's XStringSet sliding window
> letterFrequencyAt
>
> Martin
>
>>
>> Surely there are many other features that could be added. The above is just
>> one that I would use often, across a number of contexts.
>>
>> Thanks,
>> Michael
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> [email protected]
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>
> --
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>
> Location: M1-B861
> Telephone: 206 667-2793
