Thanks Martin. The reason I don't want to use (qa/report) functions here is because I want to display avg base quality scores from two different runs on the same graph for the sake of easy visual comparison. In case this is possible with (qa/report) please correct me.
Also you mentioned about quality score conversion, could you please elaborate on what the following function is doing mathematically. I understand this is a bit off track question. FastqQuality(quality(quality(rfq))) Thanks, -Abhi -----Original Message----- From: Martin Morgan [mailto:[email protected]] Sent: Monday, August 03, 2009 4:41 PM To: Pratap, Abhishek Cc: Johannes Waage; [email protected] Subject: Re: [Bioc-sig-seq] Calculating/plotting Avg Base Quality Pratap, Abhishek wrote: > Thanks for a quick revert. Kind of an expected question, when you talk > about quality format, by default is it the Phred one ? I will be > importing the FASTQ from Illumina Eland output which is 0-40 scoring > window. Does that gets automatically converted ? fastq files don't contain information on how qualities were encoded; readFastq isn't smart about this and assumes they're Solexa. If that's not correct, you can change to phred with FastqQuality(quality(quality(rfq))) and vice-versa. Also recent Solexa encodings have changed their calculation (not their encoding) to use use something closer to phred rather than log-odds. This has consequence for low quality scores. You might also look at qa() and report() in the ShortRead package, to generate a quality report that contains this information. The idea is to do this in two stages qastats <- qa(dirPath, pattern) # slow! report(qastats) # fast see ?qa and ?report. Martin > > > > Thanks, > > -Abhi > > > > From: [email protected] [mailto:[email protected]] On > Behalf Of Johannes Waage > Sent: Monday, August 03, 2009 4:04 PM > To: Pratap, Abhishek; [email protected] > Subject: Re: [Bioc-sig-seq] Calculating/plotting Avg Base Quality > > > > Try this: > > library(ShortRead) > my_reads<- readFastq("~/", pattern="my_reads.fastq") > qualities <- FastqQuality(quality(my_reads)) > qualities <- as(qualities, "matrix") > boxplot(as.data.frame((qualities )), outline = F, xlab="Cycle", > ylab="Quality") > > Depending on quality format, you might have to correct your numeric > values in the matrix, or even convert to base call probabilities. > > Regards, > Johannes Waage, > Uni of copenhagen > > On Mon, Aug 3, 2009 at 9:41 PM, Pratap, Abhishek > <[email protected]> wrote: > > Hi All > > > > Just wondering if there is a R function in any package to do the > following. Want to make sure before I write something. > > > > We would like to calculate avg base quality score for each base called > per cycle / lane. What will be nice is to also plot these avg quality > scores/lane for couple of different runs. > > > > Thanks, > > -Abhi > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing > > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing _______________________________________________ Bioc-sig-sequencing mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
