Hi Martin and Robert

If you look at arrayQualityMetrics, consider the function arrayQualityMetrics() as a template that a user can take and modify, to add or remove parts of the report, or to insert their own additional data preprocessing steps.

        *       *       *       

Btw, right now the reports only look good in Firefox 4; they are not perfect in Chrome 10 and do not work at all in Safari (and I don't know about IE yet).
If anyone has an insight on how to do the moral equivalent of

 ssrules = document.styleSheets[0].cssRules;
 ssrules[i].style.cssText = "stroke-width:3";

in Chrome, I'd be delighted and grateful.
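
One possible direction, offered as an untested sketch and not as a confirmed fix for Chrome 10: set the single property through the standard CSSOM method style.setProperty() instead of assigning cssText, and guard against cssRules being unavailable (some browsers refuse access to the rules of a stylesheet loaded from another origin). The helper name setRuleProperty and its arguments are made up for illustration.

```javascript
// Hypothetical helper: set one CSS property on rule number `index` of a stylesheet.
// `sheet` is a CSSStyleSheet, e.g. document.styleSheets[0].
// Returns true if the property was set, false if the rule could not be reached.
function setRuleProperty(sheet, index, name, value) {
  var rules;
  try {
    // `rules` is the older, non-standard alias of `cssRules`.
    rules = sheet.cssRules || sheet.rules;
  } catch (e) {
    // Reading cssRules can throw for stylesheets from another origin.
    return false;
  }
  if (!rules || index < 0 || index >= rules.length) {
    return false;
  }
  var rule = rules[index];
  if (!rule.style) {
    return false; // not a style rule (e.g. an @import or @media rule)
  }
  // setProperty changes only this one property; assigning cssText
  // replaces the whole declaration block of the rule.
  rule.style.setProperty(name, value, "");
  return true;
}

// In a browser: setRuleProperty(document.styleSheets[0], i, "stroke-width", "3");
```

A false return value would at least tell you whether the problem is access to the rules or the assignment itself.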

        Best wishes
        Wolfgang


On Mar/25/11 5:41 PM, Robert Gentleman wrote:
On Fri, Mar 25, 2011 at 8:59 AM, Martin Morgan<[email protected]>  wrote:
On 03/24/2011 10:56 AM, Michael Lawrence wrote:

Hi Martin,

It would be nice if the ShortRead QA report could somehow filter out the
adapter contamination before generating the rest of its plots, since those
plots are pretty meaningless if there are adapters present.

Not sure how to handle this filtering in general. That is, what if someone
then wants to see plots with only the "high quality" reads after the quality
plots? It gets complicated. ShortRead has a nice filtering mechanism, but
this is more complicated, since some QA plots come from one filter, while
others come from a different stage.

However, under the assumption that no one would ever want to align an
adapter, i.e., those reads will not be carried forward, the adapter removal
could just be special-cased and hard-coded. More customized solutions could
then leverage the internal ShortRead functions for generating each slot in
the QA object, building it up incrementally on different subsets. Of course,
to make sense, that would require a different report template, too.

Hi Michael -- Yes it would be nice to be able to more flexibly control how
different components of the report are generated, or at least to make some
smarter choices along the lines you suggest for adapter contaminants. It's
hard to know how to make this really general, but I have come across other
situations where I'd like to cherry-pick which parts of the QA process I
want to perform. I think I need some standardization on function signatures
for generating each report section, tighter description of results from each
section (i.e., a formal class hierarchy), and then a flexible report
composition. It seems like quite a big task; I wonder if there are good
models out there to follow? arrayQualityMetrics?
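
The three ingredients listed above (a common signature for section generators, a described result type, and flexible composition) could be sketched roughly as follows. This is an illustration only: it is written in JavaScript to match the snippet elsewhere in this thread, whereas ShortRead and arrayQualityMetrics are R packages, and every name in it (makeSection, composeReport, the section functions, the shape of `reads` and `options`) is hypothetical and not part of either package.

```javascript
// Hypothetical result type: every section generator returns the same shape.
function makeSection(title, body) {
  return { title: title, body: body };
}

// Hypothetical section generators sharing one signature:
// (reads, options) -> section.
function readCountSection(reads, options) {
  return makeSection("Read counts", "Total reads: " + reads.length);
}

function adapterSection(reads, options) {
  var adapter = options.adapter;
  var hits = reads.filter(function (r) { return r.indexOf(adapter) !== -1; });
  return makeSection("Adapter contamination",
                     "Reads containing adapter: " + hits.length);
}

// Composition: the caller picks which sections to run, and in what order.
function composeReport(reads, options, sections) {
  return sections.map(function (section) { return section(reads, options); });
}

// Toy input: reads as plain strings, one assumed adapter sequence.
var reads = ["ACGTAAAA", "TTTTGGGG", "CCACGTCC"];
var report = composeReport(reads, { adapter: "ACGT" },
                           [readCountSection, adapterSection]);
```

Because every generator has the same signature and return shape, cherry-picking parts of the QA process amounts to changing the list passed to composeReport.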

   I think arrayQualityMetrics is a good starting place.  Audrey and Wolfgang
have done a good job of modularizing the components.  But there are still
hiccups, which suggests just how hard that is.  And as you suggested, it was
a big job.

   I think the case Michael is bringing up might be useful to deal with, without
a major rewrite.  There should be some sort of file that ShortRead has access to
(or an input parameter) that gives some more details on the samples and on the
processing (e.g. what the sample labels should be, and what the adapters etc. are).
Then this information could be used in the current paradigm.

Mostly the issue is that if you have adapter contamination then the
subsequent plots (e.g. nucleotide by cycle) are not useful.  You cannot see
anything in them, and then you have to go back and strip adapters by hand,
then rerun ShortRead.  I agree that you may want more general filtering, as
an abundance of any read will affect the plots, but I think there is
agreement that one would never want to include the adapters (you do want
counts as are produced now, but given their effect on the graphics,
filtering would be beneficial).

   best wishes
     Robert

Martin


Michael


_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793






--


Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
