Hi Ivan,

You are skipping over one part of the pipeline: using the ShortRead package to read in your data and perform some sort of QA. The output will be the aligned reads. But you should take the diagnostics seriously; we find lots of problems that need to be caught early so that the downstream analyses are reasonable.
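To make "reduced to a set of alignment start positions (including orientation)" a bit more concrete, here is a rough sketch in plain Python. It is not the ShortRead API and not necessarily how the workflow data were produced; the `AlignedRead` record, the `reduce_reads` helper, the quality field and the cutoff of 30 are all assumptions made up for the illustration.

```python
from collections import Counter
from typing import NamedTuple


class AlignedRead(NamedTuple):
    """Hypothetical record for one aligned read (not a ShortRead class)."""
    chrom: str    # chromosome the read aligned to
    start: int    # alignment start position
    strand: str   # orientation, "+" or "-"
    quality: int  # made-up alignment quality score


def reduce_reads(reads, min_quality=30, drop_duplicates=True):
    """Reduce reads to (chrom, start, strand) tuples.

    Reads below min_quality are dropped first. Two reads count as
    duplicates when chromosome, start and strand all match.
    Returns the sorted positions and the per-position read counts.
    """
    kept = [(r.chrom, r.start, r.strand)
            for r in reads if r.quality >= min_quality]
    counts = Counter(kept)
    positions = sorted(counts) if drop_duplicates else sorted(kept)
    return positions, counts


reads = [
    AlignedRead("chr1", 100, "+", 40),
    AlignedRead("chr1", 100, "+", 38),  # same start and strand: duplicate
    AlignedRead("chr1", 100, "-", 35),  # opposite strand: not a duplicate
    AlignedRead("chr1", 250, "+", 10),  # fails the quality cutoff
    AlignedRead("chr2", 75, "-", 33),
]

positions, counts = reduce_reads(reads)
for chrom, start, strand in positions:
    print(chrom, start, strand, "reads:", counts[(chrom, start, strand)])
```

The per-position counts are returned alongside the reduced positions so that you can see how heavily any one position is duplicated before deciding whether to discard the extra reads.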
As for how one justifies discarding duplicate reads, why not ask it the other way around? How does one justify keeping them? In either case, one thing to do is to try to decide whether those duplicate reads represent biological replicates (i.e. the same piece of DNA was selected twice) or are more likely to represent PCR artifacts. If the former, I would keep them; if the latter, I would discard them. For the example given, it is the latter.

best wishes
Robert

[email protected] wrote:
> Hello,
>
> In preparation to analyse my own ChIP-seq data, I am trying to follow the
> steps described in this sample workflow:
>
> http://www.bioconductor.org/workshops/2008/SeattleNov08/ChIP-seq/workflow.pdf
>
> The document starts by loading data that has been "reduced to a set of
> alignment start positions (including orientation)".
>
> Can somebody elaborate on that a little bit or, ideally, show it with one
> example?
>
> Also, as part of the reduction, the procedure "removed all duplicate reads
> and applied a quality score cutoff". The score cutoff is fine but how is
> removing duplicates justified?
>
> Thank you,
>
> Ivan
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
[email protected]
