On Wed, Nov 9, 2011 at 12:03 PM, Bob Harris <rshar...@bx.psu.edu> wrote:
> David, in my experience with Illumina sequencing, it looks like the reads at
> the start of a file have a much higher sequencing error rate.
> Bob H

Yes, reads at the start and the end of the file come from the edge of the Illumina slide, and tend to be of poorer quality than the reads from the middle. So depending on the purpose in mind, picking 5 million reads from the middle of the file might be fine (and much easier computationally than drawing a random sample).

Peter
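
For anyone who wants to try the "middle of the file" approach outside Galaxy, here is a minimal Python sketch. It is an illustration only, not a Galaxy tool: the function name middle_reads and the file names are made up, and it assumes an uncompressed FASTQ file with exactly four lines per read (no wrapped sequence or quality lines).

```python
from itertools import islice


def middle_reads(in_path, out_path, n_reads):
    """Copy n_reads consecutive FASTQ records from the middle of in_path.

    Assumes plain-text FASTQ with exactly 4 lines per record.
    Returns the number of records actually written.
    """
    # First pass: count the records so we know where the middle is.
    with open(in_path) as handle:
        total = sum(1 for _ in handle) // 4

    # Never ask for more reads than the file holds.
    n_reads = min(n_reads, total)
    # Number of records to skip so the block is centred in the file.
    skip = (total - n_reads) // 2

    # Second pass: stream the centred block of lines to the output file.
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(islice(src, skip * 4, (skip + n_reads) * 4))

    return n_reads
```

A call such as middle_reads("reads.fastq", "middle_5M.fastq", 5000000) would then write the central 5 million reads. The file is read twice (once to count, once to copy), but only one line is held in memory at a time, so it should cope with large files.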