Is there any chance of implementing digital normalisation natively in Ray? I'm currently using Trinity's paired-end normalisation procedure, but I expect it is much slower than what a native version in Ray could achieve:
http://trinityrnaseq.sourceforge.net/trinity_insilico_normalization.html
http://ivory.idyll.org/blog/what-is-diginorm.html

The general process is to look at the distribution of k-mer counts in each read to estimate its coverage, then decide whether to keep or discard the read based on that count distribution together with a target coverage. If assemblies work better at 100X coverage than at 500X coverage (or there is no substantial benefit from higher coverage), then it makes sense to discard a read when you can be pretty confident that you've already reached 100X coverage for all (or most) k-mers in that read.

- David
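To make the process described above a bit more concrete, here is a rough Python sketch of the idea, not anything taken from Ray or Trinity. It assumes the median k-mer count is used as the per-read coverage estimate (as in the diginorm post linked above), uses an exact in-memory counter rather than a memory-efficient probabilistic one, and ignores reverse complements and read pairing. The names `kmers` and `normalise` and the default values of `k` and `target_coverage` are made up for illustration.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of seq (empty list if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalise(reads, k=25, target_coverage=100):
    """Keep a read only while the median count of its k-mers is below target_coverage.

    Assumption: the median of the k-mer counts seen so far stands in for the
    read's coverage. Counts are only updated for reads that are kept.
    """
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # too short to estimate coverage; dropped in this sketch
        if median(counts[km] for km in read_kmers) < target_coverage:
            kept.append(read)
            counts.update(read_kmers)
    return kept


if __name__ == "__main__":
    # Toy input: one highly redundant read plus one read seen only once.
    reads = ["ACGTACGTAC"] * 10 + ["TTTTGGGGCC"]
    for read in normalise(reads, k=4, target_coverage=3):
        print(read)
```

The point of the single pass is that redundant reads stop being kept once their k-mers are already well covered, while reads from low-coverage regions always get through. A native implementation could presumably reuse the k-mer counts the assembler already computes, but that is a guess on my part.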