On 02.03.2014 01:49, Hornung, Bastian wrote:
> Why not just filter all the reads with a coverage of 2 out?
> If it's really due to SNPs, then this shouldn't have an effect on the
> assembly.
The problem is that the starting half-peak should *never* be chosen, and
the algorithm can easily be improved to exclude it. Further, in this
specific case I would need to filter out reads with coverage 2-8 for
that to work, which is getting close to a coverage that would need to be
included in the general case (one possibility would be 300bp MiSeq reads
with 550bp fragments and 10X coverage), so the coverage limit would need
to be set separately for each assembly. I don't like using custom
(manual) solutions for a problem that could be better dealt with through
automation.

> And a k-mer value of 31 seems to be very low for MiSeq data.
> I normally get good results for 125.

I will try that, thanks. I'm not used to 250bp data, and forgot that
this was a bacterium, so I expected memory consumption to skyrocket when
doing a de Bruijn assembly with such a high k-mer value. It doesn't seem
so bad so far (~16GB or so after the read phase vs ~6GB with kmer=31).

- D

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users