On 02.03.2014 01:49, Hornung, Bastian wrote:
> Why not just filter all the reads with a coverage of 2 out?
> If it's really due to SNPs, then this shouldn't have an effect on the 
> assembly.

The problem is that the starting half-peak should *never* be chosen, and the 
algorithm can easily be improved to exclude it.
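
As an illustration of what "excluding the starting half-peak" could look like, 
here is a minimal sketch. It is not the assembler's actual code. The function 
name and the input layout (a list where histogram[c] is the number of k-mers 
observed c times) are assumptions made for this example only. The idea is to 
walk down the initial descending slope to the first local minimum, and only 
then look for the maximum.

```python
def find_coverage_peak(histogram):
    """Return (valley, peak) coverage values for a k-mer coverage histogram.

    Assumption for this sketch: histogram[c] is the number of k-mers seen
    c times, and index 0 is unused.
    """
    if len(histogram) < 2:
        raise ValueError("histogram needs at least one coverage bin")

    # Walk down the initial descending slope (the starting half-peak,
    # typically dominated by sequencing errors and low-coverage k-mers).
    c = 1
    while c + 1 < len(histogram) and histogram[c + 1] < histogram[c]:
        c += 1
    valley = c  # first local minimum

    # Only coverage values at or beyond the valley are peak candidates.
    peak = max(range(valley, len(histogram)), key=lambda i: histogram[i])
    return valley, peak


# Made-up counts: a steep error slope at low coverage, a real peak at 8.
example = [0, 1000, 400, 150, 60, 40, 55, 90, 140, 120, 70, 20]
valley, peak = find_coverage_peak(example)
```

Real histograms are noisy, so a strict "still descending" test may stop too 
early; some smoothing would probably be needed before relying on this.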

Further, for this to work in this specific case I would need to filter out 
reads with coverage 2-8, and that is getting close to a coverage that would 
need to be included in the general case (one possibility would be 300bp MiSeq 
reads with 550bp fragments and 10X coverage), so the coverage limit would need 
to be set separately for each assembly. I don't like using custom (manual) 
solutions to solve a problem that could be better dealt with through automation.
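
To put a rough number on that, the usual approximation for expected k-mer 
coverage is C_k = C * (L - k + 1) / L, where C is the base coverage and L the 
read length. The sketch below assumes that the coverage values discussed here 
are k-mer coverage, and the function name is made up for the example.

```python
def kmer_coverage(base_coverage, read_length, k):
    """Approximate expected k-mer coverage: C * (L - k + 1) / L."""
    if k > read_length:
        raise ValueError("k must not exceed the read length")
    return base_coverage * (read_length - k + 1) / read_length


# 300bp reads at 10X base coverage, for the two k values in this thread.
cov_k31 = kmer_coverage(10, 300, 31)    # 10 * 270 / 300
cov_k125 = kmer_coverage(10, 300, 125)  # 10 * 176 / 300
```

Under those assumptions the expected k-mer coverage is about 9 at k=31 and 
just under 6 at k=125, so a fixed cut-off that removes coverage 2-8 would sit 
right next to, or on top of, the real peak for such a data set.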

> And a k-mer value of 31 seems to be very low for MiSeq data.
> I normally get good results for 125.

I will try that, thanks. I'm not used to 250bp data, and forgot that this was a 
bacterium, so I expected memory consumption to skyrocket when doing a de Bruijn 
assembly with such a high k-mer value. It doesn't seem so bad so far (~16GB or 
so after the read phase vs ~6GB with a k-mer value of 31).

 - D

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
