Hi Sebastien,

Is the plugin ready? Is it integrated into the Ray assembly, or is it run on .fasta contigs after the assembly is done? I noticed the files for it have been uploaded.

Adrian

On 3/18/2013 5:15 PM, Sébastien Boisvert wrote:
> Hi,
>
> On 18/03/13 12:20 PM, Adrian Pelin wrote:
>> Hello,
>>
>> It seems like the answer I got from the Velvet mailing list for this issue
>> is that there is no solution.
>> Is there a strategy I could use with Ray to avoid the following
>> issue?
>>
>> My organism seems to be full of SNPs in a perfect 50/50 ratio, which is
>> probably due to it being diploid. My experience with assembling Velvet
>> data is that it generates multiple contigs with very high nucleotide
>> identity between some contigs. The only differences are SNPs.
>>
>> I was wondering, is there any way to assemble only the haploid genome
>> for a start? I am afraid of overestimating the haploid genome size. Also,
>> Velvet doesn't generate identical contigs for each piece of sequence;
>> just in some cases there are giant contigs overlapping over a few kb.
>>
>> Any strategy to avoid this or remove these from the assembly? My data is
>> MiSeq fragments (300 bp) and a HiSeq mate-pair jumping library (3 kb).
>>
> I happen to be working on exactly this problem in Ray today
> (I have been working on that for a few weeks now).
>
> See these two tickets:
>
> * https://github.com/sebhtml/ray/issues/136
>
> * https://github.com/sebhtml/ray/issues/153
>
> The thing is that in a de Bruijn graph (such as the one in Velvet or Ray), a
> variation of one nucleotide leads to alternate branches containing k vertices.
>
> A typical SNP in a de Bruijn graph (in Ray Cloud Browser):
>
> => http://genome.ulaval.ca:10111/client/?map=0&section=0&region=1&location=132207&zoom=1.191270483217418
>
> From an algorithm point of view, if you use a large k-mer length,
> assemblers will spawn contigs for each allele because each branch will be
> "good enough".
>
> Therefore, some of these assembly seeds need to be filtered out. As far as I
> know, all de Bruijn assemblers have this problem right now with large k-mers.
>
> The two issues above should be fixed this week by this new plugin in Ray:
>
> => https://github.com/sebhtml/ray/tree/master/code/SpuriousSeedAnnihilator
>
> As its name suggests, SpuriousSeedAnnihilator will annihilate spurious seeds
> that would otherwise lead to duplicated genetic regions.
>
> -Séb
>
>> Adrian
>>
>> ------------------------------------------------------------------------------
>> Everyone hates slow websites. So do we.
>> Make your web apps faster with AppDynamics
>> Download AppDynamics Lite for free today:
>> http://p.sf.net/sfu/appdyn_d2d_mar
>> _______________________________________________
>> Denovoassembler-users mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
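
For anyone following the thread, the point quoted above (one nucleotide of variation gives alternate branches of k vertices) can be checked on a toy example. This is only a rough sketch and not code from Ray or Velvet: the sequences are made up, the helper names `kmers` and `allele_specific_vertices` are hypothetical, and it assumes the SNP sits at least k-1 bases from either end in a region without repeats.

```python
def kmers(seq, k):
    """Return the set of k-mers in seq (the vertices of a de Bruijn graph)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def allele_specific_vertices(allele_a, allele_b, k):
    """Return the k-mers found only in allele A and only in allele B."""
    a, b = kmers(allele_a, k), kmers(allele_b, k)
    return a - b, b - a


# Hypothetical toy alleles: identical except for one SNP (C -> T) at index 15.
allele_a = "ACGTTGCAAGGCTTACCGATTGACCATGGTCA"
allele_b = allele_a[:15] + "T" + allele_a[16:]

k = 5
only_a, only_b = allele_specific_vertices(allele_a, allele_b, k)

# Each allele contributes its own branch of k vertices (one per k-mer
# window that overlaps the SNP); all other vertices are shared.
print(len(only_a), sorted(only_a))
print(len(only_b), sorted(only_b))
```

Both branches rejoin the shared path on either side of the SNP, so with real coverage split 50/50 each branch looks equally well supported, which is why an assembler can end up seeding a contig from each one.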
