________________________________________
> From: David Eccles (gringer) [david.ecc...@mpi-muenster.mpg.de]
> Sent: 1 July 2011 11:20
> To: denovoassembler-users@lists.sourceforge.net
> Subject: Re: [Denovoassembler-users] purpose of _getOutgoingEdges /
> _getIngoingEdges
>
> On 01/07/11 16:59, David Eccles (gringer) wrote:
>> I get that it's moving sequence around, and that 'ab' describes a 2-bit
>> location, rather than 2 base-pairs, but that's about the limit of my
>> guesses.
>
> Possibly some more insights from the Vertex code?
>
> /*
>  * the vertex is important in the algorithm
>  * a DNA sequence is simply an ordered array of vertices. Two consecutive
>  * vertices always respect the de Bruijn property.
>  * a Vertex actually stores two k-mers: only the lower is stored.
>  * This halves the memory usage.
>  */
>
> ...
>
> uint8_t m_edges_lower;
> uint8_t m_edges_higher;
>
> So, are the edges 8 bits (i.e. 4 bases) from each joining Kmer?
>
Yes, this idea is described in the ABySS paper.
1 bit per possible arc -- 4 in and 4 out.

> I've noticed that these functions only seem to be used / called in
> SeedWorker:
>
>
> $ grep -rl _getIngoingEdges *
> assembler/SeedWorker.cpp
> core/common_functions.h
> core/common_functions.cpp
> structures/Vertex.cpp
>
> So I presume it has something to do with finding / extending seeds???
>

Pretty much.

> -- David
>

Sébastien
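
P.S. As a rough illustration of the "1 bit per possible arc -- 4 in and 4 out" encoding discussed above, here is a minimal sketch. It is not the actual _getOutgoingEdges / _getIngoingEdges code: the function names, the use of std::string for k-mers, and the bit layout (bits 0-3 for ingoing arcs, bits 4-7 for outgoing arcs, with A=0, C=1, G=2, T=3) are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Assumed base numbering for this sketch: A=0, C=1, G=2, T=3.
static const char BASES[4] = {'A', 'C', 'G', 'T'};

// Map a base to its 2-bit code (hypothetical helper).
int baseCode(char base) {
    switch (base) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default: throw std::invalid_argument("not a DNA base");
    }
}

// Assumed layout: bits 0-3 flag the 4 possible ingoing arcs.
void addIngoingEdge(uint8_t &edges, char base) {
    edges |= static_cast<uint8_t>(1u << baseCode(base));
}

// Assumed layout: bits 4-7 flag the 4 possible outgoing arcs.
void addOutgoingEdge(uint8_t &edges, char base) {
    edges |= static_cast<uint8_t>(1u << (4 + baseCode(base)));
}

// An outgoing neighbour drops the first base and appends the flagged base.
std::vector<std::string> outgoingNeighbors(const std::string &kmer, uint8_t edges) {
    assert(!kmer.empty());
    std::vector<std::string> out;
    for (int i = 0; i < 4; ++i) {
        if (edges & (1u << (4 + i))) {
            out.push_back(kmer.substr(1) + BASES[i]);
        }
    }
    return out;
}

// An ingoing neighbour prepends the flagged base and drops the last base.
std::vector<std::string> ingoingNeighbors(const std::string &kmer, uint8_t edges) {
    assert(!kmer.empty());
    std::vector<std::string> in;
    for (int i = 0; i < 4; ++i) {
        if (edges & (1u << i)) {
            in.push_back(BASES[i] + kmer.substr(0, kmer.size() - 1));
        }
    }
    return in;
}
```

With this layout, one uint8_t is enough to reconstruct every neighbour of a k-mer from the k-mer itself, because consecutive vertices overlap by k-1 bases and only the single differing base needs to be flagged. A vertex that holds a k-mer and its reverse complement would then need two such bytes, which matches the two uint8_t members quoted above.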