________________________________________
> From: David Eccles (gringer) [david.ecc...@mpi-muenster.mpg.de]
> Sent: 1 July 2011 10:59
> To: denovoassembler-users@lists.sourceforge.net
> Subject: [Denovoassembler-users] purpose of _getOutgoingEdges / _getIngoingEdges
>
> I'm trying to work through this big Kmer colour-space change, and I'm
> getting a little stuck because there's Kmer manipulation code all over
> the place. While I was running through the unit tests to try to weed out
> code that touches Kmers (to be shifted into the Kmer class), I came
> across the _getOutgoingEdges / _getIngoingEdges functions, and can't
> work out what they are trying to do.
>

Routines for k-mers are in core/common_functions.h and structures/Kmer.h. I would not call that 'all over the place' given the ~20k lines of Ray.

For any pair of reverse-complement k-mers, only the lower one is stored. Two uint8_t values store their arcs, one uint8_t for each k-mer. I use the same encoding as ABySS for that: 1 bit per arc.

> The comments for these functions don't really mention anything about the
> purpose of the function, which makes it difficult for me to work out how
> to implement it for the modified Kmer structure:
>

I just added some comments for these functions.

> [_getOutgoingEdges]
> /*
>  * abcd efgh
>  * 00ab 00ef
>  * 00ab cdef
>  */
>
> [_getIngoingEdges]
>
> //      1          0
> //
> //   127..64     63...0
> //
> // 00abcdefgh ijklmnopqr // initial state
> // abcdefgh00 klmnopqr00 // shift left
> // abcdefghij klmnopqr00 // copy the last to the first
> // 00cdefghij klmnopqr00 // reset the 2 last
>
> I get that it's moving sequence around, and that 'ab' describes a 2-bit
> location, rather than 2 base-pairs, but that's about the limit of my
> guesses. There's a mention of 'changing the hash value of the Kmer',
> which I presume happens when you have un-used bits in the Kmer (e.g.
> with kmer length 20).
>
> Can anyone offer any other insights to this?
>
> Oh, and BTW, there may be range check problems in the current code
> (1.6.0+) with Kmer sizes greater than 100. The kmerAtPosition function
> uses char sequence[100] together with memcpy, then overwrites
> sequence[wordSize] with a null character (even if wordSize > 100). I'm
> not sure what the effect of this is, but it probably shouldn't be doing
> that.
>

Replaced 100 with MAXKMERLENGTH.

> -- David
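
To illustrate the idea behind the two functions, here is a minimal sketch of how a 1-bit-per-arc uint8_t could be expanded into neighbouring k-mers. It is not the code in Ray: it assumes a single 64-bit word (k <= 32), 2-bit codes A=0, C=1, G=2, T=3, the first nucleotide in the lowest bits, ingoing arcs in bits 0..3 and outgoing arcs in bits 4..7. The names encodeKmer, decodeKmer, outgoingKmers and ingoingKmers are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed 2-bit nucleotide codes: A=0, C=1, G=2, T=3.
static const char kNucleotides[] = "ACGT";

// Pack a k-mer (k <= 32) into one word, first nucleotide in the lowest 2 bits.
uint64_t encodeKmer(const std::string& s) {
    assert(!s.empty() && s.size() <= 32);
    uint64_t word = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        uint64_t code = 0;
        switch (s[i]) {
            case 'A': code = 0; break;
            case 'C': code = 1; break;
            case 'G': code = 2; break;
            case 'T': code = 3; break;
            default: assert(false && "unexpected nucleotide");
        }
        word |= code << (2 * i);
    }
    return word;
}

// Unpack the k lowest 2-bit groups back into text.
std::string decodeKmer(uint64_t word, int k) {
    std::string out;
    for (int i = 0; i < k; ++i) {
        out += kNucleotides[(word >> (2 * i)) & 3];
    }
    return out;
}

// Outgoing neighbours: drop the first nucleotide (shift right by 2 bits)
// and place the new nucleotide in the last position.
// Assumption: bit (4 + code) of `edges` marks an outgoing arc.
std::vector<uint64_t> outgoingKmers(uint64_t kmer, uint8_t edges, int k) {
    assert(k >= 1 && k <= 32);
    std::vector<uint64_t> out;
    for (int code = 0; code < 4; ++code) {
        if ((edges >> (4 + code)) & 1) {
            uint64_t last = static_cast<uint64_t>(code) << (2 * (k - 1));
            out.push_back((kmer >> 2) | last);
        }
    }
    return out;
}

// Ingoing neighbours: shift left by 2 bits, put the new nucleotide in the
// first position, then clear the bits that fell beyond position k - 1.
// Assumption: bit `code` of `edges` marks an ingoing arc.
std::vector<uint64_t> ingoingKmers(uint64_t kmer, uint8_t edges, int k) {
    assert(k >= 1 && k <= 32);
    const uint64_t mask = (k == 32) ? ~0ULL : ((1ULL << (2 * k)) - 1);
    std::vector<uint64_t> in;
    for (int code = 0; code < 4; ++code) {
        if ((edges >> code) & 1) {
            in.push_back(((kmer << 2) | static_cast<uint64_t>(code)) & mask);
        }
    }
    return in;
}
```

With these assumptions, the k-mer ACGT (k = 4) with the outgoing bits for C and T set gives CGTC and CGTT, and with the ingoing bit for G set gives GACG. The final masking step in ingoingKmers plays the role of the "reset the 2 last" line in the comment: without it, stale bits above the k-mer would change its value, and therefore its hash.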
_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users