> I think the quirkiest transformation is from a junk-filled base-space to
> colour-space:
>
> ZZAC6ACGCXAAATAT55ACTCCAGCTCC..RCA..
> -> NNACNACGCNAAATATNNACTCCAGCTCCNNNCANN
> -> ACNACGCNAAATATNNACTCCAGCTCCNNNCA
> -> 1101331000333300122012322010011

Perhaps these 'junk-filled' sequences should not be used at all, as they are
mostly worthless.

> ________________________________________
> From: David Eccles (gringer) [david.ecc...@mpi-muenster.mpg.de]
> Sent: 28 June 2011 13:49
> To: Sébastien Boisvert
> Cc: denovoassembler-users@lists.sourceforge.net
> Subject: Re: combined colour space / base space encoding in Read structure
>
> Sébastien Boisvert wrote:
> > First base should be recorded in structures/Read.
> > Also, symbols (from {0,1,2,3} or from {A,C,G,T}) are encoded in 2
> > bits in Ray so I think your changes will likely break support for
> > color-space (because of the N).
>
> I've had a go at storing first base + colour space sequence in the Read
> structure. The short summary is that you can now read in one type, and
> pull out the other type (with or without double-encoding). It now trims
> off the first base (if extracting out colour-space sequence), because I
> didn't want to disrupt this elsewhere too much [yet].
>
> I need to go a bit deeper and replicate this kind of format for Kmers
> for it to be really useful, but I was able to fit things in this far
> without changing too much other code.
>
> My most interesting commit is probably this one:
>
> https://github.com/gringer/ray/commit/95efe30674b6bea14bc68c90d8e65c261ecbe3ed
>
> Although this one makes it actually usable (well, almost).
>
> https://github.com/gringer/ray/commit/b41673e9f44b59354ce5352d3e718ef28e9dafa2
>
> > Can you test on http://solidsoftwaretools.com/gf/project/ecoli50x50/
> > to see if it works or fails?
>
> I'll do that in the next day or so, but my most recent work has been
> trying to set up unit tests to make sure things are doing what I expect:
>
> $ CODE=../code; g++ $CODE/format/ColorSpaceCodec.cpp \
> $CODE/structures/Read.cpp \
> $CODE/core/common_functions.cpp \
> $CODE/memory/malloc_types.cpp \
> $CODE/memory/allocator.cpp \
> $CODE/memory/MyAllocator.cpp \
> $CODE/memory/ReusableMemoryStore.cpp \
> $CODE/structures/Kmer.cpp \
> $CODE/cryptography/crypto.cpp \
> unit_tests.cpp -I$CODE -I..
>
> $ ./a.out
> Checking ColorSpaceCodec:
> 1: checking colour-space decode (junk characters)... success!
> 2: checking colour-space decode (fully informative sequence)... success!
> 3: checking colour-space decode (inverse function actions)... success!
> 4: checking colour-space decode (reverse decode)... success!
> Checking Read:
> 1: checking colour-space encoding converted to double-encoded
> base-space... warning: useless double-encoding requested for base-space
> output... success!
> 2: checking colour-space encoding converted to colour-space... success!
> 3: checking colour-space encoding converted to double-encoded
> colour-space... success!
> 4: checking colour-space encoding with misreads converted to
> base-space... success!
> 5: checking colour-space encoding with misreads converted to
> colour-space... success!
> 6: checking base-space encoding converted to colour-space... success!
> 7: checking base-space encoding with misreads converted to
> colour-space... success!
> 7: checking base-space encoding with misreads converted to base-space...
> success!
>
>
> I think the quirkiest transformation is from a junk-filled base-space to
> colour-space:
>
> ZZAC6ACGCXAAATAT55ACTCCAGCTCC..RCA..
> -> NNACNACGCNAAATATNNACTCCAGCTCCNNNCANN
> -> ACNACGCNAAATATNNACTCCAGCTCCNNNCA
> -> 1101331000333300122012322010011
>
> Will keep you posted about my attempts at getting this working.
>
> Cheers,
> David Eccles

Sébastien

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users