> I think the quirkiest transformation is from a junk-filled base-space to
> colour-space:
> 
> ZZAC6ACGCXAAATAT55ACTCCAGCTCC..RCA..
> -> NNACNACGCNAAATATNNACTCCAGCTCCNNNCANN
> -> ACNACGCNAAATATNNACTCCAGCTCCNNNCA
> -> 1101331000333300122012322010011
> 

Perhaps these 'junk-filled' sequences should not be used at all,
as they are mostly worthless.
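
For reference, the three steps in the quoted example can be sketched roughly as below. This is only an illustration in Python, not the code in ColorSpaceCodec.cpp. The function names are hypothetical. Treating N as A (code 0) when computing a colour is an assumption inferred from the quoted output; it is not confirmed by the source.

```python
import re

# 2-bit codes for the four bases; the colour of two adjacent bases
# is the XOR of their codes (AA -> 0, AC -> 1, AG -> 2, AT -> 3, ...).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def clean_bases(raw: str) -> str:
    """Step 1: replace every character other than A, C, G, T with N."""
    return re.sub(r"[^ACGT]", "N", raw)


def trim_n(seq: str) -> str:
    """Step 2: drop leading and trailing N; internal N stay in place."""
    return seq.strip("N")


def to_colour_space(seq: str) -> str:
    """Step 3: emit one colour per pair of adjacent bases.

    Assumption (inferred from the quoted example, not from the Ray
    source): N has no 2-bit code of its own, so it is encoded as A (0).
    """
    codes = [BASE_CODE.get(base, 0) for base in seq]
    return "".join(str(a ^ b) for a, b in zip(codes, codes[1:]))


raw = "ZZAC6ACGCXAAATAT55ACTCCAGCTCC..RCA.."
cleaned = clean_bases(raw)
trimmed = trim_n(cleaned)
colours = to_colour_space(trimmed)
print(cleaned)
print(trimmed)
print(colours)
```

Under that assumption the sketch reproduces the three quoted lines. It also shows why these reads carry little information: every colour next to an N is computed as if the N were an A, so it says nothing reliable about the real base.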




> ________________________________________
> From: David Eccles (gringer) [david.ecc...@mpi-muenster.mpg.de]
> Sent: 28 June 2011 13:49
> To: Sébastien Boisvert
> Cc: denovoassembler-users@lists.sourceforge.net
> Subject: Re: combined colour space / base space encoding in Read structure
> 
> Sébastien Boisvert wrote:
> > First base should be recorded in structures/Read.
> > Also, symbols (from {0,1,2,3} or from {A,C,G,T}) are encoded in 2
> > bits in Ray so I think your changes will likely break support for
> > color-space (because of the N).
> 
> I've had a go at storing first base + colour space sequence in the Read
> structure. The short summary is that you can now read in one type, and
> pull out the other type (with or without double-encoding). It now trims
> off the first base (if extracting out colour-space sequence), because I
> didn't want to disrupt this elsewhere too much [yet].
> 
> I need to go a bit deeper and replicate this kind of format for Kmers
> for it to be really useful, but I was able to fit things in this far
> without changing too much other code.
> 
> My most interesting commit is probably this one:
> 
> https://github.com/gringer/ray/commit/95efe30674b6bea14bc68c90d8e65c261ecbe3ed
> 
> Although this one makes it actually usable (well, almost).
> 
> https://github.com/gringer/ray/commit/b41673e9f44b59354ce5352d3e718ef28e9dafa2
> 
> > Can you test on http://solidsoftwaretools.com/gf/project/ecoli50x50/
> > to see if it works or fails ?
> 
> I'll do that in the next day or so, but my most recent work has been
> trying to set up unit tests to make sure things are doing what I expect:
> 
> $ CODE=../code; g++ $CODE/format/ColorSpaceCodec.cpp \
> $CODE/structures/Read.cpp \
> $CODE/core/common_functions.cpp \
> $CODE/memory/malloc_types.cpp \
> $CODE/memory/allocator.cpp \
> $CODE/memory/MyAllocator.cpp \
> $CODE/memory/ReusableMemoryStore.cpp \
> $CODE/structures/Kmer.cpp \
> $CODE/cryptography/crypto.cpp \
> unit_tests.cpp -I$CODE -I..
> 
> $ ./a.out
> Checking ColorSpaceCodec:
> 1: checking colour-space decode (junk characters)... success!
> 2: checking colour-space decode (fully informative sequence)... success!
> 3: checking colour-space decode (inverse function actions)... success!
> 4: checking colour-space decode (reverse decode)... success!
> Checking Read:
> 1: checking colour-space encoding converted to double-encoded
> base-space... warning: useless double-encoding requested for base-space
> output... success!
> 2: checking colour-space encoding converted to colour-space... success!
> 3: checking colour-space encoding converted to double-encoded
> colour-space... success!
> 4: checking colour-space encoding with misreads converted to
> base-space... success!
> 5: checking colour-space encoding with misreads converted to
> colour-space... success!
> 6: checking base-space encoding converted to colour-space... success!
> 7: checking base-space encoding with misreads converted to
> colour-space... success!
> 7: checking base-space encoding with misreads converted to base-space...
> success!
> 
> 
> I think the quirkiest transformation is from a junk-filled base-space to
> colour-space:
> 
> ZZAC6ACGCXAAATAT55ACTCCAGCTCC..RCA..
> -> NNACNACGCNAAATATNNACTCCAGCTCCNNNCANN
> -> ACNACGCNAAATATNNACTCCAGCTCCNNNCA
> -> 1101331000333300122012322010011
> 
> Will keep you posted about my attempts at getting this working.
> 
> Cheers,
> David Eccles

                                                     Sébastien

