On 12/07/11 16:49, David Eccles (gringer) wrote:
> However, the code I have doesn't seem to have any success in finding seeds.

I added a bit of debugging output, and came up with this:

Rank 0 is selecting optimal read markers [1/1000]
inserting sequence (0): T0312332212212032313211021010112303320130220032131
inserting sequence (1): T0230102032123100220032121302331231100332020012031
inserting sequence (2): T2102202120200310123200323013323110122001100012320
inserting sequence (3): C1312102002001322010320202013210121223010123122033
inserting sequence (4): A2121223331021020132132300201223202232300220310201
...
Rank 0 is creating seeds [1/10020]
Vertex 1-1 test (4): testing C21012121330003101031
Vertex 1-1 test (5): testing C13010130003312121012
Vertex 1-1 test (6): testing A33102002200001000130
Vertex 1-1 test (7): testing T03100010000220020133
Vertex 1-1 test (8): testing G23300202002023312231
Vertex 1-1 test (9): testing A13221332020020200332
...
Vertex 1-1 test (8168): testing T31233221221203231321
...
Continuing (8168): vertex has 1 in, 1 out: T31233221221203231321
Vertex 1-1 test (8168): testing T03123322122120323132
...
Finished (8168): parent has 1 in, 1 out: T03123322122120323132
worker 8168 done: found a seed with 20 nucleotides

[workerIds are in parentheses]

So this seed (T31233221221203231321) is rejected because there's a 
parent vertex (T03123322122120323132) that is closer to the start of the 
sequence. Except that the parent vertex is never processed in its own 
right: no worker starts from it anywhere in my output. Note that the 
first workerId is 4, not 0. I have a suspicion that the wonderful 
"ultimate first seed" is something that should be managed by worker 0, 
who decided to take a holiday.

This is probably not a problem for imperfect reads, because you'd 
expect breaks in the chain somewhere (so at least *some* sequence would 
map properly, even if the first workers weren't processing). I think the 
problem here is that all vertices are linked to this first one.
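
To make the suspicion concrete, here is a toy model of it in Python. It 
is only a sketch of my reading of the log, not the assembler's actual 
code: find_seeds, chain_heads and first_worker are made-up names, and 
the rule "a worker defers whenever its vertex has a 1-in/1-out parent" 
is an assumption.

```python
def find_seeds(chain_heads, n_vertices, first_worker=0):
    """Toy model: vertex i's parent is i-1 unless i is a chain head.

    Worker i starts at vertex i. It reports a seed only if that vertex
    has no parent; otherwise it defers to the worker that owns the
    parent. All names here are hypothetical.
    """
    heads = set(chain_heads)
    seeds = []
    for worker_id in range(first_worker, n_vertices):
        if worker_id in heads:
            # No parent: this worker owns the seed and walks it forward
            # until the next chain head or the end of the vertices.
            length = 1
            nxt = worker_id + 1
            while nxt < n_vertices and nxt not in heads:
                length += 1
                nxt += 1
            seeds.append((worker_id, length))
        # Otherwise the vertex has a parent, so this worker defers.
    return seeds


# One unbroken chain, workers start at 0: the head is found.
print(find_seeds([0], 30, first_worker=0))
# Same chain, workers start at 4: every worker defers, nothing is found.
print(find_seeds([0], 30, first_worker=4))
# A break at vertex 10 (imperfect reads) still yields a seed there.
print(find_seeds([0, 10], 30, first_worker=4))
```

If the model is anywhere near right, skipping the first few workers only 
loses everything when the whole graph hangs off a single head vertex, 
which is what perfect reads would produce.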

I still need to hunt more, but I think I'm getting close to the problem.

--- David


_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
