[julia-users] Performance improvements double digest problem

Paul Lange Fri, 26 Sep 2014 11:43:02 -0700

Hey,

I'm currently working on the double digest problem [1] for a homework 
assignment. I first wrote the program in Matlab, but that became too slow 
for larger data sets, so I switched to Julia hoping for some speedup. 
Indeed, a naive conversion gives roughly a 5x speedup: 0.734531 s in 
Matlab for 10k iterations, while @time in Julia reports the following 
after a few runs:
elapsed time: 0.159896259 seconds (27817864 bytes allocated, 20.24% gc time)


However, I hope to squeeze out some more performance and would appreciate 
hints on where to look. The roughly 27 MB of allocations seem like quite a 
lot, although I do create a small new vector in the Neighbor function on 
every iteration. I was also unable to run the profiler, since it keeps 
crashing Julia (Julia 0.3.1, Mac).


Code can be found here: https://gist.github.com/palango/04599a4d553068189d96

Some words about it:
We've got two enzymes, A and B, that can cut DNA. When A is applied to some 
DNA, it is cut into fragments whose lengths can be measured. The sorted 
lengths are stored in a, and likewise in b for enzyme B.
If the same DNA is digested first by A and then by B, we get more 
fragments. Their lengths are stored in c.
The problem is to find the order of the fragments in a and b that 
reproduces the fragments in c.
The program implements a simulated annealing approach. confa and confb 
store the permutations of a and b, which are used to calculate the c 
implied by the current permutations.
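
To make the setup concrete, here is a minimal sketch (not the code from the 
gist) of how the double-digest fragments could be computed from one pair of 
permutations. The function name double_digest and the example values are 
made up for illustration, and it assumes integer fragment lengths that sum 
to the same total in a and b.

```julia
# Fragment lengths of the double digest implied by one ordering of a and b.
# `a` and `b` hold the single-digest fragment lengths; `confa` and `confb`
# are index permutations giving the order of the fragments along the DNA.
# Hypothetical helper for illustration, not taken from the gist.
function double_digest(a, confa, b, confb)
    # Cut positions are the running sums of the ordered fragment lengths.
    cuts_a = cumsum(a[confa])
    cuts_b = cumsum(b[confb])
    # Merge both cut sets with the start position 0; unique drops positions
    # that appear twice (shared cut sites and the common end of the DNA).
    cuts = sort(unique(vcat(0, cuts_a, cuts_b)))
    # Distances between neighbouring cuts are the double-digest fragments.
    return sort(diff(cuts))
end

a = [2, 3, 5]
b = [4, 6]
c1 = double_digest(a, [1, 2, 3], b, [1, 2])
c2 = double_digest(a, [3, 1, 2], b, [1, 2])
```

In this toy example the first ordering gives [1, 2, 2, 5] and the second 
gives [1, 2, 3, 4], so only one of them can match a measured c. The 
annealing then searches over confa and confb for an ordering whose implied 
fragments agree with c.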

Thanks in advance,
Paul

[1] https://en.wikipedia.org/wiki/Restriction_map
