[julia-users] Re: Performance improvements double digest problem

Daniel Jones Fri, 26 Sep 2014 11:59:03 -0700

Hi Paul,

Have you looked at the performance chapter in the manual? Some of that 
could help: http://julia.readthedocs.org/en/latest/manual/performance-tips/


There are a few things you can do to speed this up:

Declare your globals with 'const'. Use sort! instead of sort to avoid 
allocating a new array. 
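
As a minimal sketch of those two changes (the names and data here are made 
up for illustration, not taken from your gist):

```julia
# `const` tells the compiler the global's type will not change, so functions
# that read it can be compiled to faster code.
# N_ITERATIONS is a hypothetical name; 10k iterations is from the original post.
const N_ITERATIONS = 10000

# sort(v) returns a new sorted array; sort!(v) reorders v itself.
# sort_fragments! is a hypothetical helper, not a function from the gist.
function sort_fragments!(fragments)
    sort!(fragments)
    return fragments
end

v = [8, 3, 12, 5]    # made-up fragment lengths
sort_fragments!(v)   # v itself is now sorted; no sorted copy is returned
```

Note that `const` fixes the binding, not the contents: you can still modify 
the elements of a const array.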

Lastly, vector operations like those in your Energy function allocate 
intermediate arrays, which can slow things down. You can automatically 
"devectorize" this with the Devectorize package, which lets you rewrite 
your Energy function like this:

using Devectorize

function Energy(original, current)
  # @devec turns the vectorized expression into a single loop
  @devec ans = sum((original - current).^2 ./ original)
  return ans
end

That gives a nearly 4x speed up when combined with the other two changes I 
mentioned.
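
If you would rather not depend on a package, the same sum can be written as 
an explicit loop. This is only a sketch of the idea, not the code the macro 
actually generates, and energy_loop is a name I made up:

```julia
# Hand-written loop for sum((original - current).^2 ./ original).
# Each term is computed and accumulated directly, so no temporary arrays
# are created for the difference, the square, or the quotient.
# Assumes both vectors have the same length.
function energy_loop(original, current)
    total = 0.0
    for i in 1:length(original)
        d = original[i] - current[i]
        total += d * d / original[i]
    end
    return total
end

energy_loop([1.0, 2.0], [2.0, 4.0])   # (1-2)^2/1 + (2-4)^2/2 = 3.0
```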

P.S. You don't need semicolons at the end of lines in Julia. We usually 
only use them if we need to put two statements on the same line.


On Friday, September 26, 2014 10:53:56 AM UTC-7, Paul Lange wrote:
>
> Hey,
>
> I'm currently working on the double digest problem [1] for a homework 
> assignment. I first wrote the program in Matlab, but that became too slow 
> for larger data sets. So I switched to julia in the hope of some 
> speedup. Indeed, by naively converting it I get roughly a 5x speedup: 
> 0.734531s in Matlab for 10k iterations.
> @time in julia returns the following after some runs: elapsed time: 
> 0.159896259 seconds (27817864 bytes allocated, 20.24% gc time)
>
> However, I hope I can squeeze out some more performance, and I hope you 
> can give me some hints on where to look. The 20 Mbyte of allocations 
> seems like quite a lot, although I do create a small new vector in the 
> Neighbor function every iteration. I was also unable to run the profiler 
> since it keeps crashing julia (julia 0.3.1, Mac).
>
>
> Code can be found here: 
> https://gist.github.com/palango/04599a4d553068189d96
>
> Some words about it:
> We've got two enzymes A and B that can cut DNA. When A is applied to some 
> DNA, fragments are created and their lengths can be measured. The sorted 
> lengths are stored in a, and the same applies to B and b.
> If the same DNA gets digested first by A and then by B, we get more 
> fragments. Their lengths are stored in c.
> The problem is now to find the correct order of the fragments in a and b 
> so that we get the same fragments as in c.
> The program implements a simulated annealing approach. confa and confb 
> store the permutations of a and b, which are used to calculate the c of 
> the current permutations.
>
> Thanks in advance,
> Paul
>
> [1] https://en.wikipedia.org/wiki/Restriction_map
>
