This might be an algorithm that benefits from the GC improvements in 
0.4--with the caveat that 0.4 is still a work in progress and has a variety 
of syntax changes (and possibly more to come) you might want to give that a 
try.

On Friday, April 3, 2015 at 8:22:49 PM UTC-5, Adam Labadorf wrote:
>
> Hi,
>
> I am struggling with an issue related to garbage collection taking up the 
> vast majority (>99%) of compute time on a simple nested for loop. Code 
> excerpt below:
>
> # julia version 0.3.7
> # counts is an MxN matrix of Float64
> # N=15000
> # M=108
> # h_cols and c_cols are indices \in {1:M}
> using HypothesisTests, ArrayViews
> ratios = Array(Float64,M)
> function compute!(S,tot_i::Int64,i::Int64,j::Int64)
>     ratios = view(counts,:,i)./view(counts,:,j)
>     #for k=1:M
>     #  ratios[k] = counts[k,i]/counts[k,j]
>     #end
>     #ratios = counts[:,i]./counts[:,j]
>     t = UnequalVarianceTTest(ratios[h_cols],ratios[c_cols])
>     S[tot_i] = (pvalue(t),i,j)
> end
>
> for i=1:N-1
>   @time for j=(i+1):N
>     tot_i += 1
>     compute!(S,tot_i,i,j)
>   end
> end
>
> The loop begins fast, output from time:
>
> elapsed time: 1.023850054 seconds (62027220 bytes allocated)
> elapsed time: 0.170916977 seconds (45785624 bytes allocated)
> elapsed time: 0.171598156 seconds (45782760 bytes allocated)
> elapsed time: 0.173866309 seconds (45779896 bytes allocated)
> elapsed time: 0.170267172 seconds (45777032 bytes allocated)
> elapsed time: 0.171754713 seconds (45774168 bytes allocated)
> elapsed time: 0.170110142 seconds (45771304 bytes allocated)
> elapsed time: 0.175199053 seconds (45768440 bytes allocated)
> elapsed time: 0.179893161 seconds (45765576 bytes allocated)
> elapsed time: 0.212172824 seconds (45762712 bytes allocated)
> elapsed time: 0.252750549 seconds (45759848 bytes allocated)
> elapsed time: 0.254874855 seconds (45756984 bytes allocated)
> elapsed time: 0.231003319 seconds (45754120 bytes allocated)
> elapsed time: 0.235060195 seconds (45751256 bytes allocated)
> elapsed time: 0.235379355 seconds (45748392 bytes allocated)
> elapsed time: 0.927622743 seconds (45746168 bytes allocated, 77.65% gc 
> time)
> elapsed time: 0.9132931 seconds (45742664 bytes allocated, 78.35% gc time)
>
> But as soon as it starts doing gc the % time spent in increases almost 
> indefinitely, output from time much later:
>
> elapsed time: 0.174122929 seconds (36239160 bytes allocated)
> elapsed time: 18.535572658 seconds (36236168 bytes allocated, 99.22% gc 
> time)
> elapsed time: 19.189478819 seconds (36233176 bytes allocated, 99.26% gc 
> time)
> elapsed time: 21.812889439 seconds (36230184 bytes allocated, 99.35% gc 
> time)
> elapsed time: 22.182467723 seconds (36227192 bytes allocated, 99.30% gc 
> time)
> elapsed time: 0.169849999 seconds (36224200 bytes allocated)
>
> The inner loop, despite iterating over fewer and fewer indices has 
> massively increased the gc, and therefore overall, execution time. I have 
> tried many things, including creating the compute function, devectorizing 
> the ratios calculation (which really slowed things down), using view and 
> sub in various places, profiling with --trace-allocation=all but I can't 
> figure out why this happens or how to fix it. Doing gc for 22 seconds 
> obviously kills the performance, and since there are about 22M iterations 
> this is prohibitive. Can anyone suggest something I can try to improve the 
> performance here?
>
> Thanks,
> Adam
>

Reply via email to