If you haven't already seen this, there is also:

https://github.com/timholy/ProfileView.jl
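
Rough usage looks something like this (mysampler() here is just a stand-in
for whatever call you want to profile):

    Pkg.add("ProfileView")   # one-time install
    using ProfileView
    @profile mysampler()     # collect samples, the same data Profile.print() reports
    ProfileView.view()       # flame-graph style view of the collected samples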

-viral

On Wednesday, January 1, 2014 4:50:57 AM UTC+5:30, Brendan O'Connor wrote:
>
> I've been trying to optimize a sampling routine and have found the 
> flat-formatted profiler output very useful, but I'm finding it a little 
> tricky to interpret sometimes. I have a slightly tweaked version that 
> reports the percentage of samples (instead of the raw sample count) and 
> also shows the original source line from the file (ideally, an HTML or 
> GUI view would be best...). Here are the lines with >1% incidence (also 
> posted at https://gist.github.com/brendano/8202837 ). I've bolded a few 
> of them:
>
> 13150 samples ~ 0.4% se, though more in reality
>     %    File       Function             Line  Source
>  50.3 %  treetm.jl  cgsIterPath           383  qnewLL,pnewLL = proposePath!(newpath, mm, V,di,word, first ? nothing : oldpath, :simulate)
>  44.1 %  treetm.jl  cgsIterPath           384  qoldLL,poldLL = proposePath!(oldpath, mm, V,di,word, first ? nothing : oldpath, :evaluate)
>   4.1 %  treetm.jl  getindex               79  getindex(c::ClassCountTable, k) = c.counts[k]
>   2.1 %  treetm.jl  incrementFullpath!    122  x = b ? x.right : x.left
>  22.0 %  treetm.jl  proposePath!          337  w0 = (n0.counts[wordID] + betaHere/V - on0) / (n0.counts.total + betaHere - int(on_cur))
>  18.9 %  treetm.jl  proposePath!          338  w1 = (n1.counts[wordID] + betaHere/V - on1) / (n1.counts.total + betaHere - int(on_cur))
>   5.2 %  treetm.jl  proposePath!          339  p0 = (cur_docnode.left.count + mm.gammaConc/2 - on0)
>   6.5 %  treetm.jl  proposePath!          340  p1 = (cur_docnode.right.count+ mm.gammaConc/2 - on1)
> * 8.1 %  treetm.jl  proposePath!          341  q = p1*w1 / (p1*w1 + p0*w0)*
>   2.3 %  treetm.jl  proposePath!          344  dec = bool(rand() < q)
>   1.1 %  treetm.jl  proposePath!          345  path[curdepth] = dec
> * 8.5 %  treetm.jl  proposePath!          350  proposalLL += dec ? log(q) : log(1-q)*
>  11.4 %  treetm.jl  proposePath!          351  unnormLL += dec ? log(p1/(p0+p1)) : log(p0/(p0+p1))   # only the prior branch prob
>   4.2 %  treetm.jl  proposePath!          367  wp = (nn.counts[wordID] + mm.betaConc/V +rem) / (nn.counts.total + mm.betaConc +rem)
> * 2.0 %  math.jl    log                   290  *
> 100.0 %  no file    anonymous              14
>
> I'm trying to understand what accounts for the large runtime share of 
> line 350. I would have guessed it is slow because of the logarithm, but 
> math.jl's log() shows only 2%, far less than the 8.5% attributed to line 
> 350. Can the branch on 'dec' and the += operation really be responsible 
> for the rest of the time?
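>
> (Here is a crude sanity check I could run, separate from the real code: 
> timing the pieces of line 350 in isolation. The loop below is made up for 
> this purpose, not taken from treetm.jl, and it's only a rough comparison, 
> not a proper benchmark.)
>
>     function check_line350(n)
>         qs = rand(n)
>         decs = [bool(rand() < q) for q in qs]
>         acc = 0.0
>         @time for i = 1:n          # branch + log + accumulate, as on line 350
>             acc += decs[i] ? log(qs[i]) : log(1-qs[i])
>         end
>         @time for i = 1:n          # log alone, for comparison
>             acc += log(qs[i])
>         end
>         acc
>     end
>     check_line350(10^7)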
>
> Or is it wrong to interpret the output this directly? I'm now wondering 
> whether some compiler optimization moves things around, so that q's 
> calculation (line 341) is actually computed later, around line 350, at 
> least as far as the profiler's attribution goes.
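>
> (For what it's worth, the tree-formatted output groups samples by call 
> path, so it might show whether the samples on line 350 land inside log() 
> or in the surrounding code:)
>
>     Profile.print()               # default tree view, grouped by call path
>     Profile.print(format=:flat)   # built-in flat view (mine above is a tweak of this)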
>
> Unfortunately, since proposePath!() is a fairly long function, I can't 
> make heads or tails of the code_llvm() output. I posted more at the 
> GitHub gist if anyone is curious.
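>
> (One thing I might try: pulling the single expression out into a throwaway 
> function so the generated code is small enough to read. update_ll() below 
> is hypothetical, just line 350's expression on its own:)
>
>     update_ll(ll::Float64, dec::Bool, q::Float64) =
>         ll + (dec ? log(q) : log(1-q))
>     code_llvm(update_ll, (Float64, Bool, Float64))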
>
> Another thing I'm curious about is whether object attribute (field) 
> lookups are fast: in particular, whether they might be a cause of the 
> slowdown on lines 337 and 338. That was my experience with Cython 
> (eliminating slow Python-level attribute lookups gave a big speedup), 
> but maybe that's the wrong way to think about Julia. Are abstract types 
> more problematic than concrete types for field lookup? (I think I saw 
> an effect there, but I don't have a detailed example at the moment. The 
> implementation that the profiling output above comes from uses a 
> concrete node type.)
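>
> (To make the question concrete, here is roughly the distinction I'm asking 
> about; these types are made up for illustration, not the ones in treetm.jl:)
>
>     abstract CountStore
>
>     type LooseNode
>         counts::CountStore        # abstract field type: the compiler can't infer what
>         count::Float64            # x.counts holds, so uses of it dispatch dynamically
>     end
>
>     type TightNode
>         counts::Vector{Float64}   # concrete field type: field access and getindex
>         count::Float64            # can be inferred and inlined
>     end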
>
> Brendan
>
>
