If you haven't already seen this, there is also: https://github.com/timholy/ProfileView.jl
-viral On Wednesday, January 1, 2014 4:50:57 AM UTC+5:30, Brendan O'Connor wrote: > > I've been trying to optimize a sampling routine and have found the > flat-formatted profiler output very useful. But I'm finding it a little > tricky to interpret sometimes. I have a slightly tweaked version that > reports percentage of samples (instead of the sample count), and also shows > the original line from the file (ideally, an HTML or GUI view would be > best...) Here are lines with >1% incidence. (Also posted at > https://gist.github.com/brendano/8202837 ). I've bolded a few lines: > > 13150 samples ~ 0.4% se, though more in reality > File Function > Line > 50.3 % treetm.jl cgsIterPath 383 qnewLL,pnewLL = > proposePath!(newpath, mm, V,di,word, first ? nothing : oldpath, :simulate) > 44.1 % treetm.jl cgsIterPath 384 qoldLL,poldLL = > proposePath!(oldpath, mm, V,di,word, first ? nothing : oldpath, :evaluate) > 4.1 % treetm.jl getindex 79 getindex(c::ClassCountTable, k) > = c.counts[k] > 2.1 % treetm.jl incrementFullpath! 122 x = b ? x.right : x.left > 22.0 % treetm.jl proposePath! 337 w0 = (n0.counts[wordID] > + betaHere/V - on0) / (n0.counts.total + betaHere - int(on_cur)) > 18.9 % treetm.jl proposePath! 338 w1 = (n1.counts[wordID] > + betaHere/V - on1) / (n1.counts.total + betaHere - int(on_cur)) > 5.2 % treetm.jl proposePath! 339 p0 = > (cur_docnode.left.count + mm.gammaConc/2 - on0) > 6.5 % treetm.jl proposePath! 340 p1 = > (cur_docnode.right.count+ mm.gammaConc/2 - on1) > * 8.1 % treetm.jl proposePath! 341 q = p1*w1 / (p1*w1 + > p0*w0)* > 2.3 % treetm.jl proposePath! 344 dec = bool(rand() < > q) > 1.1 % treetm.jl proposePath! 345 path[curdepth] = dec > * 8.5 % treetm.jl proposePath! 350 proposalLL += dec ? > log(q) : log(1-q)* > 11.4 % treetm.jl proposePath! 351 unnormLL += dec ? > log(p1/(p0+p1)) : log(p0/(p0+p1)) # only the prior branch prob > 4.2 % treetm.jl proposePath! 367 wp = (nn.counts[wordID] + > mm.betaConc/V +rem) / (nn.counts.total + mm.betaConc +rem) > * 2.0 % math.jl log 290 * > 100.0 % no file anonymous 14 > > I'm trying to understand what makes up the large runtime of line 350. I > would think it is slow because of the logarithm. However, math.jl's log() > is only 2%, which is a lot less than 8.5%. Can the branching decision for > 'dec', and the += operation, really be responsible for the rest of the time? > > Or, is it wrong to interpret this so directly? I'm now wondering if > there was some sort of compiler optimization, so that q's calculation (line > 341) is actually computed later around the time of line 350, at least as > how the profiler reports it? > > Unfortunately, since proposePath!() is a fairly long function, I can't > make heads or tails of code_llvm(). I posted more at the github gist if > anyone is curious. > > Another thing I'm curious about is whether object attribute lookups are > fast: in particular, if they might be a cause of slowdown for lines 337 and > 338. This was my experience with Cython (that eliminating slow python-land > lookups gave a lot of speedup), but maybe it's the wrong way to think about > Julia. Are abstract types more problematic than concrete types, for > attribute lookup? (I think I saw an effect there, but don't have a > detailed example at the moment. The current implementation the above > profiling output stems from uses a concrete node type.) > > Brendan > >
