The profiler can never interfere with the optimizer. It's a sampling profiler, 
so it doesn't affect code generation at all.

I'm sick of explaining this stuff (no fault of yours :-), you just happened to 
win the lottery), so I submitted a pull request to improve the documentation:
https://github.com/JuliaLang/julia/pull/8145

Given your assumptions, you especially need to read this section carefully 
(memory allocation is the "canary in the coal mine"):
https://github.com/JuliaLang/julia/blob/b9a695669d8f8aa7648188790d3c71b6ce7effec/doc/manual/performance-tips.rst#measure-performance-with-time-and-pay-attention-to-memory-allocation

As for comparing against versions of magic that do allocate their output, 
simply define

magic(n::Integer) = magic!(Array(Int, n, n))

This way you can have your cake and eat it too. Measuring and interpreting the 
performance of magic! is far easier than for magic, so you should probably do 
all profiling, etc., on magic!.
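
A minimal sketch of that pattern, assuming a recent Julia (1.x), where an 
uninitialized matrix is written Matrix{Int}(undef, n, n) rather than 
Array(Int, n, n) as above. The names fill_demo!, fill_demo and bench! are 
hypothetical stand-ins, not the magic! code from the gist:

```julia
# Hypothetical stand-in for magic!: overwrites the square matrix M in place.
function fill_demo!(M::Matrix{Int})
    n = size(M, 1)
    for j in 1:n, i in 1:n      # column-major: the row index i varies fastest
        M[i, j] = (i - 1) * n + j
    end
    return M
end

# Allocating wrapper, same pattern as magic(n) = magic!(...).
fill_demo(n::Integer) = fill_demo!(Matrix{Int}(undef, n, n))

# The timing loop lives in a function so that global-scope overhead
# is not mistaken for allocation inside the algorithm.
function bench!(M, reps)
    for _ in 1:reps
        fill_demo!(M)
    end
    return M
end

M = Matrix{Int}(undef, 200, 200)
bench!(M, 1)            # warm-up call, so compilation is not measured
@time bench!(M, 100)    # in-place version: should report little or no allocation
@time fill_demo(200)    # wrapper: allocates a fresh 200x200 matrix on each call
```

If the in-place loop still reports a noticeable amount of memory, the 
allocation is happening inside the algorithm itself, which is the case worth 
tracking down.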

--Tim

On Tuesday, August 26, 2014 06:11:47 AM Phillip Berndt wrote:
> > (1) allocate the output M outside of the core algorithm, and pass it as an
> > input, i.e.,
> 
> I did that, though it can be argued that this is cheating given that the
> competitors also have to allocate an array for each loop. With that version
> (and some more slight optimization: Storing intermediate values in the for
> loops, using column-major indexing and @simd)
> [https://gist.github.com/phillipberndt/7dc0aed7eb855f900f0d/21cce76664bdc59f6203ff6f3496e80e256f54cb],
> the overall time for the N=3..1000 test case is down to 3.67s.
> 
> > (2) @time (for i = 1:100; magic!(M); end). Did it allocate any memory? Then
> > you have a problem. Use the profiler, or run julia with
> > --track-allocation=user, to find out where it occurs.
> 
> It does, about 3 Mb on line 2 (if n % 2 == 1). Doesn't make much sense so I
> guess the profiler interfered with the optimizer here?! I doubt that trying
> to get rid of the 3Mb will gain another second though.
> 
> > (3) Even if it's not allocating, you may have a bottleneck. Use the
> > profiler to find it.
> 
> The line where the most time is spent is line 11, filling the array in the
> odd case. I don't see how it could be optimized any further, so that's
> probably as far as one gets?!
> 
> - Phillip
