I have a small test case that is slow when benchmarked from a loop, but 
fast when benchmarked from a function:

using Images

function expensive(img)
        img[1, 2] * img[3, 4] + img[5, 6] - img[7, 8]
end

function benchmark(img)
        for i in 1:1000000
                expensive(img)
        end
end

function main()
        img = Image(float32(randn(10, 10)))

        # this is fast
        gc_disable()
        @time benchmark(img)
        gc_enable()

        # this is slow
        gc_disable()
        @time for i in 1:1000000
                expensive(img)
        end
        gc_enable()
end

main()

According to Tim Holy (Images.jl issue #74), this is because Julia can't 
inline the getindex call when expensive() is called from a loop rather than 
from a function. Why is that, though? Isn't img a local variable with a 
known type, which should result in a fully type-inferred version of 
expensive()? Why is it specialized further when it is called "from one 
level deeper"?
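
For reference, one way to check what type inference makes of expensive() is to look at its typed code. This is only a minimal sketch under a few assumptions: a plain Float32 matrix stands in for the Image (so Images is not needed), and the Julia version in use accepts a tuple of argument types in code_typed and provides Base.return_types. It shows what is inferred for the function itself, not what happens at the call site inside the @time loop.

```julia
# Stand-in for the Image from the example above: a plain Float32 matrix.
# (Assumption: the real Image type may behave differently under inference.)
img = rand(Float32, 10, 10)

function expensive(img)
    img[1, 2] * img[3, 4] + img[5, 6] - img[7, 8]
end

# code_typed returns the type-inferred code for the given argument types.
# A concrete return type here suggests inference succeeded for this method.
typed = code_typed(expensive, (typeof(img),))
println(typed)

# Base.return_types lists the inferred return type(s) for the same signature.
rt = Base.return_types(expensive, (typeof(img),))
println(rt)
```

If the typed code for the Image argument still contains an un-inlined getindex call or an abstract return type, that would point at the method itself rather than at the loop it is called from.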

Oddly enough, making "img" a global improves performance of the slow case 
by about 50%, and doesn't alter the fast case... Now I'm confused.

Best,
Tim
