On Sat, Apr 2, 2016 at 10:26 AM, Cedric St-Jean <[email protected]> wrote:
>
>> Therefore there's no way the compiler can rewrite the slow version to the
>> fast version.
>
>
> It knows that the element type is a Feature, so it could produce:
>
> if isa(features[i], A)
>     retval += evaluate(features[i]::A)
> elseif isa(features[i], B)
>     retval += evaluate(features[i]::B)
> else
>     retval += evaluate(features[i])
> end

This is roughly the optimization I mentioned, but no, it will still be
much slower than the other version.
The compiler has no idea what the return type of the third branch is, so
this version is still type unstable and you get dynamic dispatch at every
iteration for the floating-point add. Of course, there are more
sophisticated transformations that can keep you on the fast path as long
as possible and generate extra code to check for and handle the slow
cases, but it will still be slower.

I also recommend Jeff's talk[1] for a better explanation of the general
idea.

[1] https://www.youtube.com/watch?v=cjzcYM9YhwA

>
> and it would make sense for abstract types that have few subtypes. I didn't
> realize that dispatch was an order of magnitude slower than type checking.
> It's easy enough to write a macro generating this expansion, too.
>
> On Saturday, April 2, 2016 at 2:05:20 AM UTC-4, Yichao Yu wrote:
>>
>> On Fri, Apr 1, 2016 at 9:56 PM, Tim Wheeler <[email protected]> wrote:
>> > Hello Julia Users.
>> >
>> > I ran into a weird slowdown issue and reproduced a minimal working
>> > example.
>> > Maybe someone can help shed some light.
>> >
>> > abstract Feature
>> >
>> > type A <: Feature end
>> > evaluate(f::A) = 1.0
>> >
>> > type B <: Feature end
>> > evaluate(f::B) = 0.0
>> >
>> > function slow(features::Vector{Feature})
>> >     retval = 0.0
>> >     for i in 1 : length(features)
>> >         retval += evaluate(features[i])
>> >     end
>> >     retval
>> > end
>> >
>> > function fast(features::Vector{Feature})
>> >     retval = 0.0
>> >     for i in 1 : length(features)
>> >         if isa(features[i], A)
>> >             retval += evaluate(features[i]::A)
>> >         else
>> >             retval += evaluate(features[i]::B)
>> >         end
>> >     end
>> >     retval
>> > end
>> >
>> > using ProfileView
>> >
>> > features = Feature[]
>> > for i in 1 : 10000
>> >     push!(features, A())
>> > end
>> >
>> > slow(features)
>> > @time slow(features)
>> > fast(features)
>> > @time fast(features)
>> >
>> > The output is:
>> >
>> > 0.000136 seconds (10.15 k allocations: 166.417 KB)
>> > 0.000012 seconds (5 allocations: 176 bytes)
>> >
>> >
>> > This is a HUGE difference! Am I missing something big? Is there a good
>> > way
>> > to inspect code to figure out where I am going wrong?
>>
>> This is because of type instability as you will find in the performance
>> tips.
>> Note that slow and fast are not equivalent since the fast version only
>> accept `A` or `B` but the slow version accepts any subtype of feature
>> that you may ever define. Therefore there's no way the compiler can
>> rewrite the slow version to the fast version.
>> There are optimizations that can be applied to bring down the gap but
>> there'll always be a large difference between the two.
>>
>> >
>> >
>> > Thank you in advance for any guidance.
>> >
>> >
>> > -Tim
>> >
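
For anyone reading this on a current Julia release, the example from the
thread and the proposed isa expansion with a fallback branch might look
roughly like the sketch below. It is not the code anyone in the thread
ran. It assumes the newer `abstract type` / `struct` syntax, the name
`split_dispatch` is made up for illustration, and the `::Float64`
assertion in the fallback branch is an added assumption that every
`evaluate` method returns a Float64. No timings are shown because they
depend heavily on the Julia version.

```julia
# Assumes Julia 1.x syntax; the thread uses the older `abstract` / `type` keywords.
abstract type Feature end

struct A <: Feature end
struct B <: Feature end

evaluate(::A) = 1.0
evaluate(::B) = 0.0

# Same loop as `slow` in the thread: the method of `evaluate` is chosen
# from the runtime type of each element.
function slow(features::Vector{Feature})
    retval = 0.0
    for f in features
        retval += evaluate(f)
    end
    return retval
end

# Hypothetical name. Checks the known subtypes first and falls back to
# ordinary dispatch for any other subtype of Feature.
function split_dispatch(features::Vector{Feature})
    retval = 0.0
    for f in features
        if f isa A
            retval += evaluate(f::A)
        elseif f isa B
            retval += evaluate(f::B)
        else
            # Assumption: every `evaluate` method returns a Float64.
            # The assertion throws a TypeError if one does not.
            retval += evaluate(f)::Float64
        end
    end
    return retval
end

features = Feature[A() for _ in 1:10_000]
append!(features, Feature[B() for _ in 1:10])

println(slow(features))
println(split_dispatch(features))
```

On the question of how to inspect code: running
`@code_warntype slow(features)` in the REPL prints the inferred types for
the function and highlights values whose type could not be narrowed to a
concrete type.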
