Perhaps a simpler solution would be to just assert that `evaluate` always
returns a Float64, i.e. write `evaluate(features[i])::Float64` at the call
site? You may even be able to remove the `isa` branches in that case, but I'm
not sure how the timing will compare.
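
A minimal sketch of that suggestion, untimed on my side. The name `asserted`
is hypothetical, and the definitions are written in current Julia syntax
(`abstract type`, `struct`); on the 0.4-era syntax used in this thread,
substitute `abstract Feature` and `type A <: Feature end`.

```julia
# Same setup as the original post, in current Julia syntax.
abstract type Feature end

struct A <: Feature end
evaluate(f::A) = 1.0

struct B <: Feature end
evaluate(f::B) = 0.0

# Hypothetical name. Same loop as `slow`, but the result of the dynamically
# dispatched call is asserted to be a Float64, so `retval` keeps a concrete
# type. The assertion throws a TypeError if some `evaluate` method ever
# returns anything else.
function asserted(features::Vector{Feature})
    retval = 0.0
    for i in 1 : length(features)
        retval += evaluate(features[i])::Float64
    end
    retval
end

features = Feature[]
for i in 1 : 10000
    push!(features, A())
end

asserted(features)        # first call compiles; time the second one
@time asserted(features)
```

The call itself is still dispatched at run time, so I would not assume this
matches `fast`; it only removes the unknown return type from the loop.
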
On Saturday, April 2, 2016 at 4:34:09 PM UTC-4, Cedric St-Jean wrote:
>
> Thank you for the detailed explanation. I tried it out:
>
> function pretty_fast(features::Vector{Feature})
>     retval = 0.0
>     for i in 1 : length(features)
>         if isa(features[i], A)
>             x = evaluate(features[i]::A)
>         elseif isa(features[i], B)
>             x = evaluate(features[i]::B)
>         else
>             x = evaluate(features[i])
>         end
>         retval += x
>     end
>     retval
> end
>
> On my laptop, fast runs in 10 microseconds, pretty_fast in 30, and slow in
> 210.
>
> On Saturday, April 2, 2016 at 12:24:18 PM UTC-4, Yichao Yu wrote:
>>
>> On Sat, Apr 2, 2016 at 12:16 PM, Tim Wheeler wrote:
>> > Thank you for the comments. In my original code it means the difference
>> > between a 30 min execution with memory allocation in the gigabytes and a
>> > few seconds of execution with only 800 bytes using the second version.
>> > I thought under-the-hood Julia basically runs those if statements anyway
>> > for its dispatch, and don't know why it needs to allocate any memory.
>> > Having the if-statement workaround will be fine though.
>>
>> Well, if you have a lot of these cheap functions being dynamically
>> dispatched, I don't think this is a good way to use types. Depending on
>> your problem, you may be better off using an enum/flags/dict to
>> represent the type/get the values.
>>
>> The reason for the allocation is that the return type is unknown. It
>> should be easy to see if you check your code with code_warntype.
>>
>> >
>> > On Saturday, April 2, 2016 at 7:26:11 AM UTC-7, Cedric St-Jean wrote:
>> >>
>> >>
>> >>> Therefore there's no way the compiler can rewrite the slow version to
>> >>> the fast version.
>> >>
>> >>
>> >> It knows that the element type is a Feature, so it could produce:
>> >>
>> >> if isa(features[i], A)
>> >>     retval += evaluate(features[i]::A)
>> >> elseif isa(features[i], B)
>> >>     retval += evaluate(features[i]::B)
>> >> else
>> >>     retval += evaluate(features[i])
>> >> end
>> >>
>> >> and it would make sense for abstract types that have few subtypes. I
>> >> didn't realize that dispatch was an order of magnitude slower than type
>> >> checking. It's easy enough to write a macro generating this expansion,
>> >> too.
>> >>
>> >> On Saturday, April 2, 2016 at 2:05:20 AM UTC-4, Yichao Yu wrote:
>> >>>
>> >>> On Fri, Apr 1, 2016 at 9:56 PM, Tim Wheeler wrote:
>> >>> > Hello Julia Users.
>> >>> >
>> >>> > I ran into a weird slowdown issue and reproduced a minimal working
>> >>> > example.
>> >>> > Maybe someone can help shed some light.
>> >>> >
>> >>> > abstract Feature
>> >>> >
>> >>> > type A <: Feature end
>> >>> > evaluate(f::A) = 1.0
>> >>> >
>> >>> > type B <: Feature end
>> >>> > evaluate(f::B) = 0.0
>> >>> >
>> >>> > function slow(features::Vector{Feature})
>> >>> >     retval = 0.0
>> >>> >     for i in 1 : length(features)
>> >>> >         retval += evaluate(features[i])
>> >>> >     end
>> >>> >     retval
>> >>> > end
>> >>> >
>> >>> > function fast(features::Vector{Feature})
>> >>> >     retval = 0.0
>> >>> >     for i in 1 : length(features)
>> >>> >         if isa(features[i], A)
>> >>> >             retval += evaluate(features[i]::A)
>> >>> >         else
>> >>> >             retval += evaluate(features[i]::B)
>> >>> >         end
>> >>> >     end
>> >>> >     retval
>> >>> > end
>> >>> >
>> >>> > using ProfileView
>> >>> >
>> >>> > features = Feature[]
>> >>> > for i in 1 : 10000
>> >>> >     push!(features, A())
>> >>> > end
>> >>> >
>> >>> > slow(features)
>> >>> > @time slow(features)
>> >>> > fast(features)
>> >>> > @time fast(features)
>> >>> >
>> >>> > The output is:
>> >>> >
>> >>> > 0.000136 seconds (10.15 k allocations: 166.417 KB)
>> >>> > 0.000012 seconds (5 allocations: 176 bytes)
>> >>> >
>> >>> >
>> >>> > This is a HUGE difference! Am I missing something big? Is there a good
>> >>> > way to inspect code to figure out where I am going wrong?
>> >>>
>> >>> This is because of type instability, as you will find in the performance
>> >>> tips.
>> >>> Note that slow and fast are not equivalent, since the fast version only
>> >>> accepts `A` or `B` but the slow version accepts any subtype of Feature
>> >>> that you may ever define. Therefore there's no way the compiler can
>> >>> rewrite the slow version to the fast version.
>> >>> There are optimizations that can be applied to bring down the gap but
>> >>> there'll always be a large difference between the two.
>> >>>
>> >>> >
>> >>> >
>> >>> > Thank you in advance for any guidance.
>> >>> >
>> >>> >
>> >>> > -Tim