I tried that, but it has no impact on performance in either pretty_fast or
slow. I don't understand why...

    retval += x::Float64
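
One possible reading, offered as a minimal sketch rather than a diagnosis: the assertion is applied to the value after `evaluate` has already returned, so it cannot change how the call itself is dispatched. The branch version narrows the argument before the call instead. The sketch below assumes Julia 1.x syntax (`abstract type`, `struct`) rather than the `abstract`/`type` spellings used in this thread, and `slow_asserted` is a hypothetical name. How much of the unannotated call the compiler can resolve on its own may differ between Julia versions, so no timings are claimed here.

```julia
# Assumption: Julia 1.x syntax; the thread itself uses the older `abstract`/`type` keywords.
abstract type Feature end

struct A <: Feature end
struct B <: Feature end

evaluate(f::A) = 1.0
evaluate(f::B) = 0.0

# Hypothetical variant of `slow`: the assertion comes after the call,
# so the call target still depends on the run-time type of features[i].
function slow_asserted(features::Vector{Feature})
    retval = 0.0
    for i in 1:length(features)
        x = evaluate(features[i])   # argument is only known to be a Feature
        retval += x::Float64        # checks the result; does not affect the call above
    end
    retval
end

# Same structure as `fast` in the thread: the argument is narrowed before the call,
# so each `evaluate` call site sees a concrete argument type.
function fast(features::Vector{Feature})
    retval = 0.0
    for i in 1:length(features)
        f = features[i]
        if isa(f, A)
            retval += evaluate(f::A)
        else
            retval += evaluate(f::B)   # throws a TypeError for any other subtype
        end
    end
    retval
end

features = Feature[A() for _ in 1:10_000]

# Both variants compute the same sum; they differ only in where the type is pinned down.
slow_asserted(features) == fast(features)
```

To see what was inferred for each version, `@code_warntype slow_asserted(features)` can be compared against `@code_warntype fast(features)` (outside the REPL this needs `using InteractiveUtils`).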
On Saturday, April 2, 2016 at 4:42:01 PM UTC-4, Matt Bauman wrote:
>
> Perhaps a simpler solution would be to just assert that `evaluate` always
> returns a Float64? You may even be able to remove the isa branches in that
> case, but I'm not sure how it'll compare.
>
> On Saturday, April 2, 2016 at 4:34:09 PM UTC-4, Cedric St-Jean wrote:
>>
>> Thank you for the detailed explanation. I tried it out:
>>
>> function pretty_fast(features::Vector{Feature})
>>     retval = 0.0
>>     for i in 1 : length(features)
>>         if isa(features[i], A)
>>             x = evaluate(features[i]::A)
>>         elseif isa(features[i], B)
>>             x = evaluate(features[i]::B)
>>         else
>>             x = evaluate(features[i])
>>         end
>>         retval += x
>>     end
>>     retval
>> end
>>
>> On my laptop, fast runs in 10 microseconds, pretty_fast in 30, and slow
>> in 210.
>>
>> On Saturday, April 2, 2016 at 12:24:18 PM UTC-4, Yichao Yu wrote:
>>>
>>> On Sat, Apr 2, 2016 at 12:16 PM, Tim Wheeler wrote:
>>> > Thank you for the comments. In my original code it means the difference
>>> > between a 30 min execution with memory allocation in the gigabytes and a few
>>> > seconds of execution with only 800 bytes using the second version.
>>> > I thought under-the-hood Julia basically runs those if statements anyway for
>>> > its dispatch, and don't know why it needs to allocate any memory.
>>> > Having the if-statement workaround will be fine though.
>>>
>>> Well, if you have a lot of these cheap functions being dynamically
>>> dispatched I think it is not a good way to use the type. Depending on
>>> your problem, you may be better off using an enum/flags/dict to
>>> represent the type/get the values.
>>>
>>> The reason for the allocation is that the return type is unknown. It
>>> should be obvious to see if you check your code with code_warntype.
>>>
>>> >
>>> > On Saturday, April 2, 2016 at 7:26:11 AM UTC-7, Cedric St-Jean wrote:
>>> >>
>>> >>
>>> >>> Therefore there's no way the compiler can rewrite the slow version to the
>>> >>> fast version.
>>> >>
>>> >>
>>> >> It knows that the element type is a Feature, so it could produce:
>>> >>
>>> >> if isa(features[i], A)
>>> >>     retval += evaluate(features[i]::A)
>>> >> elseif isa(features[i], B)
>>> >>     retval += evaluate(features[i]::B)
>>> >> else
>>> >>     retval += evaluate(features[i])
>>> >> end
>>> >>
>>> >> and it would make sense for abstract types that have few subtypes. I
>>> >> didn't realize that dispatch was an order of magnitude slower than type
>>> >> checking. It's easy enough to write a macro generating this expansion, too.
>>> >>
>>> >> On Saturday, April 2, 2016 at 2:05:20 AM UTC-4, Yichao Yu wrote:
>>> >>>
>>> >>> On Fri, Apr 1, 2016 at 9:56 PM, Tim Wheeler wrote:
>>> >>> > Hello Julia Users.
>>> >>> >
>>> >>> > I ran into a weird slowdown issue and reproduced a minimal working
>>> >>> > example.
>>> >>> > Maybe someone can help shed some light.
>>> >>> >
>>> >>> > abstract Feature
>>> >>> >
>>> >>> > type A <: Feature end
>>> >>> > evaluate(f::A) = 1.0
>>> >>> >
>>> >>> > type B <: Feature end
>>> >>> > evaluate(f::B) = 0.0
>>> >>> >
>>> >>> > function slow(features::Vector{Feature})
>>> >>> >     retval = 0.0
>>> >>> >     for i in 1 : length(features)
>>> >>> >         retval += evaluate(features[i])
>>> >>> >     end
>>> >>> >     retval
>>> >>> > end
>>> >>> >
>>> >>> > function fast(features::Vector{Feature})
>>> >>> >     retval = 0.0
>>> >>> >     for i in 1 : length(features)
>>> >>> >         if isa(features[i], A)
>>> >>> >             retval += evaluate(features[i]::A)
>>> >>> >         else
>>> >>> >             retval += evaluate(features[i]::B)
>>> >>> >         end
>>> >>> >     end
>>> >>> >     retval
>>> >>> > end
>>> >>> >
>>> >>> > using ProfileView
>>> >>> >
>>> >>> > features = Feature[]
>>> >>> > for i in 1 : 10000
>>> >>> >     push!(features, A())
>>> >>> > end
>>> >>> >
>>> >>> > slow(features)
>>> >>> > @time slow(features)
>>> >>> > fast(features)
>>> >>> > @time fast(features)
>>> >>> >
>>> >>> > The output is:
>>> >>> >
>>> >>> > 0.000136 seconds (10.15 k allocations: 166.417 KB)
>>> >>> > 0.000012 seconds (5 allocations: 176 bytes)
>>> >>> >
>>> >>> >
>>> >>> > This is a HUGE difference! Am I missing something big? Is there a good way
>>> >>> > to inspect code to figure out where I am going wrong?
>>> >>>
>>> >>> This is because of type instability, as you will find in the performance
>>> >>> tips.
>>> >>> Note that slow and fast are not equivalent since the fast version only
>>> >>> accepts `A` or `B` but the slow version accepts any subtype of `Feature`
>>> >>> that you may ever define. Therefore there's no way the compiler can
>>> >>> rewrite the slow version to the fast version.
>>> >>> There are optimizations that can be applied to bring down the gap, but
>>> >>> there'll always be a large difference between the two.
>>> >>>
>>> >>> >
>>> >>> >
>>> >>> > Thank you in advance for any guidance.
>>> >>> >
>>> >>> >
>>> >>> > -Tim
>>>
>>