Thanks Cedric and Yichao.
It makes sense that there might be new subtypes and associated
specialised methods. I understand that now. Thanks.
On my machine (v0.4.5 Windows), fast() and pretty_fast() seem to run in
similar time.
So I looked at @code_warntype as Yichao suggested and got the following.
I don't fully know how to interpret the output, but the return type from the
final "catchall" evaluate() seems to be inferred/asserted as Float64 (see the
highlighted yellow line below).
Would this explain why pretty_fast() seems to be as efficient as fast()?
Why is the return type being inferred/asserted as Float64?
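For reference, here is one way to check what inference concludes for the
catchall call (a sketch, using the Feature/A/B definitions from earlier in
the thread; Base.return_types reports the inferred return type of each
method applicable to a Feature argument):

```julia
# Julia 0.4 syntax, matching the definitions from the original post.
abstract Feature

type A <: Feature end
evaluate(f::A) = 1.0

type B <: Feature end
evaluate(f::B) = 0.0

# Ask inference what each evaluate method applicable to a Feature
# argument would return.
rts = Base.return_types(evaluate, (Feature,))
println(rts)

# Every currently-defined method returns Float64, so inference can
# annotate even the dynamic "catchall" call site as Float64.
@assert all(t -> t === Float64, rts)
```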
julia> @code_warntype fast(features)
Variables:
features::Array{Feature,1}
retval::Float64
#s1::Int64
i::Int64
Body:
begin # none, line 2:
retval = 0.0 # none, line 3:
GenSym(2) = (Base.arraylen)(features::Array{Feature,1})::Int64
GenSym(0) = $(Expr(:new, UnitRange{Int64}, 1,
:(((top(getfield))(Base.Intrinsics,:select_value)::I)((Base.sle_int)(1,GenSym(2))::Bool,GenSym(2),(Base.box)
(Int64,(Base.sub_int)(1,1)))::Int64)))
#s1 = (top(getfield))(GenSym(0),:start)::Int64
unless (Base.box)(Base.Bool,(Base.not_int)(#s1::Int64 ===
(Base.box)(Base.Int,(Base.add_int)((top(getfield))(GenSym(0),:stop)::Int64,1))::Bool))
goto 1
2:
GenSym(3) = #s1::Int64
GenSym(4) = (Base.box)(Base.Int,(Base.add_int)(#s1::Int64,1))
i = GenSym(3)
#s1 = GenSym(4) # none, line 4:
unless
(Main.isa)((Base.arrayref)(features::Array{Feature,1},i::Int64)::Feature,Main.A)::Bool
goto 4 # none, line 5:
retval =
(Base.box)(Base.Float64,(Base.add_float)(retval::Float64,1.0))
goto 5
4: # none, line 7:
retval =
(Base.box)(Base.Float64,(Base.add_float)(retval::Float64,0.0))
5:
3:
unless
(Base.box)(Base.Bool,(Base.not_int)((Base.box)(Base.Bool,(Base.not_int)(#s1::Int64
=== (Base.box)(Base.Int,(Base.add_int)((top(getfield))(GenSym(0)
,:stop)::Int64,1))::Bool)))) goto 2
1:
0: # none, line 10:
return retval::Float64
end::Float64
julia> @code_warntype pretty_fast(features)
Variables:
features::Array{Feature,1}
retval::Float64
#s1::Int64
i::Int64
Body:
begin # none, line 2:
retval = 0.0 # none, line 3:
GenSym(2) = (Base.arraylen)(features::Array{Feature,1})::Int64
GenSym(0) = $(Expr(:new, UnitRange{Int64}, 1,
:(((top(getfield))(Base.Intrinsics,:select_value)::I)((Base.sle_int)(1,GenSym(2))::Bool,GenSym(2),(Base.box)
(Int64,(Base.sub_int)(1,1)))::Int64)))
#s1 = (top(getfield))(GenSym(0),:start)::Int64
unless (Base.box)(Base.Bool,(Base.not_int)(#s1::Int64 ===
(Base.box)(Base.Int,(Base.add_int)((top(getfield))(GenSym(0),:stop)::Int64,1))::Bool))
goto 1
2:
GenSym(4) = #s1::Int64
GenSym(5) = (Base.box)(Base.Int,(Base.add_int)(#s1::Int64,1))
i = GenSym(4)
#s1 = GenSym(5) # none, line 4:
unless
(Main.isa)((Base.arrayref)(features::Array{Feature,1},i::Int64)::Feature,Main.A)::Bool
goto 4 # none, line 5:
retval =
(Base.box)(Base.Float64,(Base.add_float)(retval::Float64,1.0))
goto 6
4: # none, line 6:
unless
(Main.isa)((Base.arrayref)(features::Array{Feature,1},i::Int64)::Feature,Main.B)::Bool
goto 5 # none, line 7:
retval =
(Base.box)(Base.Float64,(Base.add_float)(retval::Float64,0.0))
goto 6
5: # none, line 9:
GenSym(3) =
(Main.evaluate)((Base.arrayref)(features::Array{Feature,1},i::Int64)::Feature)::Float64
retval =
(Base.box)(Base.Float64,(Base.add_float)(retval::Float64,GenSym(3)))
6:
3:
unless
(Base.box)(Base.Bool,(Base.not_int)((Base.box)(Base.Bool,(Base.not_int)(#s1::Int64
=== (Base.box)(Base.Int,(Base.add_int)((top(getfield))(GenSym(0)
,:stop)::Int64,1))::Bool)))) goto 2
1:
0: # none, line 12:
return retval::Float64
end::Float64
julia>
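One follow-up experiment (a sketch under the same definitions, with a
hypothetical new subtype C): since abstract types are open-ended, defining
another evaluate method that returns an Int should widen what inference can
conclude for a dynamic call through ::Feature:

```julia
# Julia 0.4 syntax, matching the definitions from the original post.
abstract Feature

type A <: Feature end
evaluate(f::A) = 1.0

type B <: Feature end
evaluate(f::B) = 0.0

# With only A and B defined, every applicable method returns Float64.
println(Base.return_types(evaluate, (Feature,)))

# A hypothetical new subtype whose evaluate returns an Int: a dynamic
# call through ::Feature can now return either Float64 or Int.
type C <: Feature end
evaluate(f::C) = 100

rts = Base.return_types(evaluate, (Feature,))
println(rts)
@assert any(t -> t === Int, rts)
```

If the Float64 annotation in the dumps above comes from inference unioning
over the currently-known methods, that would be consistent with what I see:
at the time fast() and pretty_fast() were compiled, all evaluate methods
returned Float64.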
On Sunday, April 3, 2016 at 8:04:35 AM UTC+10, Cedric St-Jean wrote:
>
>
> On Saturday, April 2, 2016 at 5:39:45 PM UTC-4, Greg Plowman wrote:
>>
>> Cedric,
>> On my machine fast() and pretty_fast() run in roughly the same time.
>> Are you sure it was pre-compiled first?
>>
>
> Yep. I'm using @benchmark on Julia 0.4.5 on OSX, and the time difference
> is definitely > 2X.
>
>
>>
>> 1. If you add a default fallback method, say, evaluate(f::Feature) = -1.0
>> Theoretically, would that inform the compiler that evaluate() always
>> returns a Float64 and therefore is type stable?
>>
>
> No, because I might write
>
> type C <: Feature
> end
> evaluate(f::C) = 100
>
> and a Vector{Feature} might contain a C. Abstract types are open-ended;
> new types can be created at runtime, so the compiler can't assume anything
> about them, which is why they're not useful for performance. See here
> <http://docs.julialang.org/en/release-0.4/manual/performance-tips/#avoid-fields-with-abstract-type>.
>
>
>
>> 2. Does dynamic dispatch mean the compiler has to look up the method at
>> run time (because it can't know the type ahead of time)?
>> Is this equivalent to explicitly coding "static dispatch" with if
>> statements?
>> If so, then why is it so much slower?
>> Providing a fallback method as in 1., or explicitly returning Float64,
>> doesn't seem to improve speed, which naively suggests that the slowness is
>> from dynamic dispatch, not from type instability???
>>
>> 3. Also, on my machine pretty_fast() is as fast as fast(). Why is this
>> so, if pretty_fast() is supposedly type unstable?
>>
>> As you can probably tell, I'm pretty confused on this.
>>
>>
>> On Sunday, April 3, 2016 at 6:34:09 AM UTC+10, Cedric St-Jean wrote:
>>
>>> Thank you for the detailed explanation. I tried it out:
>>>
>>> function pretty_fast(features::Vector{Feature})
>>>     retval = 0.0
>>>     for i in 1 : length(features)
>>>         if isa(features[i], A)
>>>             x = evaluate(features[i]::A)
>>>         elseif isa(features[i], B)
>>>             x = evaluate(features[i]::B)
>>>         else
>>>             x = evaluate(features[i])
>>>         end
>>>         retval += x
>>>     end
>>>     retval
>>> end
>>>
>>> On my laptop, fast runs in 10 microseconds, pretty_fast in 30, and slow
>>> in 210.
>>>
>>> On Saturday, April 2, 2016 at 12:24:18 PM UTC-4, Yichao Yu wrote:
>>>>
>>>> On Sat, Apr 2, 2016 at 12:16 PM, Tim Wheeler <[email protected]>
>>>> wrote:
>>>> > Thank you for the comments. In my original code it means the
>>>> > difference between a 30 min execution with memory allocation in the
>>>> > gigabytes and a few seconds of execution with only 800 bytes using
>>>> > the second version.
>>>> > I thought under-the-hood Julia basically runs those if statements
>>>> > anyway for its dispatch, and I don't know why it needs to allocate
>>>> > any memory.
>>>> > Having the if-statement workaround will be fine though.
>>>>
>>>> Well, if you have a lot of these cheap functions being dynamically
>>>> dispatched, I think this is not a good way to use the type system.
>>>> Depending on your problem, you may be better off using an
>>>> enum/flags/dict to represent the type/get the values.
>>>>
>>>> The reason for the allocation is that the return type is unknown. It
>>>> should be obvious if you check your code with code_warntype.
>>>>
>>>> >
>>>> > On Saturday, April 2, 2016 at 7:26:11 AM UTC-7, Cedric St-Jean wrote:
>>>> >>
>>>> >>
>>>> >>> Therefore there's no way the compiler can rewrite the slow version
>>>> >>> to the fast version.
>>>> >>
>>>> >>
>>>> >> It knows that the element type is a Feature, so it could produce:
>>>> >>
>>>> >> if isa(features[i], A)
>>>> >>     retval += evaluate(features[i]::A)
>>>> >> elseif isa(features[i], B)
>>>> >>     retval += evaluate(features[i]::B)
>>>> >> else
>>>> >>     retval += evaluate(features[i])
>>>> >> end
>>>> >>
>>>> >> and it would make sense for abstract types that have few subtypes. I
>>>> >> didn't realize that dispatch was an order of magnitude slower than
>>>> >> type checking. It's easy enough to write a macro generating this
>>>> >> expansion, too.
>>>> >>
>>>> >> On Saturday, April 2, 2016 at 2:05:20 AM UTC-4, Yichao Yu wrote:
>>>> >>>
>>>> >>> On Fri, Apr 1, 2016 at 9:56 PM, Tim Wheeler <[email protected]>
>>>> >>> wrote:
>>>> >>> > Hello Julia Users.
>>>> >>> >
>>>> >>> > I ran into a weird slowdown issue and reproduced a minimal
>>>> working
>>>> >>> > example.
>>>> >>> > Maybe someone can help shed some light.
>>>> >>> >
>>>> >>> > abstract Feature
>>>> >>> >
>>>> >>> > type A <: Feature end
>>>> >>> > evaluate(f::A) = 1.0
>>>> >>> >
>>>> >>> > type B <: Feature end
>>>> >>> > evaluate(f::B) = 0.0
>>>> >>> >
>>>> >>> > function slow(features::Vector{Feature})
>>>> >>> >     retval = 0.0
>>>> >>> >     for i in 1 : length(features)
>>>> >>> >         retval += evaluate(features[i])
>>>> >>> >     end
>>>> >>> >     retval
>>>> >>> > end
>>>> >>> >
>>>> >>> > function fast(features::Vector{Feature})
>>>> >>> >     retval = 0.0
>>>> >>> >     for i in 1 : length(features)
>>>> >>> >         if isa(features[i], A)
>>>> >>> >             retval += evaluate(features[i]::A)
>>>> >>> >         else
>>>> >>> >             retval += evaluate(features[i]::B)
>>>> >>> >         end
>>>> >>> >     end
>>>> >>> >     retval
>>>> >>> > end
>>>> >>> >
>>>> >>> > using ProfileView
>>>> >>> >
>>>> >>> > features = Feature[]
>>>> >>> > for i in 1 : 10000
>>>> >>> >     push!(features, A())
>>>> >>> > end
>>>> >>> >
>>>> >>> > slow(features)
>>>> >>> > @time slow(features)
>>>> >>> > fast(features)
>>>> >>> > @time fast(features)
>>>> >>> >
>>>> >>> > The output is:
>>>> >>> >
>>>> >>> > 0.000136 seconds (10.15 k allocations: 166.417 KB)
>>>> >>> > 0.000012 seconds (5 allocations: 176 bytes)
>>>> >>> >
>>>> >>> >
>>>> >>> > This is a HUGE difference! Am I missing something big? Is there a
>>>> >>> > good way to inspect code to figure out where I am going wrong?
>>>> >>>
>>>> >>> This is because of type instability, as you will find in the
>>>> >>> performance tips.
>>>> >>> Note that slow and fast are not equivalent, since the fast version
>>>> >>> only accepts `A` or `B` but the slow version accepts any subtype of
>>>> >>> Feature that you may ever define. Therefore there's no way the
>>>> >>> compiler can rewrite the slow version to the fast version.
>>>> >>> There are optimizations that can be applied to bring down the gap,
>>>> >>> but there'll always be a large difference between the two.
>>>> >>>
>>>> >>> >
>>>> >>> >
>>>> >>> > Thank you in advance for any guidance.
>>>> >>> >
>>>> >>> >
>>>> >>> > -Tim
>>>> >>> >
>>>> >>> >
>>>> >>> >
>>>> >>> >
>>>> >>> >