Thanks Cedric and Yichao.

This makes sense that there might be new subtypes and associated 
specialised methods. I understand that now. Thanks.

On my machine (v0.4.5 Windows), fast() and pretty_fast() seem to run in 
similar time.
So I looked as @code_warntype as Yichao suggested and get the following.
I don't fully know how to interpret output, but return type from the final 
"catchall" evaluate() seems to be inferred/asserted as Float64 (see 
highlighted yellow line below)

Would this explain why pretty_fast() seems to be as efficient as fast()?

Why is the return type being inferred asserted as Float64?


julia> @code_warntype fast(features)
Variables:
  features::Array{Feature,1}
  retval::Float64
  #s1::Int64
  i::Int64

Body:
  begin  # none, line 2:
      retval = 0.0 # none, line 3:
      GenSym(2) = (Base.arraylen)(features::Array{Feature,1})::Int64
      GenSym(0) = $(Expr(:new, UnitRange{Int64}, 1, 
:(((top(getfield))(Base.Intrinsics,:select_value)::I)((Base.sle_int)(1,GenSym(2))::Bool,GenSym(2),(Base.box)
(Int64,(Base.sub_int)(1,1)))::Int64)))
      #s1 = (top(getfield))(GenSym(0),:start)::Int64
      unless (Base.box)(Base.Bool,(Base.not_int)(#s1::Int64 === 
(Base.box)(Base.Int,(Base.add_int)((top(getfield))(GenSym(0),:stop)::Int64,1))::Bool))
 
goto 1
      2:
      GenSym(3) = #s1::Int64
      GenSym(4) = (Base.box)(Base.Int,(Base.add_int)(#s1::Int64,1))
      i = GenSym(3)
      #s1 = GenSym(4) # none, line 4:
      unless 
(Main.isa)((Base.arrayref)(features::Array{Feature,1},i::Int64)::Feature,Main.A)::Bool
 
goto 4 # none, line 5:
      retval = 
(Base.box)(Base.Float64,(Base.add_float)(retval::Float64,1.0))
      goto 5
      4:  # none, line 7:
      retval = 
(Base.box)(Base.Float64,(Base.add_float)(retval::Float64,0.0))
      5:
      3:
      unless 
(Base.box)(Base.Bool,(Base.not_int)((Base.box)(Base.Bool,(Base.not_int)(#s1::Int64
 
=== (Base.box)(Base.Int,(Base.add_int)((top(getfield))(GenSym(0)
,:stop)::Int64,1))::Bool)))) goto 2
      1:
      0:  # none, line 10:
      return retval::Float64
  end::Float64


julia> @code_warntype pretty_fast(features)
Variables:
  features::Array{Feature,1}
  retval::Float64
  #s1::Int64
  i::Int64

Body:
  begin  # none, line 2:
      retval = 0.0 # none, line 3:
      GenSym(2) = (Base.arraylen)(features::Array{Feature,1})::Int64
      GenSym(0) = $(Expr(:new, UnitRange{Int64}, 1, 
:(((top(getfield))(Base.Intrinsics,:select_value)::I)((Base.sle_int)(1,GenSym(2))::Bool,GenSym(2),(Base.box)
(Int64,(Base.sub_int)(1,1)))::Int64)))
      #s1 = (top(getfield))(GenSym(0),:start)::Int64
      unless (Base.box)(Base.Bool,(Base.not_int)(#s1::Int64 === 
(Base.box)(Base.Int,(Base.add_int)((top(getfield))(GenSym(0),:stop)::Int64,1))::Bool))
 
goto 1
      2:
      GenSym(4) = #s1::Int64
      GenSym(5) = (Base.box)(Base.Int,(Base.add_int)(#s1::Int64,1))
      i = GenSym(4)
      #s1 = GenSym(5) # none, line 4:
      unless 
(Main.isa)((Base.arrayref)(features::Array{Feature,1},i::Int64)::Feature,Main.A)::Bool
 
goto 4 # none, line 5:
      retval = 
(Base.box)(Base.Float64,(Base.add_float)(retval::Float64,1.0))
      goto 6
      4:  # none, line 6:
      unless 
(Main.isa)((Base.arrayref)(features::Array{Feature,1},i::Int64)::Feature,Main.B)::Bool
 
goto 5 # none, line 7:
      retval = 
(Base.box)(Base.Float64,(Base.add_float)(retval::Float64,0.0))
      goto 6
      5:  # none, line 9:
      GenSym(3) = 
(Main.evaluate)((Base.arrayref)(features::Array{Feature,1},i::Int64)::Feature)::Float64
      retval = 
(Base.box)(Base.Float64,(Base.add_float)(retval::Float64,GenSym(3)))
      6:
      3:
      unless 
(Base.box)(Base.Bool,(Base.not_int)((Base.box)(Base.Bool,(Base.not_int)(#s1::Int64
 
=== (Base.box)(Base.Int,(Base.add_int)((top(getfield))(GenSym(0)
,:stop)::Int64,1))::Bool)))) goto 2
      1:
      0:  # none, line 12:
      return retval::Float64
  end::Float64

Julia>



On Sunday, April 3, 2016 at 8:04:35 AM UTC+10, Cedric St-Jean wrote:

>
>
> On Saturday, April 2, 2016 at 5:39:45 PM UTC-4, Greg Plowman wrote:
>>
>> Cedric,
>> On my machine fast() and pretty_fast() run in the roughly the same time.
>> Are you sure pre-compiled first?
>>
>
> Yep. I'm using @benchmark on Julia 0.4.5 on OSX, and the time difference 
> is definitely > 2X. 
>  
>
>>  
>> 1. If you add a default fallback method, say, evaluate(f::Feature) = -1.0
>> Theoretically, would that inform the compiler that evaluate() always 
>> returns a Float64 and therefore is type stable?
>>
>
> No, because I might write 
>
> type C <: Feature
> end
> evaluate(f::C) = 100
>
> and a Vector{Feature} might contain a C. Abstract types are open-ended, 
> new types can be created a runtime, so the compiler can't assume anything 
> about them, which is why they're not useful for performance. See here 
> <http://docs.julialang.org/en/release-0.4/manual/performance-tips/#avoid-fields-with-abstract-type>
> .
>
>  
>
>> Does dynamic dispatch mean compiler has to look up method at run time 
>> (because it can't know type ahead of time).
>> This is equivalent to explicitly coding "static dispatch" with if 
>> statements?
>> If so then why is it so much slower?
>> Providing a fallback method as in 1. or explicitly returning Float64 
>> doesn't seem to improve speed, which naively suggests that slowness is from 
>> dynamic dispatch not from type instability???
>>
>> 3. Also, on my machine pretty_fast() is as fast as fast(). Why is this 
>> so, if pretty_fast() is supposedly type unstable?
>>
>> As you can probably tell, I'm pretty confused on this.
>>
>>
>> On Sunday, April 3, 2016 at 6:34:09 AM UTC+10, Cedric St-Jean wrote:
>>
>>> Thank you for the detailed explanation. I tried it out:
>>>
>>> function pretty_fast(features::Vector{Feature})
>>>     retval = 0.0
>>>     for i in 1 : length(features)
>>>         if isa(features[i], A)
>>>             x = evaluate(features[i]::A)
>>>         elseif isa(features[i], B)
>>>             x = evaluate(features[i]::B)
>>>         else
>>>             x = evaluate(features[i])
>>>         end
>>>         retval += x
>>>     end
>>>     retval
>>> end
>>>
>>> On my laptop, fast runs in 10 microseconds, pretty_fast in 30, and slow 
>>> in 210.
>>>
>>> On Saturday, April 2, 2016 at 12:24:18 PM UTC-4, Yichao Yu wrote:
>>>>
>>>> On Sat, Apr 2, 2016 at 12:16 PM, Tim Wheeler <[email protected]> 
>>>> wrote: 
>>>> > Thank you for the comments. In my original code it means the 
>>>> difference 
>>>> > between a 30 min execution with memory allocation in the Gigabytes 
>>>> and a few 
>>>> > seconds of execution with only 800 bytes using the second version. 
>>>> > I thought under-the-hood Julia basically runs those if statements 
>>>> anyway for 
>>>> > its dispatch, and don't know why it needs to allocate any memory. 
>>>> > Having the if-statement workaround will be fine though. 
>>>>
>>>> Well, if you have a lot of these cheap functions being dynamically 
>>>> dispatched I think it is not a good way to use the type. Depending on 
>>>> your problem, you may be better off using a enum/flags/dict to 
>>>> represent the type/get the values. 
>>>>
>>>> The reason for the allocation is that the return type is unknown. It 
>>>> should be obvious to see if you check your code with code_warntype. 
>>>>
>>>> > 
>>>> > On Saturday, April 2, 2016 at 7:26:11 AM UTC-7, Cedric St-Jean wrote: 
>>>> >> 
>>>> >> 
>>>> >>> Therefore there's no way the compiler can rewrite the slow version 
>>>> to the 
>>>> >>> fast version. 
>>>> >> 
>>>> >> 
>>>> >> It knows that the element type is a Feature, so it could produce: 
>>>> >> 
>>>> >> if isa(features[i], A) 
>>>> >>     retval += evaluate(features[i]::A) 
>>>> >> elseif isa(features[i], B) 
>>>> >>     retval += evaluate(features[i]::B) 
>>>> >> else 
>>>> >>     retval += evaluate(features[i]) 
>>>> >> end 
>>>> >> 
>>>> >> and it would make sense for abstract types that have few subtypes. I 
>>>> >> didn't realize that dispatch was an order of magnitude slower than 
>>>> type 
>>>> >> checking. It's easy enough to write a macro generating this 
>>>> expansion, too. 
>>>> >> 
>>>> >> On Saturday, April 2, 2016 at 2:05:20 AM UTC-4, Yichao Yu wrote: 
>>>> >>> 
>>>> >>> On Fri, Apr 1, 2016 at 9:56 PM, Tim Wheeler <[email protected]> 
>>>>
>>>> >>> wrote: 
>>>> >>> > Hello Julia Users. 
>>>> >>> > 
>>>> >>> > I ran into a weird slowdown issue and reproduced a minimal 
>>>> working 
>>>> >>> > example. 
>>>> >>> > Maybe someone can help shed some light. 
>>>> >>> > 
>>>> >>> > abstract Feature 
>>>> >>> > 
>>>> >>> > type A <: Feature end 
>>>> >>> > evaluate(f::A) = 1.0 
>>>> >>> > 
>>>> >>> > type B <: Feature end 
>>>> >>> > evaluate(f::B) = 0.0 
>>>> >>> > 
>>>> >>> > function slow(features::Vector{Feature}) 
>>>> >>> >     retval = 0.0 
>>>> >>> >     for i in 1 : length(features) 
>>>> >>> >         retval += evaluate(features[i]) 
>>>> >>> >     end 
>>>> >>> >     retval 
>>>> >>> > end 
>>>> >>> > 
>>>> >>> > function fast(features::Vector{Feature}) 
>>>> >>> >     retval = 0.0 
>>>> >>> >     for i in 1 : length(features) 
>>>> >>> >         if isa(features[i], A) 
>>>> >>> >             retval += evaluate(features[i]::A) 
>>>> >>> >         else 
>>>> >>> >             retval += evaluate(features[i]::B) 
>>>> >>> >         end 
>>>> >>> >     end 
>>>> >>> >     retval 
>>>> >>> > end 
>>>> >>> > 
>>>> >>> > using ProfileView 
>>>> >>> > 
>>>> >>> > features = Feature[] 
>>>> >>> > for i in 1 : 10000 
>>>> >>> >     push!(features, A()) 
>>>> >>> > end 
>>>> >>> > 
>>>> >>> > slow(features) 
>>>> >>> > @time slow(features) 
>>>> >>> > fast(features) 
>>>> >>> > @time fast(features) 
>>>> >>> > 
>>>> >>> > The output is: 
>>>> >>> > 
>>>> >>> > 0.000136 seconds (10.15 k allocations: 166.417 KB) 
>>>> >>> > 0.000012 seconds (5 allocations: 176 bytes) 
>>>> >>> > 
>>>> >>> > 
>>>> >>> > This is a HUGE difference! Am I missing something big? Is there a 
>>>> good 
>>>> >>> > way 
>>>> >>> > to inspect code to figure out where I am going wrong? 
>>>> >>> 
>>>> >>> This is because of type instability as you will find in the 
>>>> performance 
>>>> >>> tips. 
>>>> >>> Note that slow and fast are not equivalent since the fast version 
>>>> only 
>>>> >>> accept `A` or `B` but the slow version accepts any subtype of 
>>>> feature 
>>>> >>> that you may ever define. Therefore there's no way the compiler can 
>>>> >>> rewrite the slow version to the fast version. 
>>>> >>> There are optimizations that can be applied to bring down the gap 
>>>> but 
>>>> >>> there'll always be a large difference between the two. 
>>>> >>> 
>>>> >>> > 
>>>> >>> > 
>>>> >>> > Thank you in advance for any guidance. 
>>>> >>> > 
>>>> >>> > 
>>>> >>> > -Tim 
>>>> >>> > 
>>>> >>> > 
>>>> >>> > 
>>>> >>> > 
>>>> >>> > 
>>>>
>>>

Reply via email to