Thank you for the link and the explanation, John -- it's definitely helpful. Is current work with Nullable and data structures available anywhere in JuliaStats, or is it being developed elsewhere?
On Saturday, May 30, 2015 at 12:23:09 PM UTC-4, John Myles White wrote:
>
> David,
>
> To clarify your understanding of what's wrong with DataArrays, check out
> the DataArray code for something like getindex():
> https://github.com/JuliaStats/DataArrays.jl/blob/master/src/indexing.jl#L109
>
> I don't have a full understanding of Julia's type inference system, but
> here's my best attempt to explain my current understanding of the system
> and how it affects Seth's original example.
>
> Consider two simple functions, f and g, and their application inside a
> larger function, gf():
>
> # Given pre-existing definitions such that:
> #
> # f(input::R) => output::S
> # g(input::S) => output::T
> #
> # What can we infer about the following larger function?
> function gf(x::Any)
>     return g(f(x))
> end
>
> The important questions to ask are about what we can infer at
> method-compile-time for gf(). Specifically, ask:
>
> (1) Can we determine the type S given the type R, which is currently bound
> to the type of the specific value of x on which we called gf()? (Note that
> it was the act of calling gf(x) on a specific value that triggered the
> entire method-compilation process.)
>
> (2) Can we determine that the type S is a specific concrete type?
> Concreteness matters, because we're going to have to think about how the
> output of f() affects the input of g(). In particular, we need to know
> whether we need to perform run-time dispatch inside of gf() or whether all
> dispatch inside of gf() can be determined statically given the type of
> gf()'s argument x.
>
> (3) Assuming that we successfully determined a concrete type S given R,
> can we repeat the process for g() to yield a concrete type for T?
> If so, then we'll be able to infer, at least for one specific type R, the
> concrete output type of gf(x). If not, we'll have to give looser bounds on
> the concrete types that come out of gf() given an input of a specific value
> like our current x. That would be important if we were going to call gf()
> inside another function.
>
> Hope that helps.
>
> -- John
>
> On Saturday, May 30, 2015 at 4:51:09 AM UTC-7, David Gold wrote:
>>
>> @Steven,
>>
>> Would you help me to understand the difference between this case here and
>> the case of DataArray{T}s -- which, by my understanding, are basically
>> AbstractArray{Union{T, NA}, 1}s? My first thought was that taking a
>> Union{Bool, AbstractArray{Float64, 2}} argument would potentially
>> interfere with the compiler's ability to perform type inference, similar
>> to how looping through a DataArray can incur a cost from the compiler
>> having to deal with possible NAs.
>>
>> But what you're saying is that this does not apply here, since presumably
>> the argument, whether it is a Bool or an AbstractArray, would be
>> type-stable throughout the function's operations -- unlike the values
>> contained in a DataArray. Would it be fair to say that dealing with Union
>> types tends to hurt performance mostly when they are looped over in some
>> sort of container, since in that case it's not a matter of simply
>> dispatching a specially compiled method on one of the union's member
>> types or the other?
>>
>> On Friday, May 29, 2015 at 9:49:45 PM UTC-4, Steven G. Johnson wrote:
>>>
>>> *No!* This is one of the most common misconceptions about Julia
>>> programming.
>>>
>>> The type declarations in function arguments have *no impact* on
>>> performance. Zero. Nada. Zip. You *don't have to declare a type at all*
>>> in the function argument, and it *still* won't matter for performance.
>>>
>>> The argument types are just a filter for when the function is applicable.
>>>
>>> The first time a function is called, a specialized version is compiled
>>> for the types of the arguments that you pass it. Subsequently, when you
>>> call it with arguments of the same type, the specialized version is
>>> called.
>>>
>>> Note also that a default argument foo(x, y=false) is exactly equivalent
>>> to defining
>>>
>>> foo(x, y) = ...
>>> foo(x) = foo(x, false)
>>>
>>> So, if you call foo(x, [1,2,3]), it calls a version of foo(x, y)
>>> specialized for an Array{Int} in the second argument. The existence of a
>>> version of foo specialized for a boolean y is irrelevant.
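For what it's worth, John's gf() example above can be checked concretely at the REPL. This is a sketch of my own -- the definitions of f, g, and f2 below, and the use of Base.return_types to inspect inference, are my illustration, not code from the thread:

```julia
# Type-stable pipeline: R = Int, S = Int, T = Int are all inferable,
# so all three of John's questions get a "yes".
f(x::Int) = x + 1          # f: Int -> Int
g(x::Int) = 2 * x          # g: Int -> Int
gf(x) = g(f(x))

# Inference proves a single concrete output type for gf(::Int).
Base.return_types(gf, (Int,))   # one-element vector containing Int

# Now break question (2): f2's output type depends on the *value* of x,
# so S is only the abstract Union{Int, Float64}, and inference for the
# composition has to widen to a union rather than a concrete type.
f2(x::Int) = x > 0 ? 1 : 1.0
gf2(x) = 2 * f2(x)
Base.return_types(gf2, (Int,))  # Union{Float64, Int}, not concrete
```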
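Steven's point, and the distinction David asks about, can also be demonstrated directly. The functions below (scale, total) are my own hypothetical examples, not from the thread:

```julia
# The Union in the signature is only a filter on which calls are accepted;
# each concrete argument type still gets its own specialized compiled method,
# so there is no run-time type uncertainty inside the function body.
scale(x::Union{Bool, Matrix{Float64}}) = isa(x, Bool) ? 1.0 : sum(x)

scale(true)        # runs the version specialized for Bool
scale(ones(2, 2))  # runs the version specialized for Matrix{Float64}

# By contrast, the cost David describes arises when the *elements* of a
# container have a Union type: the element type cannot be resolved until
# each iteration, at run time, rather than once per call.
function total(v)
    s = 0.0
    for x in v      # with a Union element type, each x is checked per iteration
        s += x
    end
    return s
end

xs = Union{Int, Float64}[1, 2.0, 3]
total(xs)
```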
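And Steven's point about default arguments is easy to see in the method table -- a quick sketch, with a placeholder body of my own for foo:

```julia
# foo(x, y=false) lowers to exactly two methods: the full two-argument
# method and a one-argument forwarder that supplies the default.
foo(x, y=false) = (x, y)

length(methods(foo))   # 2: foo(x, y) and foo(x)

foo(1)                 # the one-argument method just forwards the default
foo(1, [1, 2, 3])      # uses a version of foo(x, y) specialized for
                       # Vector{Int} in the second argument; the Bool
                       # default plays no role in this call
```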
