[julia-users] Re: Does union() imply worse performance?
Steven, Thanks. Is it then the recommended/usual practice to have one main function with every possible argument you might want, and then several methods that provide specific dispatch and pass just the arguments relevant to that method? That is, function dijkstra_shortest_paths(graph, use_dists, edge_dists) ... # this is where the algorithm is implemented dijkstra_shortest_paths(graph, edge_dists::AbstractArray{Float64,2}) = dijkstra_shortest_paths(graph, true, edge_dists) dijkstra_shortest_paths(graph) = dijkstra_shortest_paths(graph, false, Array{Float64,2}()) ? On Friday, May 29, 2015 at 6:49:45 PM UTC-7, Steven G. Johnson wrote: *No!* This is one of the most common misconceptions about Julia programming. The type declarations in function arguments have *no impact* on performance. Zero. Nada. Zip. You *don't have to declare a type at all* in the function argument, and it *still* won't matter for performance. The argument types are just a filter for when the function is applicable. The first time a function is called, a specialized version is compiled for the types of the arguments that you pass it. Subsequently, when you call it with arguments of the same type, the specialized version is called. Note also that a default argument foo(x, y=false) is exactly equivalent to defining foo(x,y) = ... foo(x) = foo(x, false) So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) specialized for an Array{Int} in the second argument. The existence of a version of foo specialized for a boolean y is irrelevant.
[julia-users] Re: Does union() imply worse performance?
The NullableArrays work is very far behind schedule. I developed RSI right after announcing the work on NullableArrays and am still recovering, which means that I can spend very little time working on Julia code these days. I'll give you more details offline. -- John On Saturday, May 30, 2015 at 10:48:10 AM UTC-7, David Gold wrote: Thank you for the link and the explanation, John -- it's definitely helpful. Is current work with Nullable and data structures available anywhere in JuliaStats, or is it being developed elsewhere? On Saturday, May 30, 2015 at 12:23:09 PM UTC-4, John Myles White wrote: David, To clarify your understanding of what's wrong with DataArrays, check out the DataArray code for something like getindex(): https://github.com/JuliaStats/DataArrays.jl/blob/master/src/indexing.jl#L109 https://www.google.com/url?q=https%3A%2F%2Fgithub.com%2FJuliaStats%2FDataArrays.jl%2Fblob%2Fmaster%2Fsrc%2Findexing.jl%23L109sa=Dsntz=1usg=AFQjCNHy0P-zaAlH7SKIUtbSOUgb1zfpcw I don't have a full understanding of Julia's type inference system, but here's my best attempt to explain my current understanding of the system and how it affects Seth's original example. Consider two simple functions, f and g, and their application inside a larger function, gf(): # Given pre-existing definitions such that: # # f(input::R) = output::S # g(input::S) = output::T # # What can we infer about the following larger function? function gf(x::Any) return g(f(x)) end The important questions to ask are about what we can infer at method-compile-time for gf(). Specifically, ask: (1) Can we determine the type S given the type R, which is currently bound to the type of the specific value of x on which we called gf()? (Note that it was the act of calling gf(x) on a specific value that triggered the entire method-compilation process.) (2) Can we determine that the type S is a specific concrete type? Concreteness matters, because we're going to have to think about how the output of f() affects the input of g(). In particular, we need to know whether we need to perform run-time dispatch inside of gf() or whether all dispatch inside of gf() can be determined statically given the type of gf()'s argument x. (3) Assuming that we successfully determined a concrete type S given R, can we repeat the process for g() to yield a concrete type for T? If so, then we'll be able to infer, at least for one specific type R, the concrete output type of gf(x). If not, we'll have to give looser bounds on the concrete types that come out of gf() given an input of a specific value like our current x. That would be important if we were going to call gf() inside another function. Hope that helps. -- John On Saturday, May 30, 2015 at 4:51:09 AM UTC-7, David Gold wrote: @Steven, Would you help me to understand the difference between this case here and the case of DataArray{T}s -- which, by my understanding, are basically AbstractArray{Union{T, NaN}, 1}'s? My first thought was that taking a Union{Bool, AbstractArray{Float, 2}} argument would potentially interfere with the compiler's ability to perform type inference, similar to how looping through a DataArray can experience a cost from the compiler having to deal with possible NaNs. But what you're saying is that this does not apply here, since presumably the argument, whether it is a Bool or an AbstractArray, would be type-stable throughout the functions operations -- unlike the values contained in a DataArray. Would it be fair to say that dealing with Union{} types tends to be dangerous to performance mostly when they are looped over in some sort of container, since in that case it's not a matter of simply dispatching a specially compiled method on one of the conjunct types or the other? On Friday, May 29, 2015 at 9:49:45 PM UTC-4, Steven G. Johnson wrote: *No!* This is one of the most common misconceptions about Julia programming. The type declarations in function arguments have *no impact* on performance. Zero. Nada. Zip. You *don't have to declare a type at all* in the function argument, and it *still* won't matter for performance. The argument types are just a filter for when the function is applicable. The first time a function is called, a specialized version is compiled for the types of the arguments that you pass it. Subsequently, when you call it with arguments of the same type, the specialized version is called. Note also that a default argument foo(x, y=false) is exactly equivalent to defining foo(x,y) = ... foo(x) = foo(x, false) So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) specialized for an Array{Int} in the second argument. The existence of a version of foo specialized for a boolean y is irrelevant.
[julia-users] Re: Does union() imply worse performance?
Thank you for the link and the explanation, John -- it's definitely helpful. Is current work with Nullable and data structures available anywhere in JuliaStats, or is it being developed elsewhere? On Saturday, May 30, 2015 at 12:23:09 PM UTC-4, John Myles White wrote: David, To clarify your understanding of what's wrong with DataArrays, check out the DataArray code for something like getindex(): https://github.com/JuliaStats/DataArrays.jl/blob/master/src/indexing.jl#L109 https://www.google.com/url?q=https%3A%2F%2Fgithub.com%2FJuliaStats%2FDataArrays.jl%2Fblob%2Fmaster%2Fsrc%2Findexing.jl%23L109sa=Dsntz=1usg=AFQjCNHy0P-zaAlH7SKIUtbSOUgb1zfpcw I don't have a full understanding of Julia's type inference system, but here's my best attempt to explain my current understanding of the system and how it affects Seth's original example. Consider two simple functions, f and g, and their application inside a larger function, gf(): # Given pre-existing definitions such that: # # f(input::R) = output::S # g(input::S) = output::T # # What can we infer about the following larger function? function gf(x::Any) return g(f(x)) end The important questions to ask are about what we can infer at method-compile-time for gf(). Specifically, ask: (1) Can we determine the type S given the type R, which is currently bound to the type of the specific value of x on which we called gf()? (Note that it was the act of calling gf(x) on a specific value that triggered the entire method-compilation process.) (2) Can we determine that the type S is a specific concrete type? Concreteness matters, because we're going to have to think about how the output of f() affects the input of g(). In particular, we need to know whether we need to perform run-time dispatch inside of gf() or whether all dispatch inside of gf() can be determined statically given the type of gf()'s argument x. (3) Assuming that we successfully determined a concrete type S given R, can we repeat the process for g() to yield a concrete type for T? If so, then we'll be able to infer, at least for one specific type R, the concrete output type of gf(x). If not, we'll have to give looser bounds on the concrete types that come out of gf() given an input of a specific value like our current x. That would be important if we were going to call gf() inside another function. Hope that helps. -- John On Saturday, May 30, 2015 at 4:51:09 AM UTC-7, David Gold wrote: @Steven, Would you help me to understand the difference between this case here and the case of DataArray{T}s -- which, by my understanding, are basically AbstractArray{Union{T, NaN}, 1}'s? My first thought was that taking a Union{Bool, AbstractArray{Float, 2}} argument would potentially interfere with the compiler's ability to perform type inference, similar to how looping through a DataArray can experience a cost from the compiler having to deal with possible NaNs. But what you're saying is that this does not apply here, since presumably the argument, whether it is a Bool or an AbstractArray, would be type-stable throughout the functions operations -- unlike the values contained in a DataArray. Would it be fair to say that dealing with Union{} types tends to be dangerous to performance mostly when they are looped over in some sort of container, since in that case it's not a matter of simply dispatching a specially compiled method on one of the conjunct types or the other? On Friday, May 29, 2015 at 9:49:45 PM UTC-4, Steven G. Johnson wrote: *No!* This is one of the most common misconceptions about Julia programming. The type declarations in function arguments have *no impact* on performance. Zero. Nada. Zip. You *don't have to declare a type at all* in the function argument, and it *still* won't matter for performance. The argument types are just a filter for when the function is applicable. The first time a function is called, a specialized version is compiled for the types of the arguments that you pass it. Subsequently, when you call it with arguments of the same type, the specialized version is called. Note also that a default argument foo(x, y=false) is exactly equivalent to defining foo(x,y) = ... foo(x) = foo(x, false) So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) specialized for an Array{Int} in the second argument. The existence of a version of foo specialized for a boolean y is irrelevant.
[julia-users] Re: Does union() imply worse performance?
There is one exception though, which is keyword arguments Am Samstag, 30. Mai 2015 03:49:45 UTC+2 schrieb Steven G. Johnson: *No!* This is one of the most common misconceptions about Julia programming. The type declarations in function arguments have *no impact* on performance. Zero. Nada. Zip. You *don't have to declare a type at all* in the function argument, and it *still* won't matter for performance. The argument types are just a filter for when the function is applicable. The first time a function is called, a specialized version is compiled for the types of the arguments that you pass it. Subsequently, when you call it with arguments of the same type, the specialized version is called. Note also that a default argument foo(x, y=false) is exactly equivalent to defining foo(x,y) = ... foo(x) = foo(x, false) So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) specialized for an Array{Int} in the second argument. The existence of a version of foo specialized for a boolean y is irrelevant.
Re: [julia-users] Re: Does union() imply worse performance?
Yes thats true but thats the future and currently not in stable Julia. Am Samstag, 30. Mai 2015 12:39:10 UTC+2 schrieb Jiahao Chen: For this use case of optionally present data, Nullable would seem appropriate (although this is 0.4-only). http://julia.readthedocs.org/en/latest/manual/types/#nullable-types-representing-missing-values Thanks, Jiahao Chen Research Scientist MIT CSAIL On Sat, May 30, 2015 at 2:08 PM, Tobias Knopp tobias...@googlemail.com javascript: wrote: There is one exception though, which is keyword arguments Am Samstag, 30. Mai 2015 03:49:45 UTC+2 schrieb Steven G. Johnson: *No!* This is one of the most common misconceptions about Julia programming. The type declarations in function arguments have *no impact* on performance. Zero. Nada. Zip. You *don't have to declare a type at all* in the function argument, and it *still* won't matter for performance. The argument types are just a filter for when the function is applicable. The first time a function is called, a specialized version is compiled for the types of the arguments that you pass it. Subsequently, when you call it with arguments of the same type, the specialized version is called. Note also that a default argument foo(x, y=false) is exactly equivalent to defining foo(x,y) = ... foo(x) = foo(x, false) So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) specialized for an Array{Int} in the second argument. The existence of a version of foo specialized for a boolean y is irrelevant.
Re: [julia-users] Re: Does union() imply worse performance?
https://github.com/JuliaLang/Compat.jl#new-types --Tim On Saturday, May 30, 2015 04:50:07 AM Tobias Knopp wrote: Yes thats true but thats the future and currently not in stable Julia. Am Samstag, 30. Mai 2015 12:39:10 UTC+2 schrieb Jiahao Chen: For this use case of optionally present data, Nullable would seem appropriate (although this is 0.4-only). http://julia.readthedocs.org/en/latest/manual/types/#nullable-types-repres enting-missing-values Thanks, Jiahao Chen Research Scientist MIT CSAIL On Sat, May 30, 2015 at 2:08 PM, Tobias Knopp tobias...@googlemail.com javascript: wrote: There is one exception though, which is keyword arguments Am Samstag, 30. Mai 2015 03:49:45 UTC+2 schrieb Steven G. Johnson: *No!* This is one of the most common misconceptions about Julia programming. The type declarations in function arguments have *no impact* on performance. Zero. Nada. Zip. You *don't have to declare a type at all* in the function argument, and it *still* won't matter for performance. The argument types are just a filter for when the function is applicable. The first time a function is called, a specialized version is compiled for the types of the arguments that you pass it. Subsequently, when you call it with arguments of the same type, the specialized version is called. Note also that a default argument foo(x, y=false) is exactly equivalent to defining foo(x,y) = ... foo(x) = foo(x, false) So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) specialized for an Array{Int} in the second argument. The existence of a version of foo specialized for a boolean y is irrelevant.
[julia-users] Re: Does union() imply worse performance?
@Steven, Would you help me to understand the difference between this case here and the case of DataArray{T}s -- which, by my understanding, are basically AbstractArray{Union{T, NaN}, 1}'s? My first thought was that taking a Union{Bool, AbstractArray{Float, 2}} argument would potentially interfere with the compiler's ability to perform type inference, similar to how looping through a DataArray can experience a cost from the compiler having to deal with possible NaNs. But what you're saying is that this does not apply here, since presumably the argument, whether it is a Bool or an AbstractArray, would be type-stable throughout the functions operations -- unlike the values contained in a DataArray. Would it be fair to say that dealing with Union{} types tends to be dangerous to performance mostly when they are looped over in some sort of container, since in that case it's not a matter of simply dispatching a specially compiled method on one of the conjunct types or the other? On Friday, May 29, 2015 at 9:49:45 PM UTC-4, Steven G. Johnson wrote: *No!* This is one of the most common misconceptions about Julia programming. The type declarations in function arguments have *no impact* on performance. Zero. Nada. Zip. You *don't have to declare a type at all* in the function argument, and it *still* won't matter for performance. The argument types are just a filter for when the function is applicable. The first time a function is called, a specialized version is compiled for the types of the arguments that you pass it. Subsequently, when you call it with arguments of the same type, the specialized version is called. Note also that a default argument foo(x, y=false) is exactly equivalent to defining foo(x,y) = ... foo(x) = foo(x, false) So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) specialized for an Array{Int} in the second argument. The existence of a version of foo specialized for a boolean y is irrelevant.
Re: [julia-users] Re: Does union() imply worse performance?
For this use case of optionally present data, Nullable would seem appropriate (although this is 0.4-only). http://julia.readthedocs.org/en/latest/manual/types/#nullable-types-representing-missing-values Thanks, Jiahao Chen Research Scientist MIT CSAIL On Sat, May 30, 2015 at 2:08 PM, Tobias Knopp tobias.kn...@googlemail.com wrote: There is one exception though, which is keyword arguments Am Samstag, 30. Mai 2015 03:49:45 UTC+2 schrieb Steven G. Johnson: *No!* This is one of the most common misconceptions about Julia programming. The type declarations in function arguments have *no impact* on performance. Zero. Nada. Zip. You *don't have to declare a type at all* in the function argument, and it *still* won't matter for performance. The argument types are just a filter for when the function is applicable. The first time a function is called, a specialized version is compiled for the types of the arguments that you pass it. Subsequently, when you call it with arguments of the same type, the specialized version is called. Note also that a default argument foo(x, y=false) is exactly equivalent to defining foo(x,y) = ... foo(x) = foo(x, false) So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) specialized for an Array{Int} in the second argument. The existence of a version of foo specialized for a boolean y is irrelevant.
[julia-users] Re: Does union() imply worse performance?
David, To clarify your understanding of what's wrong with DataArrays, check out the DataArray code for something like getindex(): https://github.com/JuliaStats/DataArrays.jl/blob/master/src/indexing.jl#L109 I don't have a full understanding of Julia's type inference system, but here's my best attempt to explain my current understanding of the system and how it affects Seth's original example. Consider two simple functions, f and g, and their application inside a larger function, gf(): # Given pre-existing definitions such that: # # f(input::R) = output::S # g(input::S) = output::T # # What can we infer about the following larger function? function gf(x::Any) return g(f(x)) end The important questions to ask are about what we can infer at method-compile-time for gf(). Specifically, ask: (1) Can we determine the type S given the type R, which is currently bound to the type of the specific value of x on which we called gf()? (Note that it was the act of calling gf(x) on a specific value that triggered the entire method-compilation process.) (2) Can we determine that the type S is a specific concrete type? Concreteness matters, because we're going to have to think about how the output of f() affects the input of g(). In particular, we need to know whether we need to perform run-time dispatch inside of gf() or whether all dispatch inside of gf() can be determined statically given the type of gf()'s argument x. (3) Assuming that we successfully determined a concrete type S given R, can we repeat the process for g() to yield a concrete type for T? If so, then we'll be able to infer, at least for one specific type R, the concrete output type of gf(x). If not, we'll have to give looser bounds on the concrete types that come out of gf() given an input of a specific value like our current x. That would be important if we were going to call gf() inside another function. Hope that helps. -- John On Saturday, May 30, 2015 at 4:51:09 AM UTC-7, David Gold wrote: @Steven, Would you help me to understand the difference between this case here and the case of DataArray{T}s -- which, by my understanding, are basically AbstractArray{Union{T, NaN}, 1}'s? My first thought was that taking a Union{Bool, AbstractArray{Float, 2}} argument would potentially interfere with the compiler's ability to perform type inference, similar to how looping through a DataArray can experience a cost from the compiler having to deal with possible NaNs. But what you're saying is that this does not apply here, since presumably the argument, whether it is a Bool or an AbstractArray, would be type-stable throughout the functions operations -- unlike the values contained in a DataArray. Would it be fair to say that dealing with Union{} types tends to be dangerous to performance mostly when they are looped over in some sort of container, since in that case it's not a matter of simply dispatching a specially compiled method on one of the conjunct types or the other? On Friday, May 29, 2015 at 9:49:45 PM UTC-4, Steven G. Johnson wrote: *No!* This is one of the most common misconceptions about Julia programming. The type declarations in function arguments have *no impact* on performance. Zero. Nada. Zip. You *don't have to declare a type at all* in the function argument, and it *still* won't matter for performance. The argument types are just a filter for when the function is applicable. The first time a function is called, a specialized version is compiled for the types of the arguments that you pass it. Subsequently, when you call it with arguments of the same type, the specialized version is called. Note also that a default argument foo(x, y=false) is exactly equivalent to defining foo(x,y) = ... foo(x) = foo(x, false) So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) specialized for an Array{Int} in the second argument. The existence of a version of foo specialized for a boolean y is irrelevant.
[julia-users] Re: Does union() imply worse performance?
*No!* This is one of the most common misconceptions about Julia programming. The type declarations in function arguments have *no impact* on performance. Zero. Nada. Zip. You *don't have to declare a type at all* in the function argument, and it *still* won't matter for performance. The argument types are just a filter for when the function is applicable. The first time a function is called, a specialized version is compiled for the types of the arguments that you pass it. Subsequently, when you call it with arguments of the same type, the specialized version is called. Note also that a default argument foo(x, y=false) is exactly equivalent to defining foo(x,y) = ... foo(x) = foo(x, false) So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) specialized for an Array{Int} in the second argument. The existence of a version of foo specialized for a boolean y is irrelevant.
[julia-users] Re: Does union() imply worse performance?
I'm pretty sure yes. And it seems like a little strange thing to do.. You can use the code_native function to see the code and I did something wrong so I'm not sure I confirmed, but you could try out with and without bool (and Union). If you get lots of code then the length (longer) should be a hint.. Another thing I recall from the (performance) docs. You may want to check the bool and do one thing and then call another that does your array work that doesn't take a union. I think that applies here. I've read a lot on Julia - but kind of a newbie.. so to not just trust me :) Someone might also confirm.. When I first read this, I thought the bool would be for for each individual value of the array. That would also be slow.. Often a NaN or other value is used it that case. Reserving a value would be slow and slower than NaN. Could you do the same as you are doing and have an array of only NaN be your signal? That could be just a 1x1 array.. but be sure to test if there are exceptions.. Anyway all NaN array is probably never helpful for you? -- Palli. On Friday, May 29, 2015 at 7:50:26 PM UTC, Seth wrote: I have an function that takes an optional (potentially very large) matrix (AbstractArray{Float64,2}). I'd like to create a parameter to the function that's EITHER this array, or a boolean (false). If I define edge_dists::Union(Bool, AbstractArray{Float64,2}) = false in my function declaration, would that cause significant performance penalties when accessing the parameter?