[julia-users] Re: Does union() imply worse performance?

2015-05-30 Thread Seth
Steven,

Thanks. Is it then the recommended/usual practice to have one main 
function with every possible argument you might want, and then several 
methods that provide specific dispatch and pass just the arguments relevant 
to that method? That is,

function dijkstra_shortest_paths(graph, use_dists, edge_dists) ... # this is where the algorithm is implemented

dijkstra_shortest_paths(graph, edge_dists::AbstractArray{Float64,2}) =
    dijkstra_shortest_paths(graph, true, edge_dists)
dijkstra_shortest_paths(graph) =
    dijkstra_shortest_paths(graph, false, zeros(Float64, 0, 0))

?

On Friday, May 29, 2015 at 6:49:45 PM UTC-7, Steven G. Johnson wrote:

 *No!*  This is one of the most common misconceptions about Julia 
 programming.

 The type declarations in function arguments have *no impact* on 
 performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at 
 all* in the function argument, and it *still* won't matter for 
 performance.

 The argument types are just a filter for when the function is applicable.

 The first time a function is called, a specialized version is compiled for 
 the types of the arguments that you pass it.  Subsequently, when you call 
 it with arguments of the same type, the specialized version is called.

 Note also that a default argument foo(x, y=false) is exactly equivalent to 
 defining

 foo(x,y) = ...
 foo(x) = foo(x, false)

 So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) 
 specialized for an Array{Int} in the second argument.  The existence of a 
 version of foo specialized for a boolean y is irrelevant.



[julia-users] Re: Does union() imply worse performance?

2015-05-30 Thread John Myles White
The NullableArrays work is very far behind schedule. I developed RSI right 
after announcing the work on NullableArrays and am still recovering, which 
means that I can spend very little time working on Julia code these days.

I'll give you more details offline.

 -- John

On Saturday, May 30, 2015 at 10:48:10 AM UTC-7, David Gold wrote:

 Thank you for the link and the explanation, John -- it's definitely 
 helpful. Is current work with Nullable and data structures available 
 anywhere in JuliaStats, or is it being developed elsewhere?

 On Saturday, May 30, 2015 at 12:23:09 PM UTC-4, John Myles White wrote:

 David,

 To clarify your understanding of what's wrong with DataArrays, check out 
 the DataArray code for something like getindex(): 
 https://github.com/JuliaStats/DataArrays.jl/blob/master/src/indexing.jl#L109 

 I don't have a full understanding of Julia's type inference system, but 
 here's my best attempt to explain my current understanding of the system 
 and how it affects Seth's original example.

 Consider two simple functions, f and g, and their application inside a 
 larger function, gf():

 # Given pre-existing definitions such that:
 #
 # f(input::R) = output::S
 # g(input::S) = output::T
 #
 # What can we infer about the following larger function?
 function gf(x::Any)
 return g(f(x))
 end

 The important questions to ask are about what we can infer at 
 method-compile-time for gf(). Specifically, ask:

 (1) Can we determine the type S given the type R, which is currently 
 bound to the type of the specific value of x on which we called gf()? (Note 
 that it was the act of calling gf(x) on a specific value that triggered the 
 entire method-compilation process.)

 (2) Can we determine that the type S is a specific concrete type? 
 Concreteness matters, because we're going to have to think about how the 
 output of f() affects the input of g(). In particular, we need to know 
 whether we need to perform run-time dispatch inside of gf() or whether all 
 dispatch inside of gf() can be determined statically given the type of 
 gf()'s argument x.

 (3) Assuming that we successfully determined a concrete type S given R, 
 can we repeat the process for g() to yield a concrete type for T? If so, 
 then we'll be able to infer, at least for one specific type R, the concrete 
 output type of gf(x). If not, we'll have to give looser bounds on the 
 concrete types that come out of gf() given an input of a specific value 
 like our current x. That would be important if we were going to call gf() 
 inside another function.

 Hope that helps.

  -- John

 On Saturday, May 30, 2015 at 4:51:09 AM UTC-7, David Gold wrote:

 @Steven,

 Would you help me to understand the difference between this case here 
 and the case of DataArray{T}s -- which, by my understanding, are basically 
 AbstractArray{Union{T, NaN}, 1}'s? My first thought was that taking a 
 Union{Bool, AbstractArray{Float64, 2}} argument would potentially interfere 
 with the compiler's ability to perform type inference, similar to how 
 looping through a DataArray can experience a cost from the compiler having 
 to deal with possible NaNs. 

 But what you're saying is that this does not apply here, since 
 presumably the argument, whether it is a Bool or an AbstractArray, would be 
 type-stable throughout the function's operations -- unlike the values 
 contained in a DataArray. Would it be fair to say that dealing with Union{} 
 types tends to be dangerous to performance mostly when they are looped over 
 in some sort of container, since in that case it's not a matter of simply 
 dispatching a specially compiled method on one of the union's member types or the 
 other?

 On Friday, May 29, 2015 at 9:49:45 PM UTC-4, Steven G. Johnson wrote:

 *No!*  This is one of the most common misconceptions about Julia 
 programming.

 The type declarations in function arguments have *no impact* on 
 performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at 
 all* in the function argument, and it *still* won't matter for 
 performance.

 The argument types are just a filter for when the function is 
 applicable.

 The first time a function is called, a specialized version is compiled 
 for the types of the arguments that you pass it.  Subsequently, when you 
 call it with arguments of the same type, the specialized version is called.

 Note also that a default argument foo(x, y=false) is exactly equivalent 
 to defining

 foo(x,y) = ...
 foo(x) = foo(x, false)

 So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) 
 specialized for an Array{Int} in the second argument.  The existence of a 
 version of foo specialized for a boolean y is irrelevant.



[julia-users] Re: Does union() imply worse performance?

2015-05-30 Thread David Gold
Thank you for the link and the explanation, John -- it's definitely 
helpful. Is current work with Nullable and data structures available 
anywhere in JuliaStats, or is it being developed elsewhere?

On Saturday, May 30, 2015 at 12:23:09 PM UTC-4, John Myles White wrote:

 David,

 To clarify your understanding of what's wrong with DataArrays, check out 
 the DataArray code for something like getindex(): 
 https://github.com/JuliaStats/DataArrays.jl/blob/master/src/indexing.jl#L109 

 I don't have a full understanding of Julia's type inference system, but 
 here's my best attempt to explain my current understanding of the system 
 and how it affects Seth's original example.

 Consider two simple functions, f and g, and their application inside a 
 larger function, gf():

 # Given pre-existing definitions such that:
 #
 # f(input::R) = output::S
 # g(input::S) = output::T
 #
 # What can we infer about the following larger function?
 function gf(x::Any)
 return g(f(x))
 end

 The important questions to ask are about what we can infer at 
 method-compile-time for gf(). Specifically, ask:

 (1) Can we determine the type S given the type R, which is currently bound 
 to the type of the specific value of x on which we called gf()? (Note that 
 it was the act of calling gf(x) on a specific value that triggered the 
 entire method-compilation process.)

 (2) Can we determine that the type S is a specific concrete type? 
 Concreteness matters, because we're going to have to think about how the 
 output of f() affects the input of g(). In particular, we need to know 
 whether we need to perform run-time dispatch inside of gf() or whether all 
 dispatch inside of gf() can be determined statically given the type of 
 gf()'s argument x.

 (3) Assuming that we successfully determined a concrete type S given R, 
 can we repeat the process for g() to yield a concrete type for T? If so, 
 then we'll be able to infer, at least for one specific type R, the concrete 
 output type of gf(x). If not, we'll have to give looser bounds on the 
 concrete types that come out of gf() given an input of a specific value 
 like our current x. That would be important if we were going to call gf() 
 inside another function.

 Hope that helps.

  -- John

 On Saturday, May 30, 2015 at 4:51:09 AM UTC-7, David Gold wrote:

 @Steven,

 Would you help me to understand the difference between this case here and 
 the case of DataArray{T}s -- which, by my understanding, are basically 
 AbstractArray{Union{T, NaN}, 1}'s? My first thought was that taking a 
 Union{Bool, AbstractArray{Float64, 2}} argument would potentially interfere 
 with the compiler's ability to perform type inference, similar to how 
 looping through a DataArray can experience a cost from the compiler having 
 to deal with possible NaNs. 

 But what you're saying is that this does not apply here, since presumably 
 the argument, whether it is a Bool or an AbstractArray, would be 
 type-stable throughout the function's operations -- unlike the values 
 contained in a DataArray. Would it be fair to say that dealing with Union{} 
 types tends to be dangerous to performance mostly when they are looped over 
 in some sort of container, since in that case it's not a matter of simply 
 dispatching a specially compiled method on one of the union's member types or the 
 other?

 On Friday, May 29, 2015 at 9:49:45 PM UTC-4, Steven G. Johnson wrote:

 *No!*  This is one of the most common misconceptions about Julia 
 programming.

 The type declarations in function arguments have *no impact* on 
 performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at 
 all* in the function argument, and it *still* won't matter for 
 performance.

 The argument types are just a filter for when the function is applicable.

 The first time a function is called, a specialized version is compiled 
 for the types of the arguments that you pass it.  Subsequently, when you 
 call it with arguments of the same type, the specialized version is called.

 Note also that a default argument foo(x, y=false) is exactly equivalent 
 to defining

 foo(x,y) = ...
 foo(x) = foo(x, false)

 So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) 
 specialized for an Array{Int} in the second argument.  The existence of a 
 version of foo specialized for a boolean y is irrelevant.



[julia-users] Re: Does union() imply worse performance?

2015-05-30 Thread Tobias Knopp
There is one exception though, which is keyword arguments
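
Roughly (a made-up example, going from my recollection of the performance docs): methods are not specialized on the types of keyword arguments the way they are on positional arguments, so annotating a keyword argument's type can actually matter:

# Without an annotation, the compiler can't assume anything about `scale`
# inside the body, since it won't specialize on the keyword's runtime type.
foo_kw(x; scale=1.0) = x * scale

# With an annotation, the body can be compiled knowing scale::Float64.
foo_kw_typed(x; scale::Float64=1.0) = x * scale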

Am Samstag, 30. Mai 2015 03:49:45 UTC+2 schrieb Steven G. Johnson:

 *No!*  This is one of the most common misconceptions about Julia 
 programming.

 The type declarations in function arguments have *no impact* on 
 performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at 
 all* in the function argument, and it *still* won't matter for 
 performance.

 The argument types are just a filter for when the function is applicable.

 The first time a function is called, a specialized version is compiled for 
 the types of the arguments that you pass it.  Subsequently, when you call 
 it with arguments of the same type, the specialized version is called.

 Note also that a default argument foo(x, y=false) is exactly equivalent to 
 defining

 foo(x,y) = ...
 foo(x) = foo(x, false)

 So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) 
 specialized for an Array{Int} in the second argument.  The existence of a 
 version of foo specialized for a boolean y is irrelevant.



Re: [julia-users] Re: Does union() imply worse performance?

2015-05-30 Thread Tobias Knopp
Yes, that's true, but that's the future and not currently available in stable Julia.

Am Samstag, 30. Mai 2015 12:39:10 UTC+2 schrieb Jiahao Chen:

 For this use case of optionally present data, Nullable would seem 
 appropriate (although this is 0.4-only).


 http://julia.readthedocs.org/en/latest/manual/types/#nullable-types-representing-missing-values

 Thanks,

 Jiahao Chen
 Research Scientist
 MIT CSAIL

 On Sat, May 30, 2015 at 2:08 PM, Tobias Knopp tobias...@googlemail.com 
 wrote:

 There is one exception though, which is keyword arguments


 Am Samstag, 30. Mai 2015 03:49:45 UTC+2 schrieb Steven G. Johnson:

 *No!*  This is one of the most common misconceptions about Julia 
 programming.

 The type declarations in function arguments have *no impact* on 
 performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at 
 all* in the function argument, and it *still* won't matter for 
 performance.

 The argument types are just a filter for when the function is applicable.

 The first time a function is called, a specialized version is compiled 
 for the types of the arguments that you pass it.  Subsequently, when you 
 call it with arguments of the same type, the specialized version is called.

 Note also that a default argument foo(x, y=false) is exactly equivalent 
 to defining

 foo(x,y) = ...
 foo(x) = foo(x, false)

 So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) 
 specialized for an Array{Int} in the second argument.  The existence of a 
 version of foo specialized for a boolean y is irrelevant.




Re: [julia-users] Re: Does union() imply worse performance?

2015-05-30 Thread Tim Holy
https://github.com/JuliaLang/Compat.jl#new-types

--Tim

On Saturday, May 30, 2015 04:50:07 AM Tobias Knopp wrote:
 Yes, that's true, but that's the future and not currently available in stable Julia.
 
 Am Samstag, 30. Mai 2015 12:39:10 UTC+2 schrieb Jiahao Chen:
  For this use case of optionally present data, Nullable would seem
  appropriate (although this is 0.4-only).
  
  
  http://julia.readthedocs.org/en/latest/manual/types/#nullable-types-representing-missing-values
  
  Thanks,
  
  Jiahao Chen
  Research Scientist
  MIT CSAIL
  
  On Sat, May 30, 2015 at 2:08 PM, Tobias Knopp tobias...@googlemail.com
  wrote:
  There is one exception though, which is keyword arguments
  
  Am Samstag, 30. Mai 2015 03:49:45 UTC+2 schrieb Steven G. Johnson:
  *No!*  This is one of the most common misconceptions about Julia
  programming.
  
  The type declarations in function arguments have *no impact* on
  performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at
  all* in the function argument, and it *still* won't matter for
  performance.
  
  The argument types are just a filter for when the function is
  applicable.
  
  The first time a function is called, a specialized version is compiled
  for the types of the arguments that you pass it.  Subsequently, when you
  call it with arguments of the same type, the specialized version is
  called.
  
  Note also that a default argument foo(x, y=false) is exactly equivalent
  to defining
  
  foo(x,y) = ...
  foo(x) = foo(x, false)
  
  So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y)
  specialized for an Array{Int} in the second argument.  The existence of
  a
  version of foo specialized for a boolean y is irrelevant.



[julia-users] Re: Does union() imply worse performance?

2015-05-30 Thread David Gold
@Steven,

Would you help me to understand the difference between this case here and 
the case of DataArray{T}s -- which, by my understanding, are basically 
AbstractArray{Union{T, NaN}, 1}'s? My first thought was that taking a 
Union{Bool, AbstractArray{Float64, 2}} argument would potentially interfere 
with the compiler's ability to perform type inference, similar to how 
looping through a DataArray can experience a cost from the compiler having 
to deal with possible NaNs. 

But what you're saying is that this does not apply here, since presumably 
the argument, whether it is a Bool or an AbstractArray, would be 
type-stable throughout the function's operations -- unlike the values 
contained in a DataArray. Would it be fair to say that dealing with Union{} 
types tends to be dangerous to performance mostly when they are looped over 
in some sort of container, since in that case it's not a matter of simply 
dispatching a specially compiled method on one of the union's member types or the 
other?
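
Roughly, the contrast I have in mind (made-up code):

# (a) Union only in the method signature: each call still gets a specialized
# method, so a given compiled version only ever sees a Bool or only ever
# sees a matrix.
path_cost(edge_dists::Union{Bool, AbstractArray{Float64,2}}) =
    isa(edge_dists, Bool) ? 0.0 : sum(edge_dists)

# (b) Union (or Any) as the element type of a container: every element access
# may need a run-time type check inside the loop, which is where the
# DataArray-style cost shows up.
vals = Any[1.0, false, 2.0, 3.0]
function total(xs)
    s = 0.0
    for x in xs
        if isa(x, Float64)
            s += x
        end
    end
    return s
end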

On Friday, May 29, 2015 at 9:49:45 PM UTC-4, Steven G. Johnson wrote:

 *No!*  This is one of the most common misconceptions about Julia 
 programming.

 The type declarations in function arguments have *no impact* on 
 performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at 
 all* in the function argument, and it *still* won't matter for 
 performance.

 The argument types are just a filter for when the function is applicable.

 The first time a function is called, a specialized version is compiled for 
 the types of the arguments that you pass it.  Subsequently, when you call 
 it with arguments of the same type, the specialized version is called.

 Note also that a default argument foo(x, y=false) is exactly equivalent to 
 defining

 foo(x,y) = ...
 foo(x) = foo(x, false)

 So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) 
 specialized for an Array{Int} in the second argument.  The existence of a 
 version of foo specialized for a boolean y is irrelevant.



Re: [julia-users] Re: Does union() imply worse performance?

2015-05-30 Thread Jiahao Chen
For this use case of optionally present data, Nullable would seem
appropriate (although this is 0.4-only).

http://julia.readthedocs.org/en/latest/manual/types/#nullable-types-representing-missing-values
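
For example (sketching with made-up names, in 0.4 syntax):

edge_dists = Nullable{Matrix{Float64}}()   # no distance matrix supplied
# edge_dists = Nullable(rand(5, 5))        # distance matrix supplied
if isnull(edge_dists)
    # fall back to treating every edge as distance 1
else
    dists = get(edge_dists)                # a plain Matrix{Float64}
end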

Thanks,

Jiahao Chen
Research Scientist
MIT CSAIL

On Sat, May 30, 2015 at 2:08 PM, Tobias Knopp tobias.kn...@googlemail.com
wrote:

 There is one exception though, which is keyword arguments


 Am Samstag, 30. Mai 2015 03:49:45 UTC+2 schrieb Steven G. Johnson:

 *No!*  This is one of the most common misconceptions about Julia
 programming.

 The type declarations in function arguments have *no impact* on
 performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at
 all* in the function argument, and it *still* won't matter for
 performance.

 The argument types are just a filter for when the function is applicable.

 The first time a function is called, a specialized version is compiled
 for the types of the arguments that you pass it.  Subsequently, when you
 call it with arguments of the same type, the specialized version is called.

 Note also that a default argument foo(x, y=false) is exactly equivalent
 to defining

 foo(x,y) = ...
 foo(x) = foo(x, false)

 So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y)
 specialized for an Array{Int} in the second argument.  The existence of a
 version of foo specialized for a boolean y is irrelevant.




[julia-users] Re: Does union() imply worse performance?

2015-05-30 Thread John Myles White
David,

To clarify your understanding of what's wrong with DataArrays, check out 
the DataArray code for something like 
getindex(): 
https://github.com/JuliaStats/DataArrays.jl/blob/master/src/indexing.jl#L109

I don't have a full understanding of Julia's type inference system, but 
here's my best attempt to explain my current understanding of the system 
and how it affects Seth's original example.

Consider two simple functions, f and g, and their application inside a 
larger function, gf():

# Given pre-existing definitions such that:
#
# f(input::R) = output::S
# g(input::S) = output::T
#
# What can we infer about the following larger function?
function gf(x::Any)
    return g(f(x))
end

The important questions to ask are about what we can infer at 
method-compile-time for gf(). Specifically, ask:

(1) Can we determine the type S given the type R, which is currently bound 
to the type of the specific value of x on which we called gf()? (Note that 
it was the act of calling gf(x) on a specific value that triggered the 
entire method-compilation process.)

(2) Can we determine that the type S is a specific concrete type? 
Concreteness matters, because we're going to have to think about how the 
output of f() affects the input of g(). In particular, we need to know 
whether we need to perform run-time dispatch inside of gf() or whether all 
dispatch inside of gf() can be determined statically given the type of 
gf()'s argument x.

(3) Assuming that we successfully determined a concrete type S given R, can 
we repeat the process for g() to yield a concrete type for T? If so, then 
we'll be able to infer, at least for one specific type R, the concrete 
output type of gf(x). If not, we'll have to give looser bounds on the 
concrete types that come out of gf() given an input of a specific value 
like our current x. That would be important if we were going to call gf() 
inside another function.
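
In practice you can ask the compiler how far it got with code_warntype. A toy instance of the R -> S -> T chain above (made-up functions):

f(x::Int) = x + 1       # R = Int -> S = Int      (concrete, inferable)
g(x::Int) = sqrt(x)     # S = Int -> T = Float64  (concrete, inferable)
gf(x) = g(f(x))

# @code_warntype gf(1) should then report a concrete Float64 return type;
# if f or g were type-unstable, you'd see Any (or a Union) propagate instead.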

Hope that helps.

 -- John

On Saturday, May 30, 2015 at 4:51:09 AM UTC-7, David Gold wrote:

 @Steven,

 Would you help me to understand the difference between this case here and 
 the case of DataArray{T}s -- which, by my understanding, are basically 
 AbstractArray{Union{T, NaN}, 1}'s? My first thought was that taking a 
 Union{Bool, AbstractArray{Float64, 2}} argument would potentially interfere 
 with the compiler's ability to perform type inference, similar to how 
 looping through a DataArray can experience a cost from the compiler having 
 to deal with possible NaNs. 

 But what you're saying is that this does not apply here, since presumably 
 the argument, whether it is a Bool or an AbstractArray, would be 
 type-stable throughout the function's operations -- unlike the values 
 contained in a DataArray. Would it be fair to say that dealing with Union{} 
 types tends to be dangerous to performance mostly when they are looped over 
 in some sort of container, since in that case it's not a matter of simply 
 dispatching a specially compiled method on one of the union's member types or the 
 other?

 On Friday, May 29, 2015 at 9:49:45 PM UTC-4, Steven G. Johnson wrote:

 *No!*  This is one of the most common misconceptions about Julia 
 programming.

 The type declarations in function arguments have *no impact* on 
 performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at 
 all* in the function argument, and it *still* won't matter for 
 performance.

 The argument types are just a filter for when the function is applicable.

 The first time a function is called, a specialized version is compiled 
 for the types of the arguments that you pass it.  Subsequently, when you 
 call it with arguments of the same type, the specialized version is called.

 Note also that a default argument foo(x, y=false) is exactly equivalent 
 to defining

 foo(x,y) = ...
 foo(x) = foo(x, false)

 So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) 
 specialized for an Array{Int} in the second argument.  The existence of a 
 version of foo specialized for a boolean y is irrelevant.



[julia-users] Re: Does union() imply worse performance?

2015-05-29 Thread Steven G. Johnson
*No!*  This is one of the most common misconceptions about Julia 
programming.

The type declarations in function arguments have *no impact* on 
performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at all* 
in the function argument, and it *still* won't matter for performance.

The argument types are just a filter for when the function is applicable.

The first time a function is called, a specialized version is compiled for 
the types of the arguments that you pass it.  Subsequently, when you call 
it with arguments of the same type, the specialized version is called.

Note also that a default argument foo(x, y=false) is exactly equivalent to 
defining

foo(x,y) = ...
foo(x) = foo(x, false)

So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) specialized 
for an Array{Int} in the second argument.  The existence of a version of 
foo specialized for a boolean y is irrelevant.
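
A rough sketch of that expansion, with a made-up body:

foo(x, y=false) = y === false ? x : x .+ y
# ...is equivalent to the pair of methods:
#   foo(x, y) = y === false ? x : x .+ y
#   foo(x)    = foo(x, false)

foo(2.0)            # uses a version of foo specialized for (Float64, Bool)
foo(2.0, [1, 2, 3]) # uses a separate version specialized for (Float64, Vector{Int})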


[julia-users] Re: Does union() imply worse performance?

2015-05-29 Thread Páll Haraldsson
I'm pretty sure it does. And it seems like a slightly strange thing to do.

You can use the code_native function to see the generated code. I did something 
wrong, so I'm not sure I confirmed it, but you could try it with and without 
the Bool (and the Union). If you get a lot of code, then the length (the longer 
version) should be a hint.

Another thing I recall from the (performance) docs: you may want to check the 
Bool and do one thing, and then call another function that does your array work 
and doesn't take a Union. I think that applies here.
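
Something like this (names made up):

# Branch on the Bool once, then hand the matrix to a helper that only ever
# sees a concrete type (a "kernel function"), so the hot loop has no Union.
weighted_work(dists::AbstractArray{Float64,2}) = sum(dists)  # stand-in for the real array work
unweighted_work() = 0.0                                      # stand-in for the no-distances path

function shortest(graph, edge_dists)
    if isa(edge_dists, Bool)
        return unweighted_work()
    else
        return weighted_work(edge_dists)
    end
end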

I've read a lot on Julia, but I'm kind of a newbie, so don't just trust me :) 
Someone might also confirm.

When I first read this, I thought the Bool would be for each individual value 
of the array. That would also be slow. Often a NaN or some other sentinel value 
is used in that case; reserving a value other than NaN would be slower. Could 
you do the same thing you are doing and have an array of only NaN be your 
signal? That could be just a 1x1 array, but be sure to test whether there are 
exceptions. Anyway, an all-NaN array is probably never useful for you?

-- 
Palli.

On Friday, May 29, 2015 at 7:50:26 PM UTC, Seth wrote:

 I have a function that takes an optional (potentially very large) matrix 
 (AbstractArray{Float64,2}). I'd like to create a parameter to the function 
 that's EITHER this array, or a boolean (false). If I define

 edge_dists::Union(Bool, AbstractArray{Float64,2}) = false



 in my function declaration, would that cause significant performance 
 penalties when accessing the parameter?