Re: [julia-users] why sum(abs(A)) is very slow
If A is not a global variable (i.e., it lives within a function), @devec would be much faster (comparable to sumabs).

Dahua

On Monday, August 25, 2014 4:26:22 AM UTC+8, Adam Smith wrote:

I've run into this a few times (and a few hundred times in Python), so I made an @iterize macro. Not sure how useful it is, but you can put it in front of a bunch of chained function calls and it will make iterators automatically to avoid creating any temp arrays:

    A = randn(1<<26)
    @time sum(abs(A))
    @time @iterize sum(abs(A))
    @time sumabs(A)
    println(sum(abs(A)))
    println(@iterize sum(abs(A)))
    println(sumabs(A))
    println(sum(A))
    println(@iterize sum(A))
    println(sum(ceil(floor(abs(A)))))
    println(@iterize sum(ceil(floor(abs(A)))))

Output:

    elapsed time: 0.367873796 seconds (537878296 bytes allocated, 2.48% gc time)
    elapsed time: 0.107278414 seconds (577616 bytes allocated)
    elapsed time: 0.045590637 seconds (639580 bytes allocated)
    5.3551932868680775e7
    5.3551932868672036e7
    5.3551932868678436e7
    658.6904827808266
    658.6904827808266
    2.4537098e7
    2.4537098e7

The macro is in a gist: Iterize.jl https://gist.github.com/sunetos/f311d5408854e65d7ff9

I had tried using @devec, but that actually made it about 100x slower.

On Saturday, August 23, 2014 8:15:44 AM UTC-4, Stefan Karpinski wrote:

On Sat, Aug 23, 2014 at 7:23 AM, gael@gmail.com wrote:

To do any of that justice, you end up with a language that looks basically like Haskell. So why not just use Haskell?

Because I don't know anything about it (yet), except the name and the fact that you often associated it with lazy evaluation. Because (#2), this could be a way to make sumabs and the like obsolete in *Julia*. :)

We do really want to get rid of things like sumabs, so it's certainly worth considering. I know I've thought about it many times, but I don't think it's the right answer – you really want to preserve eager evaluation semantics, even if you end up moving around the actual evaluation of things.
Re: [julia-users] why sum(abs(A)) is very slow
There's a complicated limit to when you want to fuse loops – at some point multiple iterations become better than fused loops, and it all depends on how much and what kind of work you're doing. In general, doing things lazily does not cut down on allocation, since you have to allocate the representation of the operations that you're deferring and close over any values that they depend on. This particular example only works out so well because the iterable is so simple that the compiler can eliminate the laziness and do the eager, loop-fused version for you. This will not generally be the case.

Thank you for taking so much time to explain and for your patience!

You're welcome to experiment (and Julia's type system makes it pretty easy to do so), but I think that you'll quickly find that more laziness is not a panacea for performance problems.

My question came partly from Python 3 having lazy map and reduce. But having the choice is good, and in Julia all laziness can be provided now by imap etc. If someone has a (self-contained) example where lazy element-wise computations are worse than eager ones, please post! (I'm interested in better understanding the above-mentioned limit.)
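One small case where laziness costs more than it saves is when the deferred result is traversed more than once. The sketch below is only an illustration under assumptions: it uses Julia 1.x syntax (which postdates this thread), and `expensive` and `CALLS` are hypothetical names invented to make the amount of work visible. The lazy generator is expected to re-run the function on every traversal, while the eager array pays for one temporary and is then just read.

```julia
# Sketch in Julia 1.x syntax (an assumption; the thread predates it).
# `CALLS` and `expensive` are hypothetical, used only to count work.
CALLS = Ref(0)
expensive(x) = (CALLS[] += 1; abs(x))

A = randn(1_000)

# Lazy: the generator stores only the function and A,
# so every traversal calls `expensive` again.
CALLS[] = 0
lazy = (expensive(x) for x in A)
lazy_result = sum(lazy) + maximum(lazy)
lazy_calls = CALLS[]

# Eager: one pass fills a temporary array; later traversals only read it.
CALLS[] = 0
eager = map(expensive, A)
eager_result = sum(eager) + maximum(eager)
eager_calls = CALLS[]

println("lazy calls:  ", lazy_calls)
println("eager calls: ", eager_calls)
```

Whether the extra calls or the extra allocation dominates depends on how costly the element-wise function is and how often the result is reused, which is the trade-off described above.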
Re: [julia-users] why sum(abs(A)) is very slow
(I was also thinking about element-wise operations)
Re: [julia-users] why sum(abs(A)) is very slow
There is a sumabs function in Base for this reason. We'd like to eventually be able to do stream fusion to make the vectorized version as efficient as the manually fused version, but for now there's a performance gap. Note that the vectorized version is the same speed you would get in other languages where you express this in vectorized form – it's just that you can get much faster with manual loop fusion.

On Thu, Aug 21, 2014 at 11:03 PM, John Myles White johnmyleswh...@gmail.com wrote:

Please read http://julialang.org/blog/2013/09/fast-numeric/

-- John

On Aug 21, 2014, at 8:02 PM, K Leo cnbiz...@gmail.com wrote:

A is a 1-dimensional array. I used to compute sum(abs(A)). But when I changed to the following, the speed increased nearly 10 fold. Why is that?

    sumA=0
    for i=1:length(A)
        sumA = sumA + abs(A[i])
    end
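For readers on a later Julia release, the three forms discussed here can be put side by side. This is a sketch under assumptions rather than the code anyone in the thread ran: it uses Julia 1.x syntax, where `abs(A)` on an array is written `abs.(A)` and `sumabs(A)` is written `sum(abs, A)`, and `sumabs_loop` is a made-up name for the hand-written loop.

```julia
# Sketch in Julia 1.x syntax (an assumption; the thread predates it).
# `sumabs_loop` is a hypothetical name used only for this example.

# Manually fused loop, wrapped in a function so the accumulator is a
# type-stable local rather than an untyped global that starts as Int 0.
function sumabs_loop(A::AbstractVector{<:Real})
    s = zero(eltype(A))
    for x in A
        s += abs(x)
    end
    return s
end

A = randn(1_000)

vectorized = sum(abs.(A))    # allocates a temporary array for abs.(A)
fused      = sum(abs, A)     # applies abs inside the reduction, no temporary
manual     = sumabs_loop(A)  # explicit loop, no temporary

println(vectorized ≈ fused && fused ≈ manual)
```

The results are compared with `≈` because the built-in reduction and the plain loop may add the elements in a different order.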
Re: [julia-users] why sum(abs(A)) is very slow
Yes, that works nicely. Obviously it would be even nicer not to have to do that :-)

On Fri, Aug 22, 2014 at 10:53 AM, Rafael Fourquet fourquet.raf...@gmail.com wrote:

We'd like to eventually be able to do stream fusion to make the vectorized version as efficient as the manually fused version, but for now there's a performance gap.

It is also not too difficult to implement a fused version via iterators, e.g.:

    immutable iabs{X}
        x::X
    end
    Base.start(i::iabs) = start(i.x)
    Base.next(i::iabs, s) = ((v, s) = next(i.x, s); (abs(v), s))
    Base.done(i::iabs, s) = done(i.x, s)

Then sum(iabs(A)) is way faster than sum(abs(A)) (but still slightly slower than sumabs(A)).
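The same lazy wrapper can be sketched for later Julia releases, where iteration goes through a single `Base.iterate` method instead of `start`/`next`/`done`. This is an assumption-laden translation, not the code from the thread: `IAbs` is a hypothetical name, and the `eltype` definition is only exact for real element types.

```julia
# Sketch of the lazy abs wrapper in Julia 1.x (an assumption; the thread's
# code uses the older start/next/done protocol). `IAbs` is a hypothetical name.
struct IAbs{X}
    x::X
end

# Forward iteration to the wrapped collection, applying abs to each value.
function Base.iterate(it::IAbs, state...)
    r = iterate(it.x, state...)
    r === nothing && return nothing
    v, s = r
    return abs(v), s
end

Base.length(it::IAbs) = length(it.x)
# Exact for real element types, where abs does not change the type.
Base.eltype(::Type{IAbs{X}}) where {X} = eltype(X)

A = randn(1_000)
println(sum(IAbs(A)) ≈ sum(abs, A))
```

No temporary array is created; `sum` pulls one transformed element at a time from the wrapper.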
Re: [julia-users] why sum(abs(A)) is very slow
Could you please explain why the iterator version is so much faster? Is it simply from avoiding temporary array allocation?

Thanks,
--Peter

On Friday, August 22, 2014 7:53:59 AM UTC-7, Rafael Fourquet wrote:

We'd like to eventually be able to do stream fusion to make the vectorized version as efficient as the manually fused version, but for now there's a performance gap.

It is also not too difficult to implement a fused version via iterators, e.g.:

    immutable iabs{X}
        x::X
    end
    Base.start(i::iabs) = start(i.x)
    Base.next(i::iabs, s) = ((v, s) = next(i.x, s); (abs(v), s))
    Base.done(i::iabs, s) = done(i.x, s)

Then sum(iabs(A)) is way faster than sum(abs(A)) (but still slightly slower than sumabs(A)).
Re: [julia-users] why sum(abs(A)) is very slow
Obviously it would be even nicer not to have to do that :-)

My naive answer is then why not make vectorized functions lazy (like iabs above, plus dimensions information) by default? Do you have links to relevant discussions?
Re: [julia-users] why sum(abs(A)) is very slow
On Fri, Aug 22, 2014 at 11:32 AM, Rafael Fourquet fourquet.raf...@gmail.com wrote:

My naive answer is then why not make vectorized functions lazy (like iabs above, plus dimensions information) by default? Do you have links to relevant discussions?

If that was the way things worked, would sum(abs(A)) do the computation right away or just wait until you ask for the result? In other words, should sum also be lazy if we're doing all vectorized computations that way? What about sum(abs(A),1)? Lazy or eager?

What about A*B when A and B are matrices? Should that be an eager matrix product or just a lazy representation that hangs onto A and B and answers queries about their product on demand? If you're computing trace(A*B) then you can save a huge amount of work that way. But if you need all or most of the values in A*B then computing each one as a vector-vector product on demand is very inefficient.
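The trace(A*B) case can be made concrete. Only the diagonal of the product is needed, and entry (i, i) is the sum over k of A[i,k]*B[k,i], so the trace can be accumulated directly without ever forming A*B. The sketch below assumes Julia 1.x, where the trace function is `tr` from the `LinearAlgebra` standard library; `trace_of_product` is a hypothetical helper name.

```julia
using LinearAlgebra   # for tr, used only in the comparison at the end

# Sketch in Julia 1.x syntax (an assumption; the thread predates it).
# `trace_of_product` is a hypothetical helper name.
function trace_of_product(A::AbstractMatrix, B::AbstractMatrix)
    size(A, 2) == size(B, 1) && size(A, 1) == size(B, 2) ||
        throw(DimensionMismatch("A*B must be square"))
    s = zero(promote_type(eltype(A), eltype(B)))
    # Accumulate only the diagonal entries of A*B; i is the inner loop,
    # so A is read down its columns.
    for k in axes(A, 2), i in axes(A, 1)
        s += A[i, k] * B[k, i]
    end
    return s
end

A = randn(50, 30)
B = randn(30, 50)
println(trace_of_product(A, B) ≈ tr(A * B))
```

For n-by-n inputs this does on the order of n^2 multiplications instead of the n^3 needed for the full product, which is the saving a lazy representation could exploit.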
Re: [julia-users] why sum(abs(A)) is very slow
I'm not familiar with lazy evaluation (I've not used any language implementing it). But I was wondering... Why not have a 'calculate_now' function to let the programmer choose when a result is guaranteed to be calculated? Otherwise, resort to lazy representations. There could also be some heuristic: if at least one of the original objects is freed by the GC, perform all the calculations depending on it. It could also be simpler: defer the actual calculations until the end of the current block.
Re: [julia-users] why sum(abs(A)) is very slow
If that was the way things worked, would sum(abs(A)) do the computation right away or just wait until you ask for the result? In other words, should sum also be lazy if we're doing all vectorized computations that way?

sum(abs(A)) returns a scalar, so lazy would buy nothing here (in most cases at least, let's not be Haskell!)

What about sum(abs(A),1)? Lazy or eager?

If dim A > 1, the result is an array, so lazy. In short, be lazy when it gives an opportunity for loop fusion and saves allocations.

What about A*B when A and B are matrices?

I was thinking more of operations done element-wise (of the form map(f, A1, ...), like abs and +). Optimizing a product A*B is less trivial (C++ expression templates...), so I prefer not to answer!
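For the element-wise cases mentioned here, later Julia releases offer fused forms that avoid the intermediate array. The sketch below assumes Julia 1.x syntax, where the reduction dimension is passed as the `dims` keyword rather than as a second positional argument, and where chained dotted calls are fused into one loop.

```julia
# Sketch in Julia 1.x syntax (an assumption; the thread predates it).
A = randn(4, 3)

# Column sums of |A|: the first form materializes abs.(A),
# the second applies abs inside the reduction.
col_sums_temp  = sum(abs.(A); dims=1)
col_sums_fused = sum(abs, A; dims=1)

# Chained element-wise calls: the dotted form fuses into a single loop but
# still allocates one array for the result; the second form allocates none.
chained_temp  = sum(ceil.(floor.(abs.(A))))
chained_fused = sum(x -> ceil(floor(abs(x))), A)

println(col_sums_temp ≈ col_sums_fused)
println(chained_temp ≈ chained_fused)
```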
Re: [julia-users] why sum(abs(A)) is very slow
Could you please explain why the iterator version is so much faster? Is it simply from avoiding temporary array allocation?

That's what I understand, and maybe marginally because there is only one pass over the data.
Re: [julia-users] why sum(abs(A)) is very slow
On Aug 22, 2014, at 1:45 PM, Rafael Fourquet fourquet.raf...@gmail.com wrote:

In short, be lazy when it gives opportunity for loop fusion, and saves allocations.

There's a complicated limit to when you want to fuse loops – at some point multiple iterations become better than fused loops, and it all depends on how much and what kind of work you're doing. In general, doing things lazily does not cut down on allocation, since you have to allocate the representation of the operations that you're deferring and close over any values that they depend on. This particular example only works out so well because the iterable is so simple that the compiler can eliminate the laziness and do the eager, loop-fused version for you. This will not generally be the case.

You're welcome to experiment (and Julia's type system makes it pretty easy to do so), but I think that you'll quickly find that more laziness is not a panacea for performance problems.
[julia-users] why sum(abs(A)) is very slow
A is a 1-dimensional array. I used to compute sum(abs(A)). But when I changed to the following, the speed increased nearly 10 fold. Why is that?

    sumA=0
    for i=1:length(A)
        sumA = sumA + abs(A[i])
    end