If A is not a global variable (i.e., it is defined within a function), @devec would be much
faster (comparable to sumabs).
Dahua
On Monday, August 25, 2014 4:26:22 AM UTC+8, Adam Smith wrote:
I've run into this a few times (and a few hundred times in python), so I
made an @iterize macro. Not sure how
There's a complicated limit to when you want to fuse loops – at some point
multiple iterations become better than fused loops and it all depends on
how much and what kind of work you're doing. In general doing things lazily
does not cut down on allocation since you have to allocate the
(I was also thinking about element-wise operations)
There is a sumabs function in Base for this reason. We'd like to eventually
be able to do stream fusion to make the vectorized version as efficient as
the manually fused version, but for now there's a performance gap. Note
that the vectorized version is the same speed you would get in other
Yes, that works nicely. Obviously it would be even nicer not to have to do
that :-)
On Fri, Aug 22, 2014 at 10:53 AM, Rafael Fourquet fourquet.raf...@gmail.com
wrote:
We'd like to eventually be able to do stream fusion to make the vectorized
version as efficient as the manually fused
Could you please explain why the iterator version is so much faster? Is it
simply from avoiding temporary array allocation?
Thanks,
--Peter
On Friday, August 22, 2014 7:53:59 AM UTC-7, Rafael Fourquet wrote:
We'd like to eventually be able to do stream fusion to make the vectorized
version
Obviously it would be even nicer not to have to do that :-)
My naive answer is then why not make vectorized functions lazy (like iabs
above, plus dimensions information) by default? Do you have links to
relevant discussions?
On Fri, Aug 22, 2014 at 11:32 AM, Rafael Fourquet fourquet.raf...@gmail.com
wrote:
My naive answer is then why not make vectorized functions lazy (like iabs
above, plus dimensions information) by default? Do you have links to
relevant discussions?
If that was the way things worked, would
I'm not familiar with lazy evaluation (I've not used any language implementing
it). But I was wondering...
Why not have a 'calculate_now' function to let the programmer choose when a
result is guaranteed to be calculated? Otherwise, resort to lazy
representations.
There could be some
If that was the way things worked, would sum(abs(A)) do the computation
right away or just wait until you ask for the result? In other words,
should sum also be lazy if we're doing all vectorized computations that
way?
sum(abs(A)) returns a scalar, so lazy would buy nothing here (in most
Could you please explain why the iterator version is so much faster? Is
it simply from avoiding temporary array allocation?
That's what I understand, and maybe marginally because there is only one
pass over the data.
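
The allocation point can be checked directly. What follows is a rough sketch, not code from the thread: it assumes a current Julia 1.x release, where the vectorized call is written `abs.(A)` rather than `abs(A)`, and it stands in a plain generator expression for the iterator version discussed here. The helper names `sum_vectorized` and `sum_lazy` are made up for illustration.

```julia
# Hypothetical helpers, named here only for illustration.
sum_vectorized(A) = sum(abs.(A))       # materializes abs.(A) as a temporary array, then sums it
sum_lazy(A) = sum(abs(x) for x in A)   # generator: values are produced one at a time

A = randn(10^5)

# Call each function once so that compilation is not counted below.
sum_vectorized(A)
sum_lazy(A)

# @allocated reports the number of bytes allocated while evaluating the call.
bytes_vectorized = @allocated sum_vectorized(A)
bytes_lazy = @allocated sum_lazy(A)

println("vectorized: ", bytes_vectorized, " bytes, lazy: ", bytes_lazy, " bytes")
```

The vectorized version should report at least `sizeof(A)` bytes, which is the temporary array, while the generator version should report far less. Exact numbers depend on the Julia version.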
On Aug 22, 2014, at 1:45 PM, Rafael Fourquet fourquet.raf...@gmail.com wrote:
In short, be lazy when it gives opportunity for loop fusion, and saves
allocations.
There's a complicated limit to when you want to fuse loops – at some point
multiple iterations become better than fused loops
A is a 1-dimensional array. I used to compute sum(abs(A)). But when I
changed to the following, the speed increased nearly tenfold. Why is that?
sumA = 0
for i = 1:length(A)
    sumA = sumA + abs(A[i])
end
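
For readers on a current Julia release, the comparison in this question might be sketched as below. This is an illustration under stated assumptions, not the original poster's code: it assumes Julia 1.x, where the vectorized call is written `abs.(A)` and `sum(abs, A)` takes the place of the `sumabs` function mentioned in the thread. The function name `sumabs_loop` is hypothetical.

```julia
# Hypothetical helper: the loop from the question, wrapped in a function
# so that the compiler can specialize on the element type of A.
function sumabs_loop(A::AbstractVector)
    s = zero(eltype(A))    # accumulator starts as a zero of the element type
    for i in eachindex(A)
        s += abs(A[i])
    end
    return s
end

A = randn(10^6)

s_vectorized = sum(abs.(A))   # allocates a temporary array, then sums it
s_loop = sumabs_loop(A)       # single pass over A, no temporary array
s_mapped = sum(abs, A)        # single pass over A, no temporary array
```

All three give the same result up to floating-point rounding; they differ in how much memory they allocate and how many passes they make over the data.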