Could you please explain why the iterator version is so much faster?  Is it 
simply from avoiding temporary array allocation?
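
To make the question concrete, here is a rough sketch of what I assume the two variants boil down to. The function names sum_abs_temp and sum_abs_fused are made up for illustration, and I am assuming that abs(A) materializes a full temporary array before sum ever sees it:

```julia
# Hypothetical stand-in for sum(abs(A)): assumes abs(A) first builds a
# temporary array the size of A, which is then read back by the reduction.
function sum_abs_temp(A)
    tmp = similar(A)              # the temporary allocation in question
    for i = 1:length(A)
        tmp[i] = abs(A[i])
    end
    s = zero(eltype(A))
    for i = 1:length(tmp)
        s += tmp[i]
    end
    return s
end

# Hypothetical stand-in for the fused version: abs is applied as each
# element is consumed, so there is a single pass and no temporary array.
function sum_abs_fused(A)
    s = zero(eltype(A))
    for i = 1:length(A)
        s += abs(A[i])
    end
    return s
end
```

If that picture is right, the iterator version would save both the allocation and a second pass over memory, but I may be missing other effects.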

Thanks,
--Peter

On Friday, August 22, 2014 7:53:59 AM UTC-7, Rafael Fourquet wrote:
>
>> We'd like to eventually be able to do stream fusion to make the vectorized
>> version as efficient as the manually fused version, but for now there's a
>> performance gap.
>>
>
> It is also not too difficult to implement a fused version via iterators, 
> e.g.: 
>
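> # Wrapper that lazily applies abs to each element of the iterable x.
> immutable iabs{X}
>     x::X
> end
>
> # Forward the iteration protocol to x, applying abs to each value.
> Base.start(i::iabs) = start(i.x)
> Base.next(i::iabs, s) = ((v, s) = next(i.x, s); (abs(v), s))
> Base.done(i::iabs, s) = done(i.x, s)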
>
> Then sum(iabs(A)) is much faster than sum(abs(A)) (but still slightly 
> slower than sumabs(A)).
>
>
