On Sun, May 17, 2015 at 11:28 AM, Mohammed El-Beltagy
<[email protected]> wrote:
> Today, while trying to optimize a piece of code, I came across a rather curious
> memory allocation behavior when accessing a DataArray.
>
> x=rand(1:10,1000000);
> function countGT(x::Array{Int,1})

Since the algorithm is the same for both types, I don't think you need
the type annotation here. Julia will automatically specialize on the
type you pass in.
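
As a minimal sketch of what I mean (the name `countgt` and the
`threshold` argument are my own, not from your code), a single untyped
method can serve both element types. Note that I keep the counter an
`Int` in both cases, which differs slightly from your `Float64` version:

```julia
# One method without type annotations; Julia compiles a separate
# specialized version for each concrete argument type it is called with.
function countgt(x, threshold)
    n = 0                                  # integer counter for any element type
    for i in 1:length(x)
        n += (x[i] > threshold) ? 1 : 0
    end
    return n
end

ints = [1, 7, 3, 9, 6]
floats = [0.5, 5.5, 7.25]

n_ints = countgt(ints, 5)        # compiled for Vector{Int}
n_floats = countgt(floats, 5.0)  # compiled for Vector{Float64}
```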

>     count=0
>     for i=1:length(x)
>       count += (x[i] > 5) ? 1 : 0

Adding `@inbounds` here will improve performance for `Array`. I'm not
sure whether it helps with `DataArray`, though.
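
Roughly like this, as a sketch only (`countgt_inbounds` and `threshold`
are hypothetical names). `@inbounds` skips bounds checks, so it is only
safe when every index is known to be valid, which is why I iterate over
`eachindex(x)` here:

```julia
function countgt_inbounds(x::AbstractVector, threshold)
    n = 0
    # eachindex(x) yields only valid indices, so dropping the
    # bounds check on x[i] is safe inside this loop.
    @inbounds for i in eachindex(x)
        n += (x[i] > threshold) ? 1 : 0
    end
    return n
end

n_small = countgt_inbounds([1, 7, 3, 9, 6], 5)
```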

>     end
>     count
> end
>
> Here is what you get after running @time (compilation excluded)
>
> @time countGT(x);
> elapsed time: 0.00847156 seconds (96 bytes allocated)
>
> That is not too bad. @time itself allocates at least 80 bytes, and the extra 16
> bytes are for creating the variable "count". So far so good.
> Now let's see what happens if we do the same with a floating point array.
> x=rand(1000000);
> function countGT(x::Array{Float64,1})
>     count=0.0
>     for i=1:length(x)
>       count += (x[i] > 5.0) ? 1.0 : 0.0
>     end
>     count
> end
>
> countGT(x)
> @time countGT(x)
>
> You get
> elapsed time: 0.00177126 seconds (96 bytes allocated)
> Which is still pretty good. Now, the problem starts to show up when I have a
> DataArray
> x=@data rand(1000000);
> function countGT(x::DataArray{Float64,1})
>     count=0.0
>     for i=1:length(x)
>       count += (x[i] > 5.0) ? 1.0 : 0.0
>     end
>     count
> end

`getindex` on a DataArray does not appear to be type stable: it returns
either `NAType` or the element type. I think this is probably the reason
for the allocation.
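
To illustrate the kind of instability I mean without depending on
DataArrays itself, here is a plain-Julia sketch. `NAMarker`,
`unstable_get` and `stable_get` are hypothetical stand-ins, not the
package's actual types or functions. The point is only that when the
return type depends on a runtime value, inference can no longer
conclude a single concrete type:

```julia
struct NAMarker end   # hypothetical stand-in for an NA sentinel value

# The return type depends on a runtime flag, so the result is
# either a NAMarker or a Float64.
unstable_get(data::Vector{Float64}, isna::Vector{Bool}, i::Int) =
    isna[i] ? NAMarker() : data[i]

# Type-stable counterpart: always returns a Float64.
stable_get(data::Vector{Float64}, i::Int) = data[i]

# Ask the compiler what it infers for each accessor.
inferred_unstable =
    Base.return_types(unstable_get, (Vector{Float64}, Vector{Bool}, Int))[1]
inferred_stable =
    Base.return_types(stable_get, (Vector{Float64}, Int))[1]

println("unstable_get infers: ", inferred_unstable)
println("stable_get infers:   ", inferred_stable)
```

Running `@code_warntype` on a function that calls such an accessor in a
loop is another way to spot the problem.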

>
> countGT(x)
> @time countGT(x)
>
> You get
> elapsed time: 0.23610454 seconds (16000096 bytes allocated)
>
> The bytes allocated seem to scale with the size of the DataArray. So it
> seems that the mere act of accessing an element in a DataArray allocates memory.
>
> I am wondering whether there could be a better way.
>
>

I'm not familiar with DataArrays and its API, but I would guess it could
use Nullable or something similar.
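
By "something similar" I mean roughly the following, as a sketch under
my own assumptions rather than a description of how DataArrays works.
`MaskedVector` and `countgt_masked` are hypothetical names. The values
and the missingness flags live in two concretely typed arrays, and the
loop checks the flag before touching the value, so every value read is
a plain `T`:

```julia
# Hypothetical container: raw values plus a bit mask marking missing entries.
struct MaskedVector{T}
    values::Vector{T}
    isna::BitVector
end

function countgt_masked(x::MaskedVector{T}, threshold::T) where {T}
    n = 0
    for i in eachindex(x.values)
        # Consult the mask first; x.values[i] always has type T,
        # so the comparison below is type stable.
        if !x.isna[i] && x.values[i] > threshold
            n += 1
        end
    end
    return n
end

mask = falses(4)
mask[3] = true                     # mark the third entry as missing
mv = MaskedVector([0.5, 6.0, 7.5, 9.0], mask)
n_masked = countgt_masked(mv, 5.0) # counts 6.0 and 9.0, skips the masked 7.5
```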

>
>
>
>
>
