[julia-users] Re: efficiency of sparse array creation

Dominique Orban Wed, 30 Apr 2014 00:42:13 -0700

Reducing the 700,000 to 70,000 for the sake of not waiting all night, 
the original implementation takes about 4.3 seconds on my laptop. 
Preallocating the arrays and using @inbounds brings it down to about 0.6 
seconds. @simd doesn't seem to provide any further speedup. Building the 
sparse matrix takes about 3.8 seconds. Could this be due to the conversion 
from triplet to CSC format?
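
For reference, the two ideas above (preallocating the triplet arrays, and 
avoiding the triplet-to-CSC conversion) can be sketched as follows. This is 
only a sketch under assumptions: it targets a current Julia 1.x with the 
SparseArrays standard library rather than the release used in this thread, 
the helper names build_triplets and build_csc are hypothetical, indices are 
plain Int rather than Uint64, and the sizes are shrunk so it runs instantly. 
Building the CSC arrays directly is not something timed above; it only works 
here because the entries are generated column by column with increasing row 
indices.

```julia
using SparseArrays  # assumption: Julia 1.x, where sparse matrices live in this stdlib

# Hypothetical helper: fill preallocated triplet arrays instead of calling push!.
function build_triplets(ncols::Int, nnz_per_col::Int, stride::Int)
    n = ncols * nnz_per_col
    row_ind = Vector{Int}(undef, n)
    col_ind = Vector{Int}(undef, n)
    value   = Vector{Float64}(undef, n)
    idx = 0
    @inbounds for j in 1:ncols, k in 1:nnz_per_col
        idx += 1
        row_ind[idx] = k * stride
        col_ind[idx] = j
        value[idx]   = 5.0
    end
    return row_ind, col_ind, value
end

# Hypothetical helper: write the CSC arrays directly, with no triplet step.
# Valid only because each column's row indices are produced in increasing order.
function build_csc(nrows::Int, ncols::Int, nnz_per_col::Int, stride::Int)
    n = ncols * nnz_per_col
    colptr = Vector{Int}(undef, ncols + 1)
    rowval = Vector{Int}(undef, n)
    nzval  = fill(5.0, n)
    @inbounds for j in 1:ncols
        offset = (j - 1) * nnz_per_col
        colptr[j] = offset + 1          # first stored entry of column j
        for k in 1:nnz_per_col
            rowval[offset + k] = k * stride
        end
    end
    colptr[ncols + 1] = n + 1           # one past the last stored entry
    return SparseMatrixCSC(nrows, ncols, colptr, rowval, nzval)
end

# Small sizes for illustration; the thread uses 70,000 columns with 700 entries each.
rows, cols, vals = build_triplets(1_000, 7, 50)
a = sparse(rows, cols, vals, 1_000, 1_000)   # triplet -> CSC conversion
b = build_csc(1_000, 1_000, 7, 50)           # direct CSC construction
```

Both a and b should hold the same matrix, so timing the two constructions 
separately (for example with @time) is one way to see how much of the cost 
comes from the conversion.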


PS: With the original size of 700,000, Julia reports a memory usage of 
11.8 GB.
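
That figure is close to what three 8-byte arrays of 700000*700 entries 
would occupy. A rough check of this and of the CSC estimates quoted below, 
assuming a 64-bit Julia, decimal gigabytes, and no allocator overhead (the 
name csc_bytes is hypothetical):

```julia
nnz_total = 700_000 * 700   # number of stored entries
ncols     = 700_000

# Triplet storage: one row index, one column index and one value per entry,
# 8 bytes each.
triplet_bytes = nnz_total * (8 + 8 + 8)

# CSC storage: one row index and one value per entry, plus ncols + 1 column
# pointers stored with the same width as the row indices.
csc_bytes(index_bytes, value_bytes) =
    nnz_total * (index_bytes + value_bytes) + (ncols + 1) * index_bytes

println("triplets, 64-bit indices and values: ", triplet_bytes / 1e9, " GB")
println("CSC, 64-bit indices, 64-bit floats:  ", csc_bytes(8, 8) / 1e9, " GB")
println("CSC, 32-bit indices, 64-bit floats:  ", csc_bytes(4, 8) / 1e9, " GB")
println("CSC, 32-bit indices, 32-bit floats:  ", csc_bytes(4, 4) / 1e9, " GB")
```

This gives roughly 11.76 GB for the triplets and about 7.85, 5.88 and 
3.92 GB for the three CSC variants.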


On Wednesday, April 30, 2014 12:26:02 AM UTC-7, Viral Shah wrote:
>
> I believe the memory requirement should be 700000*700*16 (64-bit nonzeros 
> and row indices) + 700001*8 (64-bit column pointers) = 7.8 GB.
>
> This can be brought down a bit by using 32-bit index values and 64-bit 
> floats, but you still need 5.8 GB. Finally, if you use 32-bit index values 
> with 32-bit floats, you can come down to about 4 GB. The Julia sparse matrix 
> implementation is quite flexible and allows you to easily do such things.
>
>
> julia> s = sparse(int32(1:10), int32(1:10), 1.0);
>
>
> julia> typeof(s)
>
> SparseMatrixCSC{Float64,Int32} (constructor with 1 method)
>
>
> julia> s = sparse(int32(1:10), int32(1:10), float32(1.0));
>
>
> julia> typeof(s)
>
> SparseMatrixCSC{Float32,Int32} (constructor with 1 method)
>
>
> -viral
>
> On Wednesday, April 30, 2014 12:36:17 PM UTC+5:30, Ivar Nesje wrote:
>>
>> Sorry for pointing out a probably obvious problem, but as there are 
>> others who might try to debug this issue on their laptops, I'll ask: how 
>> much memory do you have? 700000*700 floats plus indices will take a 
>> minimum of 11 GB (if my math is correct), and possibly more if the 
>> asymptotic storage requirement is more than 2 Int64 + 1 Float64 per 
>> stored value.
>>
>> Ivar
>>
>> On Wednesday, April 30, 2014 01:46:22 AM UTC+2, Ryan Gardner wrote:
>>>
>>> Creating sparse arrays seems exceptionally slow.
>>>
>>> I can set up the non-zero data of the array relatively quickly.  For 
>>> example, the following code takes about 80 seconds on one machine.
>>>
>>>
>>> vec_len = 700000
>>>
>>>
>>> row_ind = Uint64[]
>>> col_ind = Uint64[]
>>> value = Float64[]
>>>
>>>
>>> for j = 1:700000
>>>    for k = 1:700
>>>       ind = k*50
>>>       push!(row_ind, ind)
>>>       push!(col_ind, j)
>>>       push!(value, 5.0)
>>>    end
>>> end
>>>
>>>
>>> but then
>>>
>>> a = sparse(row_ind, col_ind, value, 700000, 700000)
>>>
>>>
>>> takes at least 30 minutes.  (I never let it finish.)
>>>
>>> It doesn't seem like the numbers I'm using should be that far off the 
>>> scale.  Is there a more efficient way I should be doing what I'm doing?  Am 
>>> I missing something and asking for something that really is impractical?
>>>
>>> If not, I may be able to look into the sparse matrix code a little this 
>>> weekend.
>>>
>>>
>>> The never-finishing result is the same if I try
>>>
>>> sprand(700000, 700000, .001)
>>>
>>> or if I try to set 700000*700 values in a sparse matrix of zeros 
>>> directly.  Thanks.
>>>
>>>
>>>
