Re: [julia-users] efficiency of sparse array creation

Ryan Gardner Wed, 30 Apr 2014 12:31:12 -0700

Hmmm.  That is much better than I was getting.  Thanks Viral.

Was it much faster for you to create the column-index, row-index, and value
arrays than to build the matrix from them?  I would still expect the two
steps to be roughly on par in terms of speed.



On Wed, Apr 30, 2014 at 2:36 PM, Viral Shah <[email protected]> wrote:

> I ran the sprand example, and it took 290 seconds on a machine with enough
> RAM. Given that it is creating a matrix with half a billion nonzeros, this
> doesn’t sound too bad.
>
> -viral
>
>
>
> On 30-Apr-2014, at 8:48 pm, Ryan Gardner <[email protected]> wrote:
>
> > I've got 16GB of RAM on this machine.  Largely, my question, with
> admittedly little knowledge of the internal structure of the sparse arrays,
> is why generating the actual SparseMatrixCSC is so much slower than
> generating what is essentially another sparse matrix representation
> consisting of the indices and values.  (I realize that once we start
> swapping, which will happen in my example, things slow down a ton, but even
> the sprand I mention was slow.)  Do you observe the same results?  Is the
> reason for the difference clear to someone else?
> >
> > Thanks for all the comments.  These are helpful.  It had not crossed my
> mind that I could control the data type of the indices.
> >
> > Using the SparseMatrixCSC constructor directly would probably be very
> helpful.  Did you learn about that constructor from looking at the source
> code, or is it documented somewhere else?
> >
> > I'm also curious about where @inbounds was used.
> >
> >
> >
> >
> >
> >
> > On Wed, Apr 30, 2014 at 8:59 AM, Tony Kelman <[email protected]> wrote:
> > If you're assembling the matrix in row-sorted column-major order and
> there's no duplication, then you can also skip the conversion work by using
> the SparseMatrixCSC constructor directly.
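
For anyone following along, here is a minimal sketch of what calling the constructor directly could look like for the access pattern in the original post (entries generated column by column, rows ascending, no duplicates). It is written in current Julia 1.x syntax, which is an assumption on my part since it differs from the 2014 syntax quoted in this thread. The function name `build_csc_directly` and the small sizes are hypothetical, chosen so the sketch runs quickly.

```julia
using SparseArrays  # stdlib module in Julia 1.x (assumption: newer than the Julia used in this thread)

# Hypothetical helper: fill the three CSC arrays in place, column by column.
# Entry k of every column sits at row k * stride with value 5.0, as in the original loop.
function build_csc_directly(n, per_col, stride)
    total = n * per_col
    colptr = Vector{Int32}(undef, n + 1)   # 32-bit indices to save memory
    rowval = Vector{Int32}(undef, total)
    nzval = Vector{Float64}(undef, total)
    pos = 1
    for j in 1:n
        colptr[j] = pos                    # first stored entry of column j
        for k in 1:per_col
            rowval[pos] = k * stride       # rows ascend within the column
            nzval[pos] = 5.0
            pos += 1
        end
    end
    colptr[n + 1] = pos                    # one past the last stored entry
    return SparseMatrixCSC(n, n, colptr, rowval, nzval)
end

A = build_csc_directly(1_000, 7, 50)       # the thread uses 700_000, 700, 50
```

Because the arrays are already in CSC order, no sorting or duplicate handling is needed, which is the conversion work this approach skips.
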
> >
> >
> > On Wednesday, April 30, 2014 1:10:31 AM UTC-7, Viral Shah wrote:
> > Could you post your code? It will save me from writing the same. :-)
> >
> > Was building the vectors taking all the time, or was it building the
> sparse matrix from the triples? Triples to CSC conversion is an expensive
> operation, and we have spent a fair amount of time making it fast. Of
> course, there could be more opportunities to speed up the code.
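
To illustrate why the triples-to-CSC step costs more than filling three vectors, here is a simplified sketch of such a conversion: a counting sort by column followed by a sort of the rows inside each column. This is not Julia's actual implementation; `triples_to_csc` is a hypothetical name, the code assumes there are no duplicate (row, column) pairs, and it uses current Julia 1.x syntax.

```julia
using SparseArrays

# Hypothetical, simplified conversion from (I, J, V) triples to CSC.
# Assumes no duplicate (row, col) pairs; the real sparse() also combines duplicates.
function triples_to_csc(I::Vector{Int}, J::Vector{Int}, V::Vector{Float64}, m::Int, n::Int)
    colptr = zeros(Int, n + 1)
    for j in J
        colptr[j + 1] += 1                 # count stored entries per column
    end
    colptr[1] = 1
    for j in 1:n
        colptr[j + 1] += colptr[j]         # prefix sum gives the start of each column
    end
    rowval = similar(I)
    nzval = similar(V)
    next = colptr[1:n]                     # next free slot in each column (a copy)
    for p in eachindex(I)
        q = next[J[p]]
        rowval[q] = I[p]                   # scatter each triple into its column
        nzval[q] = V[p]
        next[J[p]] += 1
    end
    for j in 1:n                           # order the rows within each column
        r = colptr[j]:(colptr[j + 1] - 1)
        perm = sortperm(rowval[r])
        rowval[r] = rowval[r][perm]
        nzval[r] = nzval[r][perm]
    end
    return SparseMatrixCSC(m, n, colptr, rowval, nzval)
end

I = [3, 1, 2, 1]; J = [2, 2, 1, 3]; V = [1.0, 2.0, 3.0, 4.0]
B = triples_to_csc(I, J, V, 3, 3)
```

Even in this stripped-down form the data is read several times and copied into a second set of arrays, so the peak memory is well above the size of the final matrix.
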
> >
> > Where did you use @inbounds and @simd?
> >
> > -viral
> >
> >
> >
> > On 30-Apr-2014, at 1:11 pm, Dominique Orban <[email protected]>
> wrote:
> >
> > > Downgrading the 700,000 to 70,000 for the sake of not waiting all
> night, the original implementation takes about 4.3 seconds on my laptop.
> Preallocating arrays and using @inbounds brings it down to about 0.6
> seconds. @simd doesn't seem to provide any further speedup. Building the
> sparse matrix takes about 3.8 seconds. This may be due to conversion from
> triple to csc format?!
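
The code behind these timings was not posted, so the following is only a guess at what "preallocating arrays and using @inbounds" might look like for the loop in the original post. The function name `build_triples` and the reduced sizes are hypothetical, and the syntax is current Julia 1.x rather than the 2014 syntax used elsewhere in this thread.

```julia
using SparseArrays

# Hypothetical rewrite of the original push!-based loop:
# allocate the three vectors once, then fill them by index.
function build_triples(n, per_col, stride)
    total = n * per_col
    row_ind = Vector{Int}(undef, total)
    col_ind = Vector{Int}(undef, total)
    value = Vector{Float64}(undef, total)
    pos = 0
    @inbounds for j in 1:n                 # bounds checks skipped; pos never exceeds total
        for k in 1:per_col
            pos += 1
            row_ind[pos] = k * stride
            col_ind[pos] = j
            value[pos] = 5.0
        end
    end
    return row_ind, col_ind, value
end

row_ind, col_ind, value = build_triples(1_000, 7, 50)   # the thread uses 700_000 (or 70_000), 700, 50
a = sparse(row_ind, col_ind, value, 1_000, 1_000)
```

Wrapping the loop in a function also matters for speed in Julia, independently of the preallocation.
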
> > >
> > > ps: using the original size of 700,000, Julia reports a memory usage
> of 11.8GB.
> > >
> > >
> > > On Wednesday, April 30, 2014 12:26:02 AM UTC-7, Viral Shah wrote:
> > > I believe the memory requirement should be 700000*700*16 (64-bit
> nonzeros and row indices) + 700001*8 (64-bit column pointers) = 7.8 GB.
> > >
> > > This can be brought down a bit by using 32-bit index values and 64-bit
> floats, but then you need 5.8 GB. Finally, if you use 32-bit index values
> with 32-bit floats, you can come down to 4GB. The Julia sparse matrix
> implementation is quite flexible and allows you to easily do such things.
> > >
> > >
> > > julia> s = sparse(int32(1:10), int32(1:10), 1.0);
> > >
> > > julia> typeof(s)
> > > SparseMatrixCSC{Float64,Int32} (constructor with 1 method)
> > >
> > > julia> s = sparse(int32(1:10), int32(1:10), float32(1.0));
> > >
> > > julia> typeof(s)
> > > SparseMatrixCSC{Float32,Int32} (constructor with 1 method)
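
The storage estimates above can be reproduced with a short calculation: a CSC matrix stores one value and one row index per nonzero, plus one column pointer per column and one extra. The helper name `csc_bytes` is hypothetical, and the sketch uses current Julia 1.x syntax.

```julia
# Hypothetical helper: bytes needed by the three CSC arrays
# for value type Tv and index type Ti.
csc_bytes(nnz, ncols, Tv, Ti) = nnz * (sizeof(Tv) + sizeof(Ti)) + (ncols + 1) * sizeof(Ti)

nnz_total = 700_000 * 700   # number of stored entries in the original example

for (Tv, Ti) in ((Float64, Int64), (Float64, Int32), (Float32, Int32))
    gb = csc_bytes(nnz_total, 700_000, Tv, Ti) / 1e9
    println(Tv, " values, ", Ti, " indices: ", round(gb, digits = 2), " GB")
end
```

This counts only the finished matrix. The row, column, and value vectors passed to sparse() need roughly the same amount again while the conversion runs.
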
> > >
> > >
> > > -viral
> > >
> > > On Wednesday, April 30, 2014 12:36:17 PM UTC+5:30, Ivar Nesje wrote:
> > > Sorry for pointing out a probably obvious problem, but as there are
> others who might try to debug this issue on their laptops: how much
> memory do you have? 700000*700 floats + indexes will take a minimum of 11
> GB (if my math is correct) and possibly more if the asymptotic storage
> requirement is more than 2 Int64 + 1 Float64 per stored value.
> > >
> > > Ivar
> > >
> > > kl. 01:46:22 UTC+2 onsdag 30. april 2014 skrev Ryan Gardner følgende:
> > > Creating sparse arrays seems exceptionally slow.
> > >
> > > I can set up the non-zero data of the array relatively quickly.  For
> example, the following code takes about 80 seconds on one machine.
> > >
> > >
> > > vec_len = 700000
> > >
> > >
> > > row_ind = Uint64[]
> > > col_ind = Uint64[]
> > > value = Float64[]
> > >
> > >
> > > for j = 1:vec_len
> > >    for k = 1:700
> > >       ind = k*50
> > >       push!(row_ind, ind)
> > >       push!(col_ind, j)
> > >       push!(value, 5.0)
> > >    end
> > > end
> > >
> > >
> > > but then
> > >
> > > a = sparse(row_ind, col_ind, value, 700000, 700000)
> > >
> > >
> > > takes at least 30 minutes.  (I never let it finish.)
> > >
> > > It doesn't seem like the numbers I'm using should be that far off the
> scale.  Is there a more efficient way I should be doing what I'm doing?  Am
> I missing something and asking for something that really is impractical?
> > >
> > > If not, I may be able to look into the sparse matrix code a little
> this weekend.
> > >
> > >
> > > The never-finishing result is the same if I try
> > >
> > > sprand(700000, 700000, .001)
> > >
> > > or if I try to set 700000*700 values in a sparse matrix of zeros
> directly.  Thanks.
> > >
> > >
> >
> >
>
>
