Octave 3.6 just gave up: octave:1> tic; sprand(700000, 700000, .001); toc; error: memory exhausted or requested size too large for range of Octave's index type -- trying to return to prompt
-viral On 30-Apr-2014, at 9:08 pm, Viral Shah <[email protected]> wrote: > You can call SparseMatrixCSC directly, but then you have to do all the > arrangement and sorting yourself. Depending on your application and how the > nonzeros are generated, this may or may not help. > > I will investigate this further. I now have all the information I need. > > Thanks, > > -viral > > > > On 30-Apr-2014, at 8:48 pm, Ryan Gardner <[email protected]> wrote: > >> I've got 16GB of RAM on this machine. Largely, my question, with admittedly >> little knowledge of the internal structure of the sparse arrays, is why >> generating the actual SparseMatrixCSC is so much slower than generating what >> is essentially another sparse matrix representation consisting of the >> indices and values. (I realize that once we start swapping, which will >> happen in my example, things slow down a ton, but even the sprand I mention >> was slow.) Do you observe the same results? Is the reason for the >> difference clear to someone else? >> >> Thanks for all the comments. These are helpful. It had not crossed my mind >> that I could control the data type of the indices. >> >> Using the SparseMatrixCSC constructor directly would probably be very >> helpful. Do you learn about that constructor from looking at source code or >> do you see it somewhere else? >> >> I'm also curious about where @inbounds was used. >> >> >> >> >> >> >> On Wed, Apr 30, 2014 at 8:59 AM, Tony Kelman <[email protected]> wrote: >> If you're assembling the matrix in row-sorted column-major order and there's >> no duplication, then you can also skip the conversion work by using the >> SparseMatrixCSC constructor directly. >> >> >> On Wednesday, April 30, 2014 1:10:31 AM UTC-7, Viral Shah wrote: >> Could you post your code? Will avoid me writing the same. :-) >> >> Was building the vectors taking all the time, or was it in building the >> sparse matrix from the triples? Triples to CSC conversion is an expensive >> operation, and we have spent a fair amount of time making it fast. Of >> course, there could be more opportunities at speeding the code. >> >> Where did you use @inbounds and @simd? >> >> -viral >> >> >> >> On 30-Apr-2014, at 1:11 pm, Dominique Orban <[email protected]> wrote: >> >>> Downgrading the 700,000 to 70,000 for the sake of not waiting all night, >>> the original implementation takes about 4.3 seconds on my laptop. >>> Preallocating arrays and using @inbounds brings it down to about 0.6 >>> seconds. @simd doesn't seem to provide any further speedup. Building the >>> sparse matrix takes about 3.8 seconds. This may be due to conversion from >>> triple to csc format?! >>> >>> ps: using the original size of 700,000, Julia reports a memory usage of >>> 11.8GB. >>> >>> >>> On Wednesday, April 30, 2014 12:26:02 AM UTC-7, Viral Shah wrote: >>> I believe the memory requirement should be 700000*700*16 (64-bit nonzeros >>> and row indices) + 700001*8 (64-bit column pointers) = 7.8 GB. >>> >>> This can be brought down a bit by using 32-bit index values and 64-bit >>> floats, but then you need 5.8 GB. Finally, if you use 32-bit index values >>> with 32-bit floats, you can come down to 4GB. The Julia sparse matrix >>> implementation is quite flexible and allows you to easily do such things. >>> >>> >>> julia> s = sparse(int32(1:10), int32(1:10), 1.0); >>> >>> julia> typeof(s) >>> SparseMatrixCSC{Float64,Int32} (constructor with 1 method) >>> >>> julia> s = sparse(int32(1:10), int32(1:10), float32(1.0)); >>> >>> julia> typeof(s) >>> SparseMatrixCSC{Float32,Int32} (constructor with 1 method) >>> >>> >>> -viral >>> >>> On Wednesday, April 30, 2014 12:36:17 PM UTC+5:30, Ivar Nesje wrote: >>> Sorry for pointing out a probably obvious problem, but as there are others >>> that might try debug this issue on their laptop, I ask how much memory do >>> you have? 700000*700 floats + indexes, will spend a minimum of 11 GB (if my >>> math is correct) and possibly more if the asymptotic storage requirement is >>> more than 2 Int64 + 1 Float64 per stored value. >>> >>> Ivar >>> >>> kl. 01:46:22 UTC+2 onsdag 30. april 2014 skrev Ryan Gardner følgende: >>> Creating sparse arrays seems exceptionally slow. >>> >>> I can set up the non-zero data of the array relatively quickly. For >>> example, the following code takes about 80 seconds on one machine. >>> >>> >>> vec_len = 700000 >>> >>> >>> row_ind = Uint64[] >>> col_ind = Uint64[] >>> value = Float64[] >>> >>> >>> for j = 1:700000 >>> for k = 1:700 >>> ind = k*50 >>> push!(row_ind, ind) >>> push!(col_ind, j) >>> push!(value, 5.0) >>> end >>> end >>> >>> >>> but then >>> >>> a = sparse(row_ind, col_ind, value, 700000, 700000) >>> >>> >>> takes more than at least about 30 minutes. (I never let it finish.) >>> >>> It doesn't seem like the numbers I'm using should be that far off the >>> scale. Is there a more efficient way I should be doing what I'm doing? Am >>> I missing something and asking for something that really is impractical? >>> >>> If not, I may be able to look into the sparse matrix code a little this >>> weekend. >>> >>> >>> The never-finishing result is the same if I try >>> >>> sprand(700000, 700000, .001) >>> >>> or if I try to set 700000*700 values in a sparse matrix of zeros directly. >>> Thanks. >>> >>> >> >> >
