Re: [julia-users] big matrices in Julia

Matt Bauman Fri, 14 Aug 2015 09:09:50 -0700

You could create a "phony" 2-dimensional array that computes the distances 
on the fly… but you won't be able to pass this matrix to, e.g., BLAS.


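struct DistanceMatrix <: AbstractArray{Float64, 2}
    locs::Array{Float64, 2} # a 2xN or 3xN matrix containing the location coordinates
end
# Assumed metric: Euclidean distance; swap in whatever `dist` you need
dist(a, b) = sqrt(sum(abs2, a - b))
# One row and one column per location, i.e. per column of locs
Base.size(A::DistanceMatrix) = (size(A.locs, 2), size(A.locs, 2))
# Computed on demand; could be further optimized (each call copies two columns)
Base.getindex(A::DistanceMatrix, i::Int, j::Int) = dist(A.locs[:, i], A.locs[:, j])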

(Untested)
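
A minimal usage sketch of the same idea, written in Julia 1.x syntax and assuming Euclidean distance. The names LazyDistances and euclid and the sample coordinates are made up for illustration. The point is that reductions can go through getindex without ever allocating the NxN array:

```julia
# Hypothetical stand-in for the type above, with one column of `locs` per site.
struct LazyDistances <: AbstractMatrix{Float64}
    locs::Matrix{Float64}   # 2xN or 3xN coordinates
end

# Assumed metric: plain Euclidean distance between columns i and j of X,
# computed without allocating temporary column copies.
function euclid(X::Matrix{Float64}, i::Int, j::Int)
    s = 0.0
    for k in 1:size(X, 1)
        s += (X[k, i] - X[k, j])^2
    end
    return sqrt(s)
end

Base.size(A::LazyDistances) = (size(A.locs, 2), size(A.locs, 2))
Base.getindex(A::LazyDistances, i::Int, j::Int) = euclid(A.locs, i, j)

# Three sites in the plane: (0,0), (3,4) and (6,8).
locs = [0.0 3.0 6.0;
        0.0 4.0 8.0]
D = LazyDistances(locs)

# Only the 2x3 coordinate array is stored; entries are computed when indexed.
# Distance from each site to its nearest other site:
nearest = [minimum(D[i, j] for j in 1:size(D, 2) if j != i) for i in 1:size(D, 1)]
```

Memory use stays O(N) instead of O(N^2), at the cost of recomputing a distance every time an entry is read.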

On Friday, August 14, 2015 at 11:03:06 AM UTC-4, Stefan Karpinski wrote:
>
> On Friday, August 14, 2015, Charles Novaes de Santana wrote:
>
>>
>> 1) to use only the subset of suitable habitats to build the matrix of 
>> distances (and then to use sparse matrix as suggested by Stefan)
>>
>
> Distance matrices are not usually sparse: most pairs of points are far 
> apart, so large distances are the most common entries and the least 
> interesting. However, you could store only distances for close points in a 
> sparse matrix and use zero to represent the distance between pairs of 
> points that are not close enough to be of interest. Either that or you 
> could store 1/d instead of d and then closer points have higher weights and 
> you can threshold 1/distance so that far apart points have zero entries.
>  
>
>> 2) to use a machine with more memory and try to run my models using the 
>> matrices with all the sites
>>
>
> This is probably the easiest thing to do since your data set is not of a 
> truly unreasonable size, just largish. However, you may be much happier if 
> you can make your problem smaller than O(n^2).
>  
>
>> 3) to try another language/library that might work better with such a 
>> large amount of data (like Python or R).
>>
>
> This problem isn't going to be fundamentally different no matter what 
> language you use: you have more data than fits in memory. Spilling memory 
> to disk is going to be *much* slower than just recomputing distances – 
> orders of magnitude slower. As John suggested, is there any particular 
> reason you need to materialize all of these values in a matrix? What 
> computation are you going to perform over that matrix?
>
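
The thresholded 1/d idea quoted above could look roughly like the sketch below. It assumes Julia 1.x with the SparseArrays standard library and Euclidean distance; the function name proximity_weights and the cutoff value are hypothetical. Note that it still visits all N^2 pairs once, but only the close pairs are stored:

```julia
using SparseArrays   # standard library in Julia 1.x

# Build a sparse matrix of weights 1/d, keeping only pairs no farther apart
# than `cutoff`. `locs` holds one site per column; `cutoff` is a hypothetical
# distance threshold chosen by the user.
function proximity_weights(locs::Matrix{Float64}, cutoff::Float64)
    n = size(locs, 2)
    rows, cols, vals = Int[], Int[], Float64[]
    for j in 1:n, i in 1:n
        i == j && continue                     # skip the zero self-distance
        d = sqrt(sum(abs2, locs[:, i] - locs[:, j]))
        if 0 < d <= cutoff                     # far-apart pairs stay implicit zeros
            push!(rows, i)
            push!(cols, j)
            push!(vals, 1 / d)
        end
    end
    return sparse(rows, cols, vals, n, n)
end

# Sites at (0,0), (3,4) and (60,80); only the first two are within the cutoff.
locs = [0.0 3.0 60.0;
        0.0 4.0 80.0]
W = proximity_weights(locs, 10.0)
```

Here a zero entry means "too far apart to matter", which is why storing 1/d rather than d keeps the sparse representation meaningful.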
