I'm developing an iterative optimization algorithm in Julia along the lines
of other contributions to the IterativeSolvers project
<https://github.com/JuliaLang/IterativeSolvers.jl> or its Krylov subspace
module
<https://github.com/JuliaLang/IterativeSolvers.jl/blob/master/src/krylov.jl>,
whose only computationally intensive step is computing A*b or A'*b. I would
like to parallelize the method with a parallel sparse matrix-vector
multiply. Is there a standard backend matrix-vector multiply that's
recommended in Julia if I'm targeting a shared-memory machine with a large
number of processors? Similarly, is there a recommended backend for
targeting a cluster? My matrices can easily reach 10 million rows by 1
million columns, with density anywhere from 0.01% down to problems that are
nearly diagonal.
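For the shared-memory case, one option is to exploit the fact that for a
matrix stored in CSC format, A'*b parallelizes naturally over columns
(column j of A touches only entry j of the result, so there are no write
conflicts). A minimal sketch, using the `SparseArrays` and `Threads`
standard-library APIs (names differ in older Julia versions; the function
name `threaded_At_mul_b` is my own, not a library routine):

```julia
using SparseArrays, Base.Threads

# Compute y = A'*b for a SparseMatrixCSC, threading over columns.
# Each column j of A contributes only to y[j], so no locking is needed.
function threaded_At_mul_b(A::SparseMatrixCSC{Float64,Int}, b::Vector{Float64})
    m, n = size(A)
    @assert length(b) == m
    y = Vector{Float64}(undef, n)
    rows = rowvals(A)    # row index of each stored nonzero
    vals = nonzeros(A)   # value of each stored nonzero
    @threads for j in 1:n
        s = 0.0
        for k in nzrange(A, j)   # stored entries of column j
            s += vals[k] * b[rows[k]]
        end
        y[j] = s
    end
    return y
end
```

Parallelizing A*b itself over CSC columns is harder, since different
columns scatter into the same output entries; storing A' as well (or using
a CSR-style layout) sidesteps that at the cost of memory.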

I've seen many posts <https://github.com/JuliaLang/julia/issues/2645>
discussing integrating PETSc as a backend for this purpose, but it looks
like the project
<https://github.com/petsc/petsc/blob/master/bin/julia/PETSc.jl> has
stalled: the last commits I see are a year old. I'm also interested in
other backends, e.g. Spark <http://spark.incubator.apache.org/>, SciDB
<http://scidb.org/>, etc.

I'm more interested in solving sparse problems, but as a side note, the
built-in BLAS acceleration, controlled by `blas_set_num_threads`, works OK
for dense problems on a moderate number of processors. I wonder why the
number of threads isn't set higher than one by default, for example using
as many as nprocs() cores?
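For reference, a minimal sketch of the dense case, using the modern
`LinearAlgebra.BLAS` names (the top-level `blas_set_num_threads` was the
older spelling of the same call):

```julia
using LinearAlgebra

# Ask OpenBLAS to use one thread per hardware thread.
# (Older Julia: blas_set_num_threads(n); current: BLAS.set_num_threads(n).)
BLAS.set_num_threads(Sys.CPU_THREADS)

A = randn(2000, 2000)
b = randn(2000)
y = A * b   # this dense matvec now runs on the requested thread count
```

The thread count applies process-wide to every subsequent BLAS call, which
may be part of why the default is conservative.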
