A*b will not call MKL when A is sparse. There has been some discussion about
making an MKL package that overrides A_mul_B(Matrix, Vector) with the MKL
versions, and I actually wrote wrappers for the sparse MKL subroutines last
fall for the same reason.
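For reference, here is a rough, untested sketch of what such a wrapper can
look like. It assumes MKL's `mkl_dcscmv` sparse BLAS routine and that
`libmkl_rt` is on the library search path; the argument order follows the MKL
Sparse BLAS reference (everything is passed by reference, Fortran style), and
the function and method names below are my own, not an existing package:

```julia
# Rough sketch, not a tested implementation: y := alpha*A*x + beta*y via
# MKL's mkl_dcscmv. A SparseMatrixCSC already stores one-based CSC arrays,
# which matches a matdescra ending in 'F' (Fortran/one-based indexing).
function cscmv!(transa::Char, alpha::Float64,
                A::SparseMatrixCSC{Float64,Int32},
                x::Vector{Float64}, beta::Float64, y::Vector{Float64})
    matdescra = "GXXF"          # 'G': general matrix, 'F': one-based indices
    t = uint8(transa)
    m = int32(A.m)
    k = int32(A.n)
    ccall((:mkl_dcscmv, :libmkl_rt), Void,
          (Ptr{Uint8}, Ptr{Int32}, Ptr{Int32}, Ptr{Float64}, Ptr{Uint8},
           Ptr{Float64}, Ptr{Int32}, Ptr{Int32}, Ptr{Int32},
           Ptr{Float64}, Ptr{Float64}, Ptr{Float64}),
          &t, &m, &k, &alpha, matdescra,
          A.nzval, A.rowval,
          A.colptr, pointer(A.colptr, 2),   # pntrb / pntre from colptr
          x, &beta, y)
    y
end

# One way to make plain A*x dispatch to MKL: override the generic method.
Base.A_mul_B!(y::Vector{Float64}, A::SparseMatrixCSC{Float64,Int32},
              x::Vector{Float64}) = cscmv!('N', 1.0, A, x, 0.0, y)
```

Overriding A_mul_B! like this is type piracy on Base, which is why people
have talked about shipping it as an opt-in MKL package rather than a default.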


2014-02-05 Madeleine Udell <[email protected]>:

> Miles, you're right that writing sparse matrix vector products in native
> Julia probably won't be the best idea given Julia's model of parallelism.
> That's why I'm interested in calling an outside library like PETSc.
>
> I see it's possible to link Julia with MKL. I haven't tried this yet, but
> if I do, will A*b (where A is sparse) call MKL to perform the matrix vector
> product?
>
>
> On Wed, Feb 5, 2014 at 11:43 AM, Miles Lubin <[email protected]> wrote:
>
>> Memory access is typically a significant bottleneck in sparse mat-vec, so
>> unfortunately I'm skeptical that one could achieve good performance using
>> Julia's current distributed memory approach on a multicore machine. This
>> really calls for something like OpenMP.
>>
>>
>> On Wednesday, February 5, 2014 11:42:00 AM UTC-5, Madeleine Udell wrote:
>>>
>>> I'm developing an iterative optimization algorithm in Julia along the
>>> lines of other contributions to the Iterative Solvers project
>>> <https://github.com/JuliaLang/IterativeSolvers.jl> or its Krylov
>>> Subspace module
>>> <https://github.com/JuliaLang/IterativeSolvers.jl/blob/master/src/krylov.jl>,
>>> whose only computationally intensive step is computing A*b or A'*b. I
>>> would like to parallelize the method by using a parallel sparse matrix
>>> vector multiply. Is there a standard backend matrix-vector multiply
>>> that's recommended in Julia if I'm targeting a shared memory computer
>>> with a large number of processors? Similarly, is there a recommended
>>> backend for targeting a cluster? My matrices can easily reach 10 million
>>> rows by 1 million columns, with sparsity anywhere from .01% down to
>>> problems that are nearly diagonal.
>>>
>>> I've seen many posts <https://github.com/JuliaLang/julia/issues/2645>
>>> talking about integrating PETSc as a backend for this purpose, but it
>>> looks like the project
>>> <https://github.com/petsc/petsc/blob/master/bin/julia/PETSc.jl> has
>>> stalled - the last commits I see are a year ago. I'm also interested in
>>> other backends, e.g. Spark <http://spark.incubator.apache.org/>, SciDB
>>> <http://scidb.org/>, etc.
>>>
>>> I'm more interested in solving sparse problems, but as a side note, the
>>> built-in BLAS acceleration, enabled by setting the number of threads
>>> with `blas_set_num_threads`, works ok for dense problems using a
>>> moderate number of processors. I wonder why the number of threads isn't
>>> set higher than one by default, for example, to as many as nprocs()
>>> cores?
>>>
>>
>
>
> --
> Madeleine Udell
> PhD Candidate in Computational and Mathematical Engineering
> Stanford University
> www.stanford.edu/~udell
>



-- 
Best regards

Andreas Noack Jensen
