Thanks, Andreas. I've tried to follow your directions, but am still getting Base.libblas_name = "libopenblas". Details follow:
I've created a file Make.mkl and replaced Make.inc with Make.mkl in the Makefile. In Make.mkl, I've set USE_MKL=1 and set MKLROOT to my install location:

    export MKLROOT=/opt/intel/mkl

I run make cleanall with no problems, but when I try

    make -C distclean-arpack
    make -C distclean-suitesparse

I get the error

    make: *** distclean-arpack: No such file or directory.  Stop.

Instead, I've run

    make -C deps distclean-arpack
    make -C deps distclean-suitesparse

(I'm not sure that's doing what was intended.) Now I recompile julia (very quickly!) with

    make -j 60

but still see

    julia> Base.libblas_name
    "libopenblas"

Any ideas?

Thanks!
Madeleine

On Mon, Feb 10, 2014 at 12:08 PM, Andreas Noack Jensen <[email protected]> wrote:

> Hi Madeleine
>
> When compiling julia with MKL you'll have to do make cleanall as well as
> rebuild ARPACK and SuiteSparse, i.e. make -C distclean-arpack and
> make -C distclean-suitesparse. It is also easier to create a Make.user
> where you set USE_MKL=1 and MKLROOT to the location of your MKL library
> files, e.g. /opt/intel/mkl. The arguments are explained here
>
> http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2.htm#GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2
>
> but transa determines whether the operation is Ax or A'x, and matdescra
> carries information about the structure of the matrix, e.g. whether it is
> triangular. When you have succeeded in compiling julia with MKL, the
> libblas variable should just be Base.libblas_name.
>
> 2014-02-10 20:37 GMT+01:00 Madeleine Udell <[email protected]>:
>
>> I'm having trouble using MKL in julia. I changed Make.inc so that
>> USE_MKL=1, but when I make and run julia, I find that Base.libblas_name =
>> "libopenblas". Is this expected? I would have thought it would be e.g.
>> "libmkl_core".
>>
>> Andreas, I found your wrappers for MKL in this PR
>> <https://github.com/JuliaLang/julia/pull/4525/files>. I've not used MKL
>> before, so I don't understand the call signature of those functions well
>> enough to call MKL directly. Any chance you could explain what
>> transa::BlasChar and matdescra::ASCIIString are in the following function,
>> and which MKL library is expected in the libblas variable? I see many .so
>> files in the lib/intel64 directory of my MKL installation, and I'm not
>> sure which one to point julia to.
>>
>> function cscmv!(transa::BlasChar, α::$T, matdescra::ASCIIString,
>>                 A::SparseMatrixCSC{$T, BlasInt}, x::StridedVector{$T},
>>                 β::$T, y::StridedVector{$T})
>>     length(x) == A.n || throw(DimensionMismatch("Matrix with $(A.n) columns multiplied with vector of length $(length(x))"))
>>     length(y) == A.m || throw(DimensionMismatch("Vector of length $(A.m) added to vector of length $(length(y))"))
>>     ccall(($(string(mv)), libblas), Void,
>>           (Ptr{Uint8}, Ptr{BlasInt}, Ptr{BlasInt}, Ptr{$T},
>>            Ptr{Uint8}, Ptr{$T}, Ptr{BlasInt}, Ptr{BlasInt},
>>            Ptr{BlasInt}, Ptr{$T}, Ptr{$T}, Ptr{$T}),
>>           &transa, &A.m, &A.n, &α,
>>           matdescra, A.nzval, A.rowval, pointer(A.colptr, 1),
>>           pointer(A.colptr, 2), x, &β, y)
>>     return y
>> end
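>>
>> To make the question concrete, here is how I would guess cscmv! is meant
>> to be called to compute y = alpha*A*x + beta*y. This is only a guess: I'm
>> assuming matdescra follows MKL's descriptor convention (first character
>> 'G' for a general matrix, fourth character 'F' for one-based indexing,
>> with the middle two characters ignored for general matrices), and that
>> transa = 'N' selects A*x while 'T' selects A'*x.
>>
>>     # Guessed usage of the (unmerged) PR 4525 wrapper, on a 64-bit build
>>     # where BlasInt == Int, so sprand's index type matches the signature.
>>     A = sprand(1000, 500, 0.01)   # SparseMatrixCSC{Float64,Int}
>>     x = rand(500)
>>     y = zeros(1000)
>>     cscmv!('N', 1.0, "GUNF", A, x, 0.0, y)   # y = 1.0*A*x + 0.0*y
>>
>> Does that look right?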
>>
>> Thanks for your help!
>> Madeleine
>>
>> On Wed, Feb 5, 2014 at 1:49 PM, Andreas Noack Jensen <[email protected]> wrote:
>>
>>> A*b will not call MKL when A is sparse. There has been some discussion
>>> about making an MKL package that overwrites A_mul_B(Matrix,Vector) with
>>> the MKL versions, and I actually wrote wrappers for the sparse MKL
>>> subroutines in the fall for the same reason.
>>>
>>> 2014-02-05 Madeleine Udell <[email protected]>:
>>>
>>>> Miles, you're right that writing sparse matrix-vector products in
>>>> native Julia probably won't be the best idea given Julia's model of
>>>> parallelism. That's why I'm interested in calling an outside library
>>>> like PETSc.
>>>>
>>>> I see it's possible to link Julia with MKL. I haven't tried this yet,
>>>> but if I do, will A*b (where A is sparse) call MKL to perform the
>>>> matrix-vector product?
>>>>
>>>> On Wed, Feb 5, 2014 at 11:43 AM, Miles Lubin <[email protected]> wrote:
>>>>
>>>>> Memory access is typically a significant bottleneck in sparse
>>>>> mat-vec, so unfortunately I'm skeptical that one could achieve good
>>>>> performance using Julia's current distributed memory approach on a
>>>>> multicore machine. This really calls for something like OpenMP.
>>>>>
>>>>> On Wednesday, February 5, 2014 11:42:00 AM UTC-5, Madeleine Udell wrote:
>>>>>
>>>>>> I'm developing an iterative optimization algorithm in Julia along
>>>>>> the lines of other contributions to the IterativeSolvers project
>>>>>> <https://github.com/JuliaLang/IterativeSolvers.jl> or its Krylov
>>>>>> subspace module
>>>>>> <https://github.com/JuliaLang/IterativeSolvers.jl/blob/master/src/krylov.jl>,
>>>>>> whose only computationally intensive step is computing A*b or A'*b.
>>>>>> I would like to parallelize the method by using a parallel sparse
>>>>>> matrix-vector multiply. Is there a standard backend matrix-vector
>>>>>> multiply that's recommended in Julia if I'm targeting a
>>>>>> shared-memory computer with a large number of processors? Similarly,
>>>>>> is there a recommended backend for targeting a cluster? My matrices
>>>>>> can easily reach 10 million rows by 1 million columns, with sparsity
>>>>>> anywhere from .01% to problems that are nearly diagonal.
>>>>>>
>>>>>> I've seen many posts <https://github.com/JuliaLang/julia/issues/2645>
>>>>>> talking about integrating PETSc as a backend for this purpose, but
>>>>>> it looks like the project
>>>>>> <https://github.com/petsc/petsc/blob/master/bin/julia/PETSc.jl> has
>>>>>> stalled - the last commits I see are a year old. I'm also interested
>>>>>> in other backends, e.g. Spark <http://spark.incubator.apache.org/>,
>>>>>> SciDB <http://scidb.org/>, etc.
>>>>>>
>>>>>> I'm more interested in solving sparse problems, but as a side note,
>>>>>> the built-in BLAS acceleration, with the number of threads set via
>>>>>> blas_set_num_threads, works OK for dense problems using a moderate
>>>>>> number of processors. I wonder why the number of threads isn't set
>>>>>> higher than one by default - for example, using as many as nprocs()
>>>>>> cores?
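>>>>>>
>>>>>> For reference, this is the kind of experiment I mean - a rough
>>>>>> sketch, assuming CPU_CORES reports the machine's core count:
>>>>>>
>>>>>>     # Compare single-threaded and multi-threaded dense mat-vec.
>>>>>>     A = rand(5000, 5000)
>>>>>>     b = rand(5000)
>>>>>>     blas_set_num_threads(1)
>>>>>>     @time for i in 1:100; A*b; end
>>>>>>     blas_set_num_threads(CPU_CORES)   # let BLAS use every core
>>>>>>     @time for i in 1:100; A*b; end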
--
Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell