Thanks, Andreas. I've tried to follow your directions, but am still getting Base.libblas_name = "libopenblas". Details follow:
I've created a file Make.mkl and replaced Make.inc with Make.mkl in the Makefile. In Make.mkl, I've set USE_MKL=1 and set MKLROOT to my install location:

    export MKLROOT=/opt/intel/mkl

I run make cleanall with no problems, but when I try

    make -C distclean-arpack
    make -C distclean-suitesparse

I get the error

    make: *** distclean-arpack: No such file or directory.  Stop.

Instead, I've run

    make -C deps distclean-arpack
    make -C deps distclean-suitesparse

(I'm not sure that's doing what was intended.) Now I recompile julia (very quickly!) with

    make -j 60

but still see

    julia> Base.libblas_name
    "libopenblas"

Any ideas?

Thanks!
Madeleine

On Mon, Feb 10, 2014 at 12:08 PM, Andreas Noack Jensen <[email protected]> wrote:

> Hi Madeleine
>
> When compiling julia with MKL you'll have to do make cleanall as well as
> rebuild ARPACK and SuiteSparse, i.e. make -C distclean-arpack and
> make -C distclean-suitesparse. It is also easier to create a Make.user
> where you set USE_MKL=1 and MKLROOT to the location of your MKL library
> files, e.g. /opt/intel/mkl. The arguments are explained here
>
> http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2.htm#GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2
>
> but transa determines whether the operation is Ax or A'x, and matdescra
> carries information about the structure of the matrix, e.g. whether it is
> triangular. When you have succeeded in compiling julia with MKL, the
> libblas variable should just be Base.libblas_name.
>
> 2014-02-10 20:37 GMT+01:00 Madeleine Udell <[email protected]>:
>
>> I'm having trouble using MKL in julia. I changed Make.inc so that
>> USE_MKL=1, but when I make and run julia, I find that Base.libblas_name =
>> "libopenblas". Is this expected? I would have thought it would be e.g.
>> "libmkl_core".
>>
>> Andreas, I found your wrappers for MKL in this PR
>> <https://github.com/JuliaLang/julia/pull/4525/files>. I've not used MKL
>> before, so I don't understand the call signature of those functions well
>> enough to call MKL directly. Any chance you could explain what
>> transa::BlasChar and matdescra::ASCIIString are in the following function,
>> and which MKL library is expected in the libblas variable? I see many .so
>> files in the lib/intel64 directory of my MKL installation, and I'm not
>> sure which one to point julia to.
>>
>> function cscmv!(transa::BlasChar, α::$T, matdescra::ASCIIString,
>>                 A::SparseMatrixCSC{$T, BlasInt}, x::StridedVector{$T},
>>                 β::$T, y::StridedVector{$T})
>>     length(x) == A.n || throw(DimensionMismatch("Matrix with $(A.n) columns multiplied with vector of length $(length(x))"))
>>     length(y) == A.m || throw(DimensionMismatch("Vector of length $(A.m) added to vector of length $(length(y))"))
>>     ccall(($(string(mv)), libblas), Void,
>>           (Ptr{Uint8}, Ptr{BlasInt}, Ptr{BlasInt}, Ptr{$T},
>>            Ptr{Uint8}, Ptr{$T}, Ptr{BlasInt}, Ptr{BlasInt},
>>            Ptr{BlasInt}, Ptr{$T}, Ptr{$T}, Ptr{$T}),
>>           &transa, &A.m, &A.n, &α,
>>           matdescra, A.nzval, A.rowval, pointer(A.colptr, 1),
>>           pointer(A.colptr, 2), x, &β, y)
>>     return y
>> end
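>>
>> To make the question concrete, here is how I would guess cscmv! is meant
>> to be called to compute y = alpha*A*x + beta*y. This is only a guess: I'm
>> assuming matdescra follows MKL's descriptor convention (first character
>> 'G' for a general matrix, fourth character 'F' for one-based indexing,
>> with the middle two characters ignored for general matrices), and that
>> transa = 'N' selects A*x while 'T' selects A'*x.
>>
>>     # Guessed usage of the (unmerged) PR 4525 wrapper, on a 64-bit build
>>     # where BlasInt == Int, so sprand's index type matches the signature.
>>     A = sprand(1000, 500, 0.01)   # SparseMatrixCSC{Float64,Int}
>>     x = rand(500)
>>     y = zeros(1000)
>>     cscmv!('N', 1.0, "GUNF", A, x, 0.0, y)   # y = 1.0*A*x + 0.0*y
>>
>> Does that look right?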
>>
>> Thanks for your help!
>> Madeleine
>>
>> On Wed, Feb 5, 2014 at 1:49 PM, Andreas Noack Jensen <[email protected]> wrote:
>>
>>> A*b will not call MKL when A is sparse. There has been some discussion
>>> about making an MKL package that overwrites A_mul_B(Matrix,Vector) with
>>> the MKL versions, and I actually wrote wrappers for the sparse MKL
>>> subroutines in the fall for the same reason.
>>>
>>> 2014-02-05 Madeleine Udell <[email protected]>:
>>>
>>>> Miles, you're right that writing sparse matrix-vector products in
>>>> native Julia probably won't be the best idea given Julia's model of
>>>> parallelism. That's why I'm interested in calling an outside library
>>>> like PETSc.
>>>>
>>>> I see it's possible to link Julia with MKL. I haven't tried this yet,
>>>> but if I do, will A*b (where A is sparse) call MKL to perform the
>>>> matrix-vector product?
>>>>
>>>> On Wed, Feb 5, 2014 at 11:43 AM, Miles Lubin <[email protected]> wrote:
>>>>
>>>>> Memory access is typically a significant bottleneck in sparse
>>>>> mat-vec, so unfortunately I'm skeptical that one could achieve good
>>>>> performance using Julia's current distributed memory approach on a
>>>>> multicore machine. This really calls for something like OpenMP.
>>>>>
>>>>> On Wednesday, February 5, 2014 11:42:00 AM UTC-5, Madeleine Udell wrote:
>>>>>
>>>>>> I'm developing an iterative optimization algorithm in Julia along
>>>>>> the lines of other contributions to the IterativeSolvers project
>>>>>> <https://github.com/JuliaLang/IterativeSolvers.jl> or its Krylov
>>>>>> subspace module
>>>>>> <https://github.com/JuliaLang/IterativeSolvers.jl/blob/master/src/krylov.jl>,
>>>>>> whose only computationally intensive step is computing A*b or A'*b.
>>>>>> I would like to parallelize the method by using a parallel sparse
>>>>>> matrix-vector multiply. Is there a standard backend matrix-vector
>>>>>> multiply that's recommended in Julia if I'm targeting a
>>>>>> shared-memory computer with a large number of processors? Similarly,
>>>>>> is there a recommended backend for targeting a cluster? My matrices
>>>>>> can easily reach 10 million rows by 1 million columns, with sparsity
>>>>>> anywhere from .01% to problems that are nearly diagonal.
>>>>>>
>>>>>> I've seen many posts <https://github.com/JuliaLang/julia/issues/2645>
>>>>>> talking about integrating PETSc as a backend for this purpose, but
>>>>>> it looks like the project
>>>>>> <https://github.com/petsc/petsc/blob/master/bin/julia/PETSc.jl> has
>>>>>> stalled - the last commits I see are a year old. I'm also interested
>>>>>> in other backends, e.g. Spark <http://spark.incubator.apache.org/>,
>>>>>> SciDB <http://scidb.org/>, etc.
>>>>>>
>>>>>> I'm more interested in solving sparse problems, but as a side note,
>>>>>> the built-in BLAS acceleration, with the number of threads set via
>>>>>> blas_set_num_threads, works OK for dense problems using a moderate
>>>>>> number of processors. I wonder why the number of threads isn't set
>>>>>> higher than one by default - for example, using as many as nprocs()
>>>>>> cores?
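>>>>>>
>>>>>> For reference, this is the kind of experiment I mean - a rough
>>>>>> sketch, assuming CPU_CORES reports the machine's core count:
>>>>>>
>>>>>>     # Compare single-threaded and multi-threaded dense mat-vec.
>>>>>>     A = rand(5000, 5000)
>>>>>>     b = rand(5000)
>>>>>>     blas_set_num_threads(1)
>>>>>>     @time for i in 1:100; A*b; end
>>>>>>     blas_set_num_threads(CPU_CORES)   # let BLAS use every core
>>>>>>     @time for i in 1:100; A*b; end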
--
Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell