Hey Madeleine,

First, I would check that your global environment variables
<http://software.intel.com/sites/products/documentation/hpc/mkl/mkl_userguide_lnx/GUID-0DE6A77B-00E0-4ED6-9CAE-52FCF49E5623.htm>
are set up correctly.
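A quick way to check what the Julia process actually sees (a minimal
sketch; MKL_NUM_THREADS, OMP_NUM_THREADS, and MKL_DYNAMIC are the
variables MKL typically consults):

# Print the MKL-related environment variables as the Julia process sees
# them. An unset variable just means MKL falls back to its defaults.
for v in ["MKL_NUM_THREADS", "OMP_NUM_THREADS", "MKL_DYNAMIC"]
    println(v, " => ", get(ENV, v, "<unset>"))
end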
If you want to set the number of threads programmatically:

               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" to list help topics
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.0-prerelease+1470 (2014-02-08 16:23 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 596d5c4* (3 days old master)
|__/                   |  x86_64-unknown-linux-gnu

julia> Base.blas_vendor()
:mkl

julia> Base.blas_set_num_threads(12)

You can see the relevant code here:
https://github.com/JuliaLang/julia/blob/8ac5a7f7fff1c54a768c7bc9ae85cf053d310f42/base/util.jl#L293
It's always worth a quick search through the code base to figure out what
is going on behind the scenes.
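To confirm the setting actually takes effect, here is a minimal timing
sketch (the sizes and thread counts below are arbitrary). On a machine
where MKL honors the request, the dense multiply should get faster as the
thread count rises:

# Time a dense matrix multiply at a few thread settings.
n = 2000
A = rand(n, n); B = rand(n, n)
A*B  # warm up so the timings below measure only BLAS
for t in [1, 6, 12]
    Base.blas_set_num_threads(t)
    println("threads = $t: ", @elapsed(A*B), " seconds")
end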
Hope that helps!
Jake

On Tuesday, February 11, 2014 12:34:56 AM UTC-5, Madeleine Udell wrote:

Jake, thanks for the reference
<http://software.intel.com/en-us/forums/topic/294954>; I have 32
hyperthreaded cores, so there's definitely something else going on to
limit me to 6, in addition to not exploiting hyperthreading.

Perhaps I need to call something like omp_set_num_threads()? But there
doesn't seem to be a function by this name in the libmkl_rt library.

julia> ccall((:omp_set_num_threads, Base.libblas_name), Ptr{Void}, (Uint8,), 32)
ERROR: ccall: could not find function omp_set_num_threads in library libmkl_rt
 in anonymous at no file

On Mon, Feb 10, 2014 at 4:05 PM, Jake Bolewski <[email protected]> wrote:

Are those hyper-threaded "cores"? If so, check Intel MKL's documentation
on hyper-threading.

-Best
Jake

On Monday, February 10, 2014 6:38:50 PM UTC-5, Madeleine Udell wrote:

It looks like only 6 threads are being used when I use MKL from Julia.
If I do blas_set_num_threads(n), then, watching top, I see Julia running
at min(n,6)*100% CPU. Any idea why this would be, or how to get MKL to
use more threads? I'm not sure if this is an issue in Julia or with my
installation of MKL.

On Mon, Feb 10, 2014 at 2:09 PM, Andreas Noack Jensen <[email protected]> wrote:

You are welcome. Good to hear that it worked.

2014-02-10 22:35 GMT+01:00 Madeleine Udell <[email protected]>:

Fantastic, thank you. I now see Base.libblas_name = "libmkl_rt", and am
able to run the following test successfully:

transa = 'N'::Base.LinAlg.BlasChar  # multiply by A, not A'
matdescra = "GXXF"  # G = general, X = ignored, F = one-based indexing
m, n = 50, 100
A = sprand(m, n, .01)
y = zeros(m)
x = rand(n)
alpha = 1.0
beta = 1.0

Base.LinAlg.SparseBLAS.cscmv!(transa, alpha, matdescra, A, x, beta, y)
y_builtin = A*x

julia> y == y_builtin
true

On Mon, Feb 10, 2014 at 12:08 PM, Andreas Noack Jensen <[email protected]> wrote:

Hi Madeleine

When compiling Julia with MKL, you'll have to do make cleanall and also
rebuild ARPACK and SuiteSparse, i.e. make -C deps distclean-arpack and
make -C deps distclean-suitesparse. It is also easier to create a
Make.user where you set USE_MKL=1 and MKLROOT to the location of your
MKL library files, e.g. /opt/intel/mkl. The arguments are explained here:

http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2.htm#GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2

As for the wrappers: transa determines whether the operation is Ax or
A'x, and matdescra carries information about the structure of the
matrix, e.g. whether it is triangular. When you have succeeded in
compiling Julia with MKL, the libblas variable should just be
Base.libblas_name.

2014-02-10 20:37 GMT+01:00 Madeleine Udell <[email protected]>:

I'm having trouble using MKL in Julia. I changed Make.inc so that
USE_MKL = 1, but when I make and run Julia, I find that
Base.libblas_name = "libopenblas". Is this expected? I would have
thought it would be e.g. "libmkl_core".

Andreas, I found your wrappers for MKL in this PR. I've not used MKL
before, so I don't understand the call signature of those functions well
enough to call MKL directly. Any chance you could explain what
transa::BlasChar and matdescra::ASCIIString are in the following
function, and which MKL library is expected in the libblas variable? I
see many .so files in the lib/intel64 directory of my MKL installation,
and I'm not sure which one to point Julia to.

function cscmv!(transa::BlasChar, α::$T, matdescra::ASCIIString,
                A::SparseMatrixCSC{$T, BlasInt}, x::StridedVector{$T},
                β::$T, y::StridedVector{$T})
    length(x) == A.n || throw(DimensionMismatch("Matrix with $(A.n) columns multiplied with vector of length $(length(x))"))
    length(y) == A.m || throw(DimensionMismatch("Vector of length $(A.m) added to vector of length $(length(y))"))
    ccall(($(string(mv)), libblas), Void,
          (Ptr{Uint8}, Ptr{BlasInt}, Ptr{BlasInt}, Ptr{$T},
           Ptr{Uint8}, Ptr{$T}, Ptr{BlasInt}, Ptr{BlasInt},
           Ptr{BlasInt}, Ptr{$T}, Ptr{$T}, Ptr{$T}),
          &transa, &A.m, &A.n, &α,
          matdescra, A.nzval, A.rowval, pointer(A.colptr, 1),
          pointer(A.colptr, 2), x, &β, y)
    return y
end

Thanks for your help!
Madeleine

On Wed, Feb 5, 2014 at 1:49 PM, Andreas Noack Jensen <[email protected]> wrote:

A*b will not call MKL when A is sparse. There has been some writing
about making an MKL package that overwrites A_mul_B(Matrix, Vector) with
the MKL versions, and I actually wrote wrappers for the sparse MKL
subroutines in the fall for the same reason.
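(For concreteness, a minimal sketch of that idea, built on the cscmv!
wrapper quoted above. The helper name mkl_spmv is made up for
illustration, and the sketch assumes a Julia built with USE_MKL=1 so
that the SparseBLAS wrappers are available:)

# Hypothetical helper, not from the PR: compute A*x through MKL's
# sparse BLAS. Assumes Base.libblas_name == "libmkl_rt".
function mkl_spmv(A::SparseMatrixCSC{Float64}, x::Vector{Float64})
    y = zeros(A.m)
    # 'N': apply A rather than A'; "GXXF": general matrix with one-based
    # (Fortran) indexing, the same descriptor as in the test above.
    Base.LinAlg.SparseBLAS.cscmv!('N', 1.0, "GXXF", A, x, 0.0, y)
    return y
end

A package could then overwrite the generic A*x method for sparse A along
these lines, which is the idea Andreas describes.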
2014-02-05 Madeleine Udell <[email protected]>:

Miles, you're right that writing sparse matrix-vector products in native
Julia probably won't be the best idea given Julia's model of
parallelism. That's why I'm interested in calling an outside library
like PETSc.

I see it's possible to link Julia with MKL. I haven't tried this yet,
but if I do, will A*b (where A is sparse) call MKL to perform the
matrix-vector product?

On Wed, Feb 5, 2014 at 11:43 AM, Miles Lubin <[email protected]> wrote:

Memory access is typically a significant bottleneck in sparse mat-vec,
so unfortunately I'm skeptical that one could achieve good performance
using Julia's current distributed memory approach on a multicore
machine. This really calls for something like OpenMP.

On Wednesday, February 5, 2014 11:42:00 AM UTC-5, Madeleine Udell wrote:

I'm developing an iterative optimization algorithm in Julia along the
lines of other contributions to the Iterative Solvers project or the
Krylov Subspace module, whose only computationally intensive step is
computing A*b or A'*b. I would like to parallelize the method by using a
parallel sparse matrix-vector multiply. Is there a standard backend
matrix-vector multiply that's recommended in Julia if I'm targeting a
shared-memory computer with a large number of processors? Similarly, is
there a recommended backend for targeting a cluster? My matrices can
easily reach 10 million rows by 1 million columns, with sparsity
anywhere from .01% to problems that are nearly diagonal.

I've seen many posts talking about integrating PETSc as a backend for
this purpose, but it looks like the project has stalled; the last
commits I see are from a year ago. I'm also interested in other
backends, e.g. Spark, SciDB, etc.

I'm more interested in solving sparse problems, but as a side note, the
built-in BLAS acceleration from changing the number of threads with
`blas_set_num_threads` works OK for dense problems using a moderate
number of processors. I wonder why the number of threads isn't set
higher than one by default, for example, using as many as nprocs()
cores?

-- 
Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell
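(A closing note on Miles' memory-access point: below is a plain-Julia
serial reference for what a CSC mat-vec does, equivalent to y += A*x,
i.e. cscmv! with transa = 'N' and alpha = beta = 1. The function name
cscmv_ref! is made up for illustration. The scattered writes through
A.rowval are why the kernel is bandwidth-bound, and why parallelizing
over columns would need synchronization on y:)

# Serial reference kernel: y += A*x for A in CSC format.
function cscmv_ref!(y::Vector{Float64}, A::SparseMatrixCSC{Float64,Int},
                    x::Vector{Float64})
    for j in 1:A.n                               # walk the columns of A
        xj = x[j]
        for k in A.colptr[j]:(A.colptr[j+1] - 1) # nonzeros of column j
            y[A.rowval[k]] += A.nzval[k] * xj    # scattered write into y
        end
    end
    return y
end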
