In addition to the suggestions from the nice people who are taking the time to respond to your question, see also http://docs.julialang.org/en/release-0.3/manual/performance-tips/
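
As one illustration of the kind of change that usually helps in a hot loop like this, here is a rough sketch of a prediction step that reuses a preallocated output buffer instead of building new arrays on every call. In the quoted code, `vec(softmax(x * c.weights))` creates fresh arrays each time it runs. The names `SketchClassifier`, `softmax!`, and `predict_inplace!` are made up for this example and are not from the embed repository. The sketch assumes the weights are stored as an n-by-k matrix (inputs by outputs) and that `x` is a plain vector. It is written for Julia 1.x, where `BLAS` comes from `LinearAlgebra`, so the syntax differs from release 0.3.

```julia
using LinearAlgebra  # assumption: Julia 1.x, where BLAS lives in LinearAlgebra

# Hypothetical stand-in for the LinearClassifier in the quoted message.
# Assumption: weights is n-by-k (number of inputs by number of outputs).
mutable struct SketchClassifier
    weights::Matrix{Float64}
    outputs::Vector{Float64}   # preallocated buffer of length k
end

# In-place softmax: overwrites v and allocates no new arrays.
function softmax!(v::Vector{Float64})
    m = maximum(v)             # subtract the max for numerical stability
    s = 0.0
    @inbounds for i in eachindex(v)
        v[i] = exp(v[i] - m)
        s += v[i]
    end
    @inbounds for i in eachindex(v)
        v[i] /= s
    end
    return v
end

# Computes outputs = softmax(weights' * x) directly into c.outputs.
function predict_inplace!(c::SketchClassifier, x::Vector{Float64})
    # gemv! with 'T' computes y = 1.0 * weights' * x + 0.0 * y in place
    BLAS.gemv!('T', 1.0, c.weights, x, 0.0, c.outputs)
    return softmax!(c.outputs)
end
```

With something like this, `c.outputs` is allocated once when the classifier is constructed (for example with `zeros(k)`) and is overwritten on every call. Whether it closes much of the gap to the C code is something only profiling your actual program can tell.
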
--Tim

On Thursday, February 19, 2015 06:51:20 AM Zhixuan Yang wrote:
> Hello everyone,
>
> Recently I'm working on my first Julia project, a word embedding training
> program similar to Google's word2vec <https://code.google.com/p/word2vec/>
> (the code of word2vec is indeed very high-quality, but I want to add more
> features, so I decided to write a new one). Thanks to Julia's
> expressiveness, it cost me less than 2 days to write the entire program.
> But it runs really slow, about 100x slower than the C code of word2vec (the
> algorithm is the same). I've been trying to optimize my code for several
> days (adding type annotations, using BLAS to do computation, eliminating
> memory allocations ...), but it is still 30x slower than the C code.
>
> The critical part of my program is the following function (it also consumes
> most of the time according to the profiling result):
>
> function train_one(c :: LinearClassifier, x :: Array{Float64}, y :: Int64;
>                    α :: Float64 = 0.025,
>                    input_gradient :: Union(Nothing, Array{Float64}) = nothing)
>     predict!(c, x)
>     c.outputs[y] -= 1
>
>     if input_gradient != nothing
>         # input_gradient = ( c.weights * outputs' )'
>         BLAS.gemv!('N', α, c.weights, c.outputs, 1.0, input_gradient)
>     end
>
>     # c.weights -= α * x' * outputs;
>     BLAS.ger!(-α, vec(x), c.outputs, c.weights)
> end
>
> function predict!(c :: LinearClassifier, x :: Array{Float64})
>     c.outputs = vec(softmax(x * c.weights))
> end
>
> type LinearClassifier
>     k :: Int64 # number of outputs
>     n :: Int64 # number of inputs
>     weights :: Array{Float64, 2} # k * n weight matrix
>
>     outputs :: Vector{Float64}
> end
>
> And the entire program can be found here
> <https://github.com/yangzhixuan/embed>. Could you please check my code and
> tell me what I can do to get performance comparable to C.
>
> Regards.
> Yang Zhixuan
