Hello everyone, 

I have recently been working on my first Julia project, a word embedding
training program similar to Google's word2vec
<https://code.google.com/p/word2vec/>. (The word2vec code is very high
quality, but I want to add more features, so I decided to write a new
implementation.) Thanks to Julia's expressiveness, it took me less than 2
days to write the entire program. But it runs really slowly: about 100x
slower than the C code of word2vec, even though the algorithm is the same.
I have spent several days trying to optimize my code (adding type
annotations, using BLAS for the computation, eliminating memory
allocations, ...), but it is still 30x slower than the C code.

The critical part of my program is the train_one function below; according
to the profiler, it also accounts for most of the running time:

function train_one(c :: LinearClassifier, x :: Array{Float64}, y :: Int64;
                   α :: Float64 = 0.025,
                   input_gradient :: Union(Nothing, Array{Float64}) = nothing)
    predict!(c, x)
    c.outputs[y] -= 1

    if input_gradient != nothing
        # input_gradient += α * c.weights * c.outputs (accumulated in place)
        BLAS.gemv!('N', α, c.weights, c.outputs, 1.0, input_gradient)
    end

    # c.weights -= α * x' * outputs;
    BLAS.ger!(-α, vec(x), c.outputs, c.weights)
end

function predict!(c :: LinearClassifier, x :: Array{Float64})
    c.outputs = vec(softmax(x * c.weights))
end

type LinearClassifier
    k :: Int64 # number of outputs
    n :: Int64 # number of inputs
    weights :: Array{Float64, 2} # n * k weight matrix (inputs x outputs)

    outputs :: Vector{Float64}
end
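
To spell out what one training step computes, here is a loop-only sketch of
the same update with no temporary arrays. It is an illustration, not the
code in my repository: the name train_step_loops! is made up, x is assumed
to be a plain vector of length n, weights is assumed to be n x k, softmax is
assumed to be the standard one, and the optional input_gradient accumulation
is left out.

```julia
# Hypothetical helper (not part of the program above): one softmax training
# step with explicit loops, writing into the preallocated `outputs` vector.
# Assumes weights is n x k, x has length n, and outputs has length k.
function train_step_loops!(weights::Matrix{Float64}, outputs::Vector{Float64},
                           x::Vector{Float64}, y::Int, α::Float64)
    n, k = size(weights)

    # scores: outputs[j] = sum_i x[i] * weights[i, j]
    # (the inner loop walks down a column, which is contiguous in memory)
    for j in 1:k
        s = 0.0
        for i in 1:n
            s += x[i] * weights[i, j]
        end
        outputs[j] = s
    end

    # softmax in place, shifted by the maximum for numerical stability
    m = maximum(outputs)
    total = 0.0
    for j in 1:k
        outputs[j] = exp(outputs[j] - m)
        total += outputs[j]
    end
    for j in 1:k
        outputs[j] /= total
    end

    # gradient of the cross-entropy loss with respect to the scores
    outputs[y] -= 1.0

    # rank-1 update: weights -= α * x * outputs'
    for j in 1:k
        g = α * outputs[j]
        for i in 1:n
            weights[i, j] -= g * x[i]
        end
    end
    return outputs
end
```

Calling it as train_step_loops!(w, out, x, y, 0.025) with a preallocated
out = zeros(k) should leave the softmax gradient in out and update w in
place, which corresponds to predict! followed by the BLAS.ger! call above.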

The entire program can be found here
<https://github.com/yangzhixuan/embed>. Could you please take a look at my
code and tell me what I can do to get performance comparable to C?

Regards.
Yang Zhixuan
