Hello everyone,
I've recently been working on my first Julia project, a word embedding training
program similar to Google's word2vec <https://code.google.com/p/word2vec/>. (The
word2vec code is indeed very high-quality, but I want to add more features,
so I decided to write a new one.) Thanks to Julia's expressiveness, it took
me less than two days to write the entire program. But it runs really slowly,
about 100x slower than the C code of word2vec (the algorithm is the same).
I've been trying to optimize my code for several days (adding type
annotations, using BLAS to do computation, eliminating memory allocations
...), but it is still 30x slower than the C code.
The critical part of my program is the following code; according to the
profiler, it also accounts for most of the run time:
function train_one(c :: LinearClassifier, x :: Array{Float64}, y :: Int64;
                   α :: Float64 = 0.025,
                   input_gradient :: Union(Nothing, Array{Float64}) = nothing)
    predict!(c, x)
    c.outputs[y] -= 1
    if input_gradient != nothing
        # input_gradient += α * c.weights * c.outputs  (beta = 1.0 accumulates)
        BLAS.gemv!('N', α, c.weights, c.outputs, 1.0, input_gradient)
    end
    # c.weights -= α * vec(x) * c.outputs'
    BLAS.ger!(-α, vec(x), c.outputs, c.weights)
end

function predict!(c :: LinearClassifier, x :: Array{Float64})
    c.outputs = vec(softmax(x * c.weights))
end
type LinearClassifier
    k :: Int64 # number of outputs
    n :: Int64 # number of inputs
    weights :: Array{Float64, 2} # n * k weight matrix (x * weights must have length k)
    outputs :: Vector{Float64} # length-k output buffer
end
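For reference, this is what one call to train_one computes, written as math. It is only a summary of the code above, assuming x is treated as a length-n vector and c.weights is n * k, which is what the BLAS calls imply. The symbols W (c.weights), p (the softmax output), g (c.outputs after the subtraction), d (input_gradient) and e_y (the one-hot vector for class y) are notation introduced here, not names from the program.

```latex
\begin{aligned}
\mathbf{p} &= \operatorname{softmax}\!\left(W^{\top}\mathbf{x}\right)
  && \text{(predict!, stored in c.outputs)} \\
\mathbf{g} &= \mathbf{p} - \mathbf{e}_y
  && \text{(c.outputs[y] -= 1)} \\
\mathbf{d} &\leftarrow \mathbf{d} + \alpha\, W \mathbf{g}
  && \text{(BLAS.gemv!, only if input\_gradient is given)} \\
W &\leftarrow W - \alpha\, \mathbf{x}\, \mathbf{g}^{\top}
  && \text{(BLAS.ger!)}
\end{aligned}
```

Note that d is updated with the old W, before the rank-one update is applied, which matches the order of the two BLAS calls.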
The entire program can be found here
<https://github.com/yangzhixuan/embed>. Could you please check my code and
tell me what I can do to get performance comparable to C?
Regards.
Yang Zhixuan