In the current version of my function, not a single temporary array is 
created, and the share of time spent in GC is less than 5% (as reported by 
the @time macro). So I don't think the new GC algorithm will help much.
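
For anyone who wants to reproduce this kind of check, here is a minimal sketch. It assumes a current Julia release (the syntax differs from the 0.3-era code quoted below), and axpy_inplace!, measure_allocations and run_many! are hypothetical stand-ins for illustration, not functions from the program discussed in this thread.

```julia
# Hypothetical stand-in for a hot inner function: an in-place update
# that is not expected to allocate once it has been compiled.
function axpy_inplace!(y::Vector{Float64}, a::Float64, x::Vector{Float64})
    @inbounds for i in eachindex(y, x)
        y[i] += a * x[i]
    end
    return y
end

# Measure inside a function so that global-variable overhead is not counted.
function measure_allocations(y, x)
    axpy_inplace!(y, 0.5, x)                   # warm-up call
    return @allocated axpy_inplace!(y, 0.5, x) # bytes allocated by one call
end

# Repeat the call many times so that @time has something to measure.
function run_many!(y, x, n)
    for _ in 1:n
        axpy_inplace!(y, 0.5, x)
    end
    return y
end

x = rand(1000)
y = zeros(1000)

bytes = measure_allocations(y, x)
println("bytes allocated per call: ", bytes)

run_many!(y, x, 1)             # warm-up so compilation is not timed
@time run_many!(y, x, 10_000)  # prints elapsed time and allocation totals
```

If the loop allocates, @time also reports how much of the elapsed time went to GC, which is the figure referred to above.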

On Sunday, February 22, 2015 at 1:30:10 AM UTC+8, Viral Shah wrote:
>
> It is also worth trying out one of the 0.4-dev nightlies and comparing the 
> performance. The code does avoid creating temporaries to a large extent, 
> but it may be worth checking whether the new GC helps. 
>
> -viral 
>
>
>
> > On 21-Feb-2015, at 10:09 pm, [email protected] wrote: 
> > 
> > What's the type of c.outputs? In train_one it seems to be Int64, in 
> > predict! it seems to be Float64. 
> > 
> > On Thursday, February 19, 2015 at 3:51:20 PM UTC+1, Zhixuan Yang wrote: 
> > Hello everyone, 
> > 
> > Recently I've been working on my first Julia project, a word embedding 
> > training program similar to Google's word2vec (the code of word2vec is 
> > indeed very high-quality, but I want to add more features, so I decided to 
> > write a new one). Thanks to Julia's expressiveness, it took me less than 2 
> > days to write the entire program. But it runs really slowly, about 100x 
> > slower than the C code of word2vec (the algorithm is the same). I've been 
> > trying to optimize my code for several days (adding type annotations, using 
> > BLAS to do the computation, eliminating memory allocations ...), but it is 
> > still 30x slower than the C code. 
> > 
> > The critical part of my program is the following function (it also 
> > consumes most of the time according to the profiling result): 
> > 
> > function train_one(c :: LinearClassifier, x :: Array{Float64}, y :: Int64;
> >                    α :: Float64 = 0.025,
> >                    input_gradient :: Union(Nothing, Array{Float64}) = nothing) 
> >     predict!(c, x) 
> >     c.outputs[y] -= 1 
> > 
> >     if input_gradient != nothing 
> >         # input_gradient += α * c.weights * c.outputs 
> >         BLAS.gemv!('N', α, c.weights, c.outputs, 1.0, input_gradient) 
> >     end 
> > 
> >     # c.weights -= α * x' * outputs; 
> >     BLAS.ger!(-α, vec(x), c.outputs, c.weights) 
> > end 
> > 
> > function predict!(c :: LinearClassifier, x :: Array{Float64}) 
> >     c.outputs = vec(softmax(x * c.weights)) 
> > end 
> > 
> > type LinearClassifier 
> >     k :: Int64 # number of outputs 
> >     n :: Int64 # number of inputs 
> >     weights :: Array{Float64, 2} # n * k weight matrix 
> > 
> >     outputs :: Vector{Float64} 
> > end 
> > 
> > The entire program can be found here. Could you please check my code 
> > and tell me what I can do to get performance comparable to C? 
> > 
> > Regards. 
> > Yang Zhixuan 
>
>
