This is an excellent use case for the @code_warntype macro. You can quickly 
see that rowval, n, x, r, pushval, nzi, v, rvold, rvnew, and several 
temporary variables are ending up as Any or underspecified Array type. A 
few things you can do to help:

- give the A field of the PageRankProblem type more specific type 
parameters, A::SparseMatrixCSC{Float64, Int64} so the type of rowval can be 
inferred
- change n = size(A,1) to n = size(P.A, 1); otherwise you're referring to 
the global variable A
- use the colptr version instead of length(nzrange(A,u)) for du, and 
similarly for dv - for some reason the length of the output of nzrange 
can't be inferred as an Int, this might be a bug worth reporting

Those 3 changes get it to within a factor of 2 of C++ on my machine.

-Tony


On Friday, April 3, 2015 at 6:22:43 PM UTC-7, David Gleich wrote:
>
> I'm trying to figure out why the following algorithm is so slow in Julia. 
>
> $ julia pprpush_perf4.jl
> elapsed time: 0.754419672 seconds # first run, no compiled
> elapsed time: 0.627998826 seconds # second run, compiled
> elapsed time: 0.616043857 seconds (292 MB allocated, 1.52% gc time in 14 
> pauses with 0 full sweep) # third run, with @time
> True = 0.026124014066724 # make sure the result is correct
> x[0] = 0.026124014066724
>
> $ rm a.out; g++ -O2 pprpush_perf.cpp; ./a.out 
>
> Elapsed 0.010623 seconds. # second run of the algorithm in C++
>
> True = 0.026124014066724 # make sure the result is correct
> x[0] = 0.026124014066724
> y[0] = 0.026124014066724
>
> So Julia is ~60x slower than the equivalent C++ code.
>
> I've attached the code for both procedures for the complete code, but 
> here's the key algorithm. This is the PageRank "push" procedure due to 
> Andersen, Chung, and Lang (FOCS 2006).
>
> Basically, it does random access to the rows or columns of a sparse matrix 
> (whichever is faster, because it assumes the sparse matrix is the adjacency 
> matrix of an undirected graph).
>
> I tried a few different things to speed it up in Julia, but nothing seemed 
> to work. Any thoughts?
>
> function ppr_push(P::PageRankProblem, eps::Float64)
>     # extract the sparse data structure
>     #colptr = P.A.colptr
>     rowval = rowvals(P.A) # P.A is just a julia sparse matrix
>     n = size(A,1)
>     
>     # create the initial solution and residual
>     x = zeros(n,1)
>     r = zeros(n,1)
>     
>     r[P.seed] = 1.
>     
>     q = [P.seed]
>     
>     pushcount = 0
>     pushvol = 0
>     maxpush = 1./(eps*(1.-P.alpha))
>     
>     while length(q) > 0 && pushcount <= maxpush
>         pushcount += 1
>         u = shift!(q) # I've tried DataStructures Queue here, but no change
>         #du = convert(Float64,colptr[u+1]-colptr[u]) # get the degree
>         #du = Float64(colptr[u+1]-colptr[u])
>         du = Float64(length(nzrange(A,u)))
>         pushval = r[u] - 0.5*eps*du
>         x[u] += (1-P.alpha)*pushval
>         r[u] = 0.5*eps*du
>         
>         #@printf("pushing on %4i with resid %.9f rho=%f\n", u, pushval, 
> rho)
>         
>         pushval = pushval*P.alpha/du
>         
>         #for nzi in colptr[u]:(colptr[u+1] - 1)
>         for nzi in nzrange(P.A,u)
>             pushvol += 1
>             v = rowval[nzi]
>             #dv = convert(Float64,colptr[v+1]-colptr[v]) # degree of v
>             dv = Float64(length(nzrange(A,v)))
>             rvold = r[v]
>             rvnew = rvold + pushval
>             r[v] = rvnew
>             if rvnew > eps*dv && rvold <= eps*dv
>                 push!(q,v)
>             end
>         end
>     end
>     
>     return x, r, pushcount, pushvol
>     
> end    
>
> Thanks,
> David Gleich
>

Reply via email to