On Wednesday, September 17, 2014 4:56:46 AM UTC-4, Alberto wrote:
>
> @pyimport sklearn.neighbors as nb
> @time balltree = nb.BallTree(X,leaf_size=30)
> @time result_test = pycall(balltree["query_radius"],PyVector{Array{Int64,1}},X,radius);
> @time resultJulia_test = copy(result_test);
>
> elapsed time: 0.005120729 seconds (2204 bytes allocated)
> elapsed time: 0.093411428 seconds (696 bytes allocated)
> elapsed time: 8.069997536 seconds (20968304 bytes allocated, 5.34% gc time)
>
>
> The conversion to Julia takes about 100 times as long as the actual calculation.
>

As a general rule, calling Python functions for small computations/lookups 
within your inner loop is going to be slow, for the same reason that it is 
slow in Python.  The only way to get good performance from Python is to 
perform large computations (or small computations on lots of data) in a 
single call that calls out to C (etc.) code.

Unfortunately, when you iterate over the query_radius result, you 
are essentially calling the equivalent of result[i][j] for every index i 
and j.  Since each result[i][j] lookup is a Python call, you end up with 
Python calls in your inner loop (plus memory allocation for the Julia 
wrappers around the Python objects).

(Normally, large arrays can be passed back and forth as NumPy arrays, 
which have low overhead because all the Python queries are done once and 
then we just get a pointer to the array data.  In this case, however, the 
query_radius function returns an array of arrays, I guess because each 
result[i] may have a different length, so we necessarily have to do some 
Python calls for every one of the 10000 elements of the result array.)
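
To make the "one pointer versus many objects" point concrete, here is a minimal Python-side sketch that packs a ragged result into one contiguous index buffer plus an offsets array. It is only an illustration under assumptions: flatten_ragged and row are hypothetical helper names, the neighbors list is made-up stand-in data rather than real query_radius output, and plain lists stand in for the NumPy object array that scikit-learn returns.

```python
from array import array
from itertools import accumulate


def flatten_ragged(rows):
    """Pack variable-length index lists into a flat buffer plus offsets.

    Row i of the input is flat[offsets[i]:offsets[i + 1]].
    (Hypothetical helper, not part of scikit-learn or PyCall.)
    """
    # offsets[0] == 0 and offsets[i + 1] - offsets[i] == len(rows[i])
    offsets = array("q", accumulate((len(r) for r in rows), initial=0))
    # "q" is a signed 64-bit integer, matching Julia's Int64
    flat = array("q", (idx for r in rows for idx in r))
    return flat, offsets


def row(flat, offsets, i):
    """Recover row i from the packed form."""
    return list(flat[offsets[i]:offsets[i + 1]])


# Made-up stand-in for a query_radius result: one index list per query point.
neighbors = [[0, 2], [1], [0, 2, 3], []]
flat, offsets = flatten_ragged(neighbors)
```

With this layout only two contiguous buffers would need to cross the language boundary instead of one Python object per row. Whether that is actually faster for your data would have to be measured.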

That being said, there is something funny going on here.  I wrote two 
different versions of your routine:

# Version 1: convert each Python element to a Vector{Int} with convert.
function query_radius1(balltree::PyObject, X, radius)
    pyind = pycall(balltree["query_radius"],PyVector{PyObject},X,radius)
    return Vector{Int}[convert(Vector{Int}, o) for o in pyind]
end

# Version 2: wrap each element as a PyVector{Int}, then copy it.
function query_radius2(balltree::PyObject, X, radius)
    pyind = pycall(balltree["query_radius"],PyVector{PyObject},X,radius)
    return Vector{Int}[copy(PyVector{Int}(o)) for o in pyind]
end

that should be very similar in performance, but the second version is 5x 
faster on my machine.  I'll have to look into it.
