On Wednesday, September 17, 2014 4:56:46 AM UTC-4, Alberto wrote:
>
> @pyimport sklearn.neighbors as nb
> @time balltree = nb.BallTree(X,leaf_size=30)
> @time result_test = pycall(balltree["query_radius"],PyVector{Array{Int64,1}},X,radius);
> @time resultJulia_test = copy(result_test);
>
> elapsed time: 0.005120729 seconds (2204 bytes allocated)
> elapsed time: 0.093411428 seconds (696 bytes allocated)
> elapsed time: 8.069997536 seconds (20968304 bytes allocated, 5.34% gc time)
>
>
> The conversion to Julia takes roughly 100 times as long as the actual calculation.
>
As a general rule, calling Python functions for small computations/lookups
within your inner loop is going to be slow, for the same reason that it is
slow in Python. The only way to get good performance from Python is to
perform large computations (or small computations on lots of data) in a
single call that calls out to C (etc.) code.
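
To make the general rule concrete, here is a rough sketch in plain Python (nothing here is specific to PyCall or scikit-learn, and the names loop_sum and bulk_sum are made up for the illustration). Both functions compute the same sum, but the first executes interpreted code for every element, while the second hands the whole list to the built-in sum, which runs in C:

```python
import timeit

data = list(range(100_000))

def loop_sum(values):
    # One interpreted loop iteration (and one addition) per element.
    total = 0
    for v in values:
        total += v
    return total

def bulk_sum(values):
    # A single call; the loop over the elements runs inside C code.
    return sum(values)

# Time a few repetitions of each version; the numbers vary by machine.
loop_time = timeit.timeit(lambda: loop_sum(data), number=5)
bulk_time = timeit.timeit(lambda: bulk_sum(data), number=5)
print("loop:", loop_time, "seconds")
print("bulk:", bulk_time, "seconds")
```

The exact ratio depends on the machine and the Python version, but the bulk call is typically several times faster.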

Unfortunately, when you iterate over the query_radius data structure, you
are essentially calling the equivalent of result[i][j] for every index i
and j. Since each result[i][j] lookup is a Python function call, you have
Python calls in your inner loop (as well as memory allocation in order to
allocate Julia wrappers around Python objects).

(Normally, large arrays can be passed back and forth via NumPy arrays,
which have low overhead because all the Python queries are done once and
then we just get a pointer to the array data. In this case, however, the
query_radius function returns an array of arrays, I guess because the
result[i] may have different lengths, so we necessarily have to do some
Python calls for every one of the 10000 elements of the result array.)
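
To illustrate the difference between the two layouts, here is a minimal sketch in plain Python. It is not what scikit-learn or PyCall actually do; pack_ragged and unpack_row are hypothetical helpers, and the small rows list merely stands in for a query_radius result. Rows of different lengths are packed into one contiguous buffer plus an offsets array, which is the kind of layout that can be handed over as a single block of memory rather than one object per row:

```python
from array import array

def pack_ragged(rows):
    # Concatenate all rows into one contiguous buffer of 64-bit ints
    # and record where each row starts and ends.
    flat = array("q")
    offsets = array("q", [0])
    for row in rows:
        flat.extend(row)
        offsets.append(len(flat))
    return flat, offsets

def unpack_row(flat, offsets, i):
    # Row i occupies flat[offsets[i]:offsets[i + 1]].
    return flat[offsets[i]:offsets[i + 1]]

# Hypothetical stand-in for a query_radius result: rows of varying length.
rows = [[0, 3], [1], [], [0, 2, 3, 4]]
flat, offsets = pack_ragged(rows)
print(list(flat))
print(list(offsets))
print(list(unpack_row(flat, offsets, 3)))
```

With the nested layout, every row is a separate Python object that has to be queried on its own. With the packed layout, there are only two buffers to transfer, however many rows there are.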

That being said, there is something funny going on here. I wrote two
different versions of your routine:

function query_radius1(balltree::PyObject, X, radius)
    # Get the result as a vector of unconverted Python objects.
    pyind = pycall(balltree["query_radius"],PyVector{PyObject},X,radius)
    # Convert each Python object directly to a Julia Vector{Int}.
    return Vector{Int}[convert(Vector{Int}, o) for o in pyind]
end

function query_radius2(balltree::PyObject, X, radius)
    pyind = pycall(balltree["query_radius"],PyVector{PyObject},X,radius)
    # Wrap each Python object as a PyVector{Int}, then copy it.
    return Vector{Int}[copy(PyVector{Int}(o)) for o in pyind]
end

that should be very similar in performance, but the second version is 5x
faster on my machine. I'll have to look into it.