Hello,
This is my first question (I started using Julia only two weeks ago and I'm
really happy with it).
The problem I'm facing now is a slow conversion when calling the Python
sklearn library.
Here is an example:
X = reshape(rand(20000),(10000,2))
radius = 0.05
using PyCall
@pyimport sklearn.neighbors as nb
@time balltree = nb.BallTree(X,leaf_size=30)
@time result_test = pycall(balltree["query_radius"], PyVector{Array{Int64,1}}, X, radius);
@time resultJulia_test = copy(result_test);
elapsed time: 0.005120729 seconds (2204 bytes allocated)
elapsed time: 0.093411428 seconds (696 bytes allocated)
elapsed time: 8.069997536 seconds (20968304 bytes allocated, 5.34% gc time)
The conversion to a Julia array takes more than 80 times as long as the actual
calculation (about 8.07 s versus 0.09 s for the query).
The next thing I tried was iterating over the PyVector directly.
function iterate_Pyvector(vec)
    cont = 0
    # Only count the elements; nothing is done with v itself.
    for v in vec
        cont += 1
    end
    return cont
end
@time iterate_Pyvector(result_test)
elapsed time: 7.865199943 seconds (20736288 bytes allocated)
Iterating over the vector 'result_test' takes roughly the same time as
converting it into a Julia array.
Moreover, a large amount of memory is allocated during the iteration (about
20 MB, nearly the same as for the copy), which suggests to me that the
conversion is actually happening there as well.
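One way to check this suspicion is to repeat the loop on a vector whose elements are left as raw Python objects. The following is only a sketch and has not been timed. It assumes that PyCall accepts PyObject as the element type of a PyVector, so that indexing returns each entry unconverted, and it uses the same PyCall syntax as the code above (newer PyCall versions write balltree.query_radius instead). The names raw and count_raw are made up for illustration.

```julia
using PyCall
@pyimport sklearn.neighbors as nb

X = reshape(rand(20000), (10000, 2))
radius = 0.05
balltree = nb.BallTree(X, leaf_size=30)

# Ask for PyObject elements so that indexing hands back the raw Python
# object instead of converting each entry to Array{Int64,1}.
raw = pycall(balltree["query_radius"], PyVector{PyObject}, X, radius)

function count_raw(vec)
    cont = 0
    for v in vec   # v is an unconverted PyObject here
        cont += 1
    end
    return cont
end

@time n = count_raw(raw)
```

If this loop runs much faster than the one over 'result_test', the cost is in the per-element conversion to Array{Int64,1} and not in the iteration itself.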
Is there a better way to:
A. convert the vector 'result_test' to a Julia array
B. iterate over the vector 'result_test'
Any idea is appreciated!
Thanks
Alberto