Hello,

This is my first question here (I started with Julia only two weeks ago and 
I'm really happy with it).
I'm running into a slow conversion when calling the Python sklearn library 
through PyCall.
Here is an example:

X = reshape(rand(20000),(10000,2))
radius = 0.05

using PyCall
@pyimport sklearn.neighbors as nb
@time balltree = nb.BallTree(X,leaf_size=30)
@time result_test = pycall(balltree["query_radius"], PyVector{Array{Int64,1}}, X, radius);
@time resultJulia_test = copy(result_test);

elapsed time: 0.005120729 seconds (2204 bytes allocated)
elapsed time: 0.093411428 seconds (696 bytes allocated)
elapsed time: 8.069997536 seconds (20968304 bytes allocated, 5.34% gc time)


The conversion to Julia takes roughly 86 times as long as the actual 
calculation (8.07 s versus 0.093 s).
The next thing I tried was iterating over the PyVector directly.

function iterate_Pyvector(vec)
    cont = 0        # number of elements visited
    for v in vec    # each step fetches one element from the PyVector
        cont += 1
    end
    return cont
end

@time iterate_Pyvector(result_test)

elapsed time: 7.865199943 seconds (20736288 bytes allocated)


Iterating over the vector 'result_test' takes roughly as long as converting 
it into a Julia array.

Moreover, a large amount of memory is allocated during the iteration, which 
suggests to me that the conversion is actually happening there as well.


Is there a better way to:

A. convert the vector 'result_test'

B. iterate over the vector 'result_test'


Any idea is appreciated!

Thanks


Alberto
