Re: [Scikit-learn-general] k-NN user defined distance

2016-02-23 Thread Shishir Pandey
Hi Jacob, I went through the code. The 'fit' method in nearest neighbors does not do any distance calculations; it only initializes the class variables. In that case this is probably not a bug. -- sp On Wed, Feb 24, 2016 at 12:26 AM, Jacob Vanderplas <jake...@cs.washington.edu> wrote: > I have
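
A small check of when the custom metric actually fires, reusing the tan function and X from Sebastian's snippet earlier in the thread (the n_neighbors value is an assumption). With algorithm='brute', fit() does no metric evaluations and the callable only runs once kneighbors() is called; building a BallTree during fit() typically does evaluate it:

    from sklearn.neighbors import NearestNeighbors
    import numpy as np

    X = np.array([[1.0, 0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0, 0.0],
                  [1.0, 1.0, 1.0, 1.0]])

    calls = {'n': 0}

    def tan(x, y):
        calls['n'] += 1   # count how often the metric is invoked
        return 1.0

    for algo in ('brute', 'ball_tree'):
        calls['n'] = 0
        nn = NearestNeighbors(n_neighbors=2, algorithm=algo, metric=tan).fit(X)
        after_fit = calls['n']            # 0 for 'brute'; > 0 for 'ball_tree'
        nn.kneighbors(X)                  # more calls happen here for both
        print(algo, 'calls during fit:', after_fit, 'total calls:', calls['n'])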

Re: [Scikit-learn-general] k-NN user defined distance

2016-02-23 Thread Jacob Vanderplas
> I have been experimenting with the above code. I have noticed the following things: 1. If we set algorithm = 'brute' the algorithm does not enter the function tan, i.e., putting a breakpoint at the print statement does not stop execution on it during the fit method. It does

Re: [Scikit-learn-general] k-NN user defined distance

2016-02-23 Thread Shishir Pandey
I have been experimenting with the above code. I have noticed the following things: 1. If we set algorithm = 'brute' the algorithm does not enter the function tan, i.e., putting a breakpoint at the print statement does not stop execution on it during the fit method. It does however use t

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-13 Thread Sebastian Raschka
I guess I got it now! This behavior (see below) is indeed a bit strange:

    from sklearn.neighbors import NearestNeighbors
    import numpy as np

    X = np.array([[1.0, 0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0, 0.0],
                  [1.0, 1.0, 1.0, 1.0]])

    def tan(x, y):
        print(y)
        return 1

    nbrs = NearestNeighbors(n_neighbor
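
A hypothetical completion of the truncated constructor above (everything after n_neighbor is cut off in the archive, so the argument values below are guesses rather than Sebastian's exact code). One detail that can make the printed output look strange: when a tree-based algorithm such as ball_tree is used, building the tree during fit() already evaluates the metric, and it can be called with internal node centroids (arrays of column means) as well as the original rows of X:

    from sklearn.neighbors import NearestNeighbors
    import numpy as np

    X = np.array([[1.0, 0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0, 0.0],
                  [1.0, 1.0, 1.0, 1.0]])

    def tan(x, y):
        # x or y may be a node centroid here, not necessarily a row of X
        print(x, y)
        return 1

    nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree', metric=tan)
    nbrs.fit(X)                     # the metric is already called while building the tree
    dist, ind = nbrs.kneighbors(X)  # and again when querying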

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread Herbert Schulz
P.S. 1. I printed the x, y arrays, and I thought, if this is the output:
[ 0.49178495 0.44239588 0.43451225 0.40576958 0.82022061 0.02921787 0.08832147 0.43397282 0.15083042 0.49916182]
[ 0.49178495 0.44239588 0.43451225 0.40576958 0.82022061 0.02921787 0.08832147 0.43397282 0.15

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread Herbert Schulz
Sorry that I couldn't explain it very well. I thought that

    X = np.array([[1.0, 0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0, 0.0],
                  [1.0, 1.0, 1.0, 1.0]])

    def tan(x, y):
        print x, y
        c = np.sum(x == y)
        a1 = x[x == 1.0].shape[0]
        b1 = y[y == 1.0].shape[0]
        return float(c)/(a1 + b1 - c)

example

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread Sebastian Raschka
Hi Herbert, sorry, but I am still a bit confused about what you are trying to accomplish when you say > and the output is then what I mentioned > > x are only floats (0.573...) and B contains 1's and 0's like it should. When I run it on a small test dataset ... from sklearn.neighbors

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread Herbert Schulz
Here is an example code where the failure occurs. Sorry for the big test vector; I couldn't show it otherwise.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def tanimoto(x, y):
        print "X OUTPUT\n ", x, "B OUTPUT\n", y
        c = np.sum(x == y)
        a1 = np.sum(x)
        b1 = np.sum

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread Herbert Schulz
The X array which I'm using to fit the kNN algorithm is n_samples x n_features big. But for calculating the distance, or in this case the Tanimoto, I thought that in tanimoto(x, y), if I call it with KNeighborsClassifier(metric=tanimoto).fit(X_train, y_train), x is one sample and y is one sample
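
A sketch of what the metric callable receives in this setup; the data, labels, and the exact Tanimoto variant below are made-up stand-ins (X_train and y_train are not shown in the thread). Each call gets two individual samples as 1-D arrays, not the whole training matrix:

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def tanimoto(x, y):
        # x and y are single rows of the feature matrix, shape (n_features,)
        c = np.sum((x == 1.0) & (y == 1.0))   # positions where both samples are 1
        a1 = np.sum(x == 1.0)
        b1 = np.sum(y == 1.0)
        # one minus the Tanimoto similarity, so more similar samples are "closer"
        return 1.0 - float(c) / (a1 + b1 - c)

    X_train = np.array([[1., 0., 1., 1.],
                        [0., 0., 1., 0.],
                        [1., 1., 1., 1.],
                        [0., 1., 0., 0.]])
    y_train = np.array([0, 1, 0, 1])

    clf = KNeighborsClassifier(n_neighbors=3, metric=tanimoto)
    clf.fit(X_train, y_train)
    print(clf.predict(X_train[:1]))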

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread Sebastian Raschka
I see; as far as I know, your X input array should be an n_samples x n_features array. Is this true in your case? Btw, instead of converting the inputs to Python lists, you could also get the counts via a1 = x[x == 1.0].shape[0] b1 = y[y == 1.0].shape[0] And maybe a few extra lines assert se
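
A small illustration of that counting suggestion, assuming x is a single binary sample as a 1-D NumPy array (all three variants give the same count):

    import numpy as np

    x = np.array([1.0, 0.0, 1.0, 1.0])

    a1 = x[x == 1.0].shape[0]             # boolean indexing, as suggested -> 3
    a1_alt = np.count_nonzero(x == 1.0)   # equivalent and a bit more direct -> 3
    a1_list = x.tolist().count(1.0)       # the list-based version from the thread -> 3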

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread Herbert Schulz
Yes, it is a number between 0 and 1, but I'm calculating it based on two samples, x and y. And if I'm counting the 1's in x and the 1's in y, I should get the coefficient with return float(c)/(a1 + b1 - c). But I cannot count 1's in x if x is a scalar. -

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread Sebastian Raschka
Isn't the Tanimoto coefficient a continuous number between 0 and 1? > how are the scalars occurring in the X vector? You are returning a fraction -> "return float(c)/(a1 + b1 - c)" instead of the count; maybe that's a misunderstanding? > On Jan 12, 2016, at 1:24 PM, A neuman wrote: > > The

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread A neuman
The custom metric is just calculating the Tanimoto coefficient:

    a = x.tolist()
    b = y.tolist()
    c = np.count_nonzero(x == y)
    a1 = a.count(1.0)
    b1 = b.count(1.0)
    return float(c)/(a1 + b1 - c)

So I'm just counting 1's in x and 1's in y; c is the number where the 1's are matching (matching ==

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread Sebastian Raschka
Hi, I am not sure how your custom metric works, but would np.where(x >= 0.5, 1., 0.) work in your case? > On Jan 12, 2016, at 1:08 PM, A neuman wrote: > > Sorry, that's not right what I wrote: > X: > [ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238 > 0.52445514 0.633
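
A tiny sketch of the np.where suggestion, thresholding the float values back to 0/1 before counting (the 0.5 cut-off is taken from the message; the sample values are the ones quoted above):

    import numpy as np

    x = np.array([0.6371319, 0.54557285, 0.30214217, 0.14690307])
    x_bin = np.where(x >= 0.5, 1., 0.)
    # x_bin is now array([1., 1., 0., 0.]) and can be counted as before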

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread A neuman
Sorry, that's not right what I wrote:
X: [ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238 0.52445514 0.63379164 0.71873681 0.55008567]
Y: [ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238 0.52445514 0.63379164 0.71873681 0.55008567]
X: [ 0.

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread A neuman
Hey, I have another problem: if I'm using my own metric, there are not only the samples in x and y. I'm using a 10-fold CV with the k-NN classifier. My attributes are only 1's and 0's, but if I'm printing them out, I'll get: KNeighborsClassifier(metric=myFunc) def myFunc(x, y): print x, '\n'
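
A minimal sketch of the setup described here; the synthetic 0/1 data, the metric body, and the cross-validation call are assumptions, since the real dataset and function are not shown. With raw binary features, the rows passed to the metric contain only 0.0 and 1.0:

    import numpy as np
    from sklearn.cross_validation import cross_val_score   # sklearn 0.17-era; newer: sklearn.model_selection
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.RandomState(0)
    X = rng.randint(0, 2, size=(100, 10)).astype(float)   # binary attributes
    y = rng.randint(0, 2, size=100)

    def myFunc(a, b):
        # a and b are single samples; printing them here shows only 0.0/1.0 values
        return float(np.sum(a != b))

    clf = KNeighborsClassifier(n_neighbors=5, metric=myFunc)
    scores = cross_val_score(clf, X, y, cv=10)   # 10-fold CV as in the message
    print(scores.mean())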

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-08 Thread A neuman
Ah, that helped me a lot!!! So I just write my own function that returns a scalar. This function is used in the metric parameter of the kNN function. Thank you!!! On 9 January 2016 at 03:41, Sebastian Raschka wrote: > You just need a "regular" Python function that outputs a scalar. For >

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-08 Thread Sebastian Raschka
You just need a "regular" Python function that outputs a scalar. For example, consider the following:

    >>> from sklearn.neighbors import NearestNeighbors
    >>> import numpy as np
    >>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
    >>> nbrs = NearestNeighbors(n_neighb
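
A hypothetical completion of the snippet above (everything after n_neighb is cut off in the archive, so the distance function and constructor arguments below are guesses rather than Sebastian's exact code). The point is simply that the callable takes two single samples and returns one scalar:

    >>> from sklearn.neighbors import NearestNeighbors
    >>> import numpy as np
    >>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
    >>> def my_dist(x, y):
    ...     # x and y are single samples (1-D arrays); return one scalar distance
    ...     return np.sqrt(np.sum((x - y) ** 2))
    >>> nbrs = NearestNeighbors(n_neighbors=2, metric=my_dist).fit(X)
    >>> dist, ind = nbrs.kneighbors(X)
    >>> ind[0]
    array([0, 1])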

[Scikit-learn-general] k-NN user defined distance

2016-01-08 Thread A neuman
Hello everyone, I actually want to use the KNeighborsClassifier with my own distances. The documentation says the following: [callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights. I just don't know how shou
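
The docstring quoted here appears to be the one for the weights parameter of KNeighborsClassifier; a custom distance goes into the metric parameter instead, which is what the replies above work out. A minimal sketch of both callables (the data, function names, and values are made up for illustration):

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    X = np.array([[1., 0., 1., 1.],
                  [0., 0., 1., 0.],
                  [1., 1., 1., 1.],
                  [0., 1., 0., 0.]])
    y = np.array([0, 1, 0, 1])

    def my_metric(a, b):
        # metric callable: two single samples in, one scalar distance out
        return float(np.sum(a != b))

    def my_weights(distances):
        # weights callable: an array of distances in, same-shape weights out
        return 1.0 / (1.0 + distances)

    clf = KNeighborsClassifier(n_neighbors=3, metric=my_metric, weights=my_weights)
    clf.fit(X, y)
    print(clf.predict(X[:1]))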