Re: [scikit-learn] Nearest neighbor search with 2 distance measures

2017-08-01 Thread Rohin Kumar
Since you seem to be from Astrophysics/Cosmology background (I am assuming you are jakevdp - the creator of astroML - if you are - I am lucky!), I can explain my application scenario. I am trying to calculate the anisotropic two-point correlation function something like done in rp_pi_tpcf

Re: [scikit-learn] Nearest neighbor search with 2 distance measures

2017-08-01 Thread Rohin Kumar
Dear Jake, Thanks for your response. I meant to group/count pairs in boxes (using two arrays simultaneously-hence needing 2 metrics) instead of one distance array as the binning parameter. I don't know if the algorithm supports such a thing. For now, I am proceeding with your suggestion of two bal
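For reference, the kind of two-metric pair counting described above can be sketched by brute force. This is only an illustrative sketch, not what scikit-learn or rp_pi_tpcf does: the helper names (`rp_pi`, `count_pairs_2d`), the midpoint line-of-sight convention, and the bin edges are all assumptions, and the O(N^2) loop is only practical for small samples.

```python
import bisect
import math


def rp_pi(p, q):
    """Return (rp, pi) for two 3-D points.

    Assumption: the line of sight is the direction of the pair midpoint,
    pi is the separation along it, and rp the separation perpendicular to it.
    """
    s = [a - b for a, b in zip(p, q)]            # separation vector
    l = [(a + b) / 2.0 for a, b in zip(p, q)]    # midpoint (assumed line of sight)
    l_norm = math.sqrt(sum(c * c for c in l))
    pi = abs(sum(a * b for a, b in zip(s, l))) / l_norm
    s2 = sum(c * c for c in s)
    rp = math.sqrt(max(s2 - pi * pi, 0.0))       # guard against tiny negative values
    return rp, pi


def bin_index(value, edges):
    """Index of the half-open bin [edges[k], edges[k+1]) holding value, else None."""
    k = bisect.bisect_right(edges, value) - 1
    return k if 0 <= k < len(edges) - 1 else None


def count_pairs_2d(points, rp_edges, pi_edges):
    """Count unique pairs in a 2-D grid of (rp, pi) bins by brute force."""
    counts = [[0] * (len(pi_edges) - 1) for _ in range(len(rp_edges) - 1)]
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            rp, pi = rp_pi(points[i], points[j])
            a, b = bin_index(rp, rp_edges), bin_index(pi, pi_edges)
            if a is not None and b is not None:
                counts[a][b] += 1
    return counts


points = [(0.0, 0.0, 10.0), (0.0, 0.0, 12.0), (3.0, 0.0, 10.0)]
counts = count_pairs_2d(points, rp_edges=[0, 1, 2, 3, 4], pi_edges=[0, 1, 2, 3])
```

Each pair lands in exactly one (rp, pi) box, which is the "two arrays simultaneously" binning that a single-distance tree query does not provide directly.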

[scikit-learn] question about class_weights in LogisticRegression

2017-08-01 Thread Johnson, Jeremiah
Hello all, I'm looking for confirmation on an implementation detail that is somewhere in liblinear, but I haven't found documentation for yet. When the class_weights='balanced' parameter is set in LogisticRegression, then the regularisation parameter for an observation from class I is class_wei

Re: [scikit-learn] question about class_weights in LogisticRegression

2017-08-01 Thread Stuart Reynolds
I hope not. And not according to the docs... https://github.com/scikit-learn/scikit-learn/blob/ab93d65/sklearn/linear_model/logistic.py#L947 class_weight : dict or 'balanced', optional Weights associated with classes in the form ``{class_label: weight}``. If not given, all classes are supposed to h
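As a small illustration of what the 'balanced' option produces, the documented heuristic is n_samples / (n_classes * count of each class). The sketch below reimplements it in plain Python; the helper name `balanced_class_weights` is hypothetical and is not the scikit-learn function itself.

```python
from collections import Counter


def balanced_class_weights(y):
    """Per-class weights via the 'balanced' heuristic:
    n_samples / (n_classes * number of samples in that class)."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Three samples of class 0 and one of class 1: the rare class gets the larger weight.
weights = balanced_class_weights([0, 0, 0, 1])
```

This only covers how the weights are computed, not how the solver uses them during the fit.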

Re: [scikit-learn] question about class_weights in LogisticRegression

2017-08-01 Thread Johnson, Jeremiah
Right, I know how the class_weight calculation is performed. But then those class weights are utilized during the model fit process in some way in liblinear, and that's what I am interested in. libSVM does class_weight[I] * C (https://www.csie.ntu.edu.tw/~cjlin/libsvm/); is the implementation in li
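To make the libSVM behaviour mentioned above concrete, here is a minimal sketch of scaling the penalty per class as class_weight[i] * C. The function name `effective_penalties` and the example values are hypothetical, and the sketch says nothing about whether liblinear does the same, which is the open question in this message.

```python
def effective_penalties(C, class_weight):
    """Per-class penalty under the libSVM-style rule: class_weight[i] * C."""
    return {label: weight * C for label, weight in class_weight.items()}


# With C = 0.5, a class weighted 2.0 is penalised with an effective C of 1.0.
penalties = effective_penalties(0.5, {0: 1.0, 1: 2.0})
```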

Re: [scikit-learn] Nearest neighbor search with 2 distance measures

2017-08-01 Thread Jacob Vanderplas
Hi Rohin, Ah, I see. I don't think a BallTree is the right data structure for an anisotropic N-point query, because it fundamentally assumes spherical symmetry of the metric. You may be able to do something like this with a specialized KD-tree, but scikit-learn doesn't support this, and I don't ima

Re: [scikit-learn] Nearest neighbor search with 2 distance measures

2017-08-01 Thread Rohin Kumar
Dear Jake, Thank you for your prompt reply. I started with KD-tree but after realising it doesn't support custom metrics (I don't know the reason for this - would be nice feature) I shifted to BallTree and was looking for a 2 metric based categorisation. After looking around, the best I could find

Re: [scikit-learn] Nearest neighbor search with 2 distance measures

2017-08-01 Thread Jacob Vanderplas
On Tue, Aug 1, 2017 at 10:50 AM, Rohin Kumar wrote:
> I started with KD-tree but after realising it doesn't support custom metrics (I don't know the reason for this - would be nice feature)
The scikit-learn KD-tree doesn't support custom metrics because it utilizes relatively strong assumpti