Re: [Scikit-learn-general] Pearson Correlation Similarity Measure

2015-03-23 Thread Michael Eickenberg
On Monday, March 23, 2015, Gael Varoquaux gael.varoqu...@normalesup.org wrote: On Mon, Mar 23, 2015 at 10:27:00AM +0530, Vinayak Mehta wrote: I believe that it is the same thing as cosine similarity. If that's indeed the case, you could add a note in the cosine similarity docstring to

Re: [Scikit-learn-general] Pearson Correlation Similarity Measure

2015-03-23 Thread Mathieu Blondel
The cosine similarity and Pearson correlation are the same if the data is centered but are different in general. The routine in SciPy is between two vectors; metrics in scikit-learn are between matrices. So +1 to add Pearson correlation to scikit-learn. On Mon, Mar 23, 2015 at 3:24 PM, Gael
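The relationship described in this message can be checked numerically. The following is a minimal NumPy sketch, not scikit-learn code: the function names `cosine_similarity`, `pearson_correlation` and `pairwise_pearson` are hypothetical helpers introduced only for illustration. It assumes that "centered" means subtracting each vector's own mean, and the pairwise version assumes one sample per row, as a matrix-based metric would.

```python
import numpy as np


def cosine_similarity(x, y):
    # Scalar product of the two vectors after scaling each to unit length.
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))


def pearson_correlation(x, y):
    # Pearson correlation is cosine similarity applied to mean-centered vectors.
    return cosine_similarity(x - x.mean(), y - y.mean())


def pairwise_pearson(X, Y):
    # Hypothetical matrix version: correlation between every row of X and
    # every row of Y. Rows with zero variance would divide by zero here.
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    Xn = Xc / np.linalg.norm(Xc, axis=1, keepdims=True)
    Yn = Yc / np.linalg.norm(Yc, axis=1, keepdims=True)
    return Xn @ Yn.T


x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.0, 4.0, 3.0])

cos_xy = cosine_similarity(x, y)          # uncentered data: 28/30
pearson_xy = pearson_correlation(x, y)    # centered data: 0.6
reference = np.corrcoef(x, y)[0, 1]       # NumPy's own Pearson correlation
pairwise = pairwise_pearson(np.vstack([x, y]), np.vstack([x, y]))
```

On this uncentered example the two measures differ (about 0.93 versus 0.6), while the centered cosine agrees with `np.corrcoef`.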

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-23 Thread Andreas Mueller
On 03/21/2015 08:54 PM, Artem wrote: Are there any objections on Joel's variant of y? It serves my needs, but is quite different from what one can usually find in scikit-learn. -- Another point I want to bring up is metric-aware KMeans. Currently it works with Euclidean distance only,

Re: [Scikit-learn-general] [GSoC 2015] Cross-validation and Meta-Estimators for semi-supervised learning

2015-03-23 Thread Andreas Mueller
Have you had a look at the issues tagged easy? On 03/22/2015 05:47 PM, Boyuan Deng wrote: Hi all: This is the link to my proposal for the Cross-validation and Meta-estimators for Semi-supervised Learning topic:

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-23 Thread Andreas Mueller
On 03/22/2015 07:57 PM, Raghav R V wrote: 2. Given that there is a huge interest among students in learning about ML, do you think it would be within the scope of/beneficial to skl to have all the exercises and/or concepts, from a good quality book (ESL / PRML / Murphy) or an academic

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-23 Thread Matthieu Brucher
For practical purposes, I currently know of 2 (3?) sklearn books published with PACKT. There is also an O'Reilly book coming up: http://shop.oreilly.com/product/0636920030515.do 2 general books, 1 cookbook and I think there is another one half-written as well. Didn't know about O'Reilly, good

Re: [Scikit-learn-general] Pearson Correlation Similarity Measure

2015-03-23 Thread Vinayak Mehta
@Gael I believe that it is the same thing as cosine similarity. If that's indeed the case, you could add a note in the cosine similarity docstring to stress it. I think it is somewhat different from cosine similarity. Then you'll have to tell me how, because I am being dense and I

Re: [Scikit-learn-general] Pearson Correlation Similarity Measure

2015-03-23 Thread Gael Varoquaux
On Mon, Mar 23, 2015 at 07:56:33AM +0100, Michael Eickenberg wrote: I think it is somewhat different from cosine similarity. Then you'll have to tell me how, because I am being dense and I don't see the difference. Both are scalar products of two normalized data vectors.

Re: [Scikit-learn-general] Pearson Correlation Similarity Measure

2015-03-23 Thread Boyuan Deng
Hi Vinayak: scipy.stats implemented pearsonr() like that because it's a statistics routine. It treats a 0 in the input data as an actual value of 0. But in the context of recommender systems, unrated is different from a score of 0 (though we usually use 0 to represent unrated when score must be
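To illustrate the distinction drawn in this message, here is a minimal sketch of a Pearson correlation restricted to items that both users have rated. It is an assumption about what a recommender-style variant might look like, not the behaviour of `scipy.stats.pearsonr` or of any scikit-learn routine; the helper name `pearson_co_rated` and the convention that 0 marks an unrated item are hypothetical choices made for the example.

```python
import numpy as np


def pearson_co_rated(u, v, unrated=0):
    # Hypothetical helper: correlate two rating vectors using only the items
    # that both users rated, instead of treating "unrated" as a score of 0.
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    mask = (u != unrated) & (v != unrated)
    if mask.sum() < 2:
        return float("nan")  # too few co-rated items to correlate
    uc = u[mask] - u[mask].mean()
    vc = v[mask] - v[mask].mean()
    denom = np.linalg.norm(uc) * np.linalg.norm(vc)
    if denom == 0:
        return float("nan")  # one user gave identical scores to all co-rated items
    return float(np.dot(uc, vc) / denom)


alice = [5, 3, 0, 4]  # 0 marks an unrated item (assumed convention)
bob = [4, 2, 5, 0]

masked = pearson_co_rated(alice, bob)        # uses only items 0 and 1
naive = float(np.corrcoef(alice, bob)[0, 1])  # treats each 0 as a real score
```

In this example the two users agree on every item they both rated, so the masked correlation is 1, whereas the naive computation comes out negative because the zeros are read as very low scores. Centering on the co-rated items only is one design choice; centering on each user's mean over all rated items is another common variant.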

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-23 Thread Andreas Mueller
Can you please also upload it to melange? On 03/22/2015 08:52 PM, Raghav R V wrote: 2 things: * The subject should have been "Multiple Metric Support in grid_search and cross_validation modules and other general improvements" and not "multiple metric learning"! Sorry for that! * The link was

Re: [Scikit-learn-general] Question regarding the list of topics for GSoC 2015

2015-03-23 Thread Andreas Mueller
Hi Vinayak. Have you decided on your application topic? I am trying to get a bit of an overview, and I think you haven't submitted anything yet. There are two other applications for the hyperparameter topic and one for the cross-validation and gridsearch improvements. Since Ragv is already

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-23 Thread Andreas Mueller
Hi Christof. Can you please also post it on melange? Reviews will be coming soon ;) Andy On 03/19/2015 05:12 PM, Christof Angermueller wrote: Hi All, you can find my proposal for the hyperparameter optimization topic here: * http://goo.gl/XHuav8 *

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-23 Thread Artem
The theoretical justification for using kernel PCA is that the data needs to be projected onto the span of the eigenvectors of a covariance matrix (section 3.1.4 of Kulis' survey http://web.cse.ohio-state.edu/~kulis/pubs/ftml_metric_learning.pdf). Does kernel approximation whiten the data? Either way,

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-23 Thread Raghav R V
Thanks for all the good comments!! I'll replace that section of my proposal with some other more important work! :) On Mon, Mar 23, 2015 at 7:53 PM, Matthieu Brucher matthieu.bruc...@gmail.com wrote: For practical purposes, I currently know of 2 (3?) sklearn books published with PACKT.

[Scikit-learn-general] Kernel PCA .fit() Failing Silently

2015-03-23 Thread Stephen O'Neill
Hi Sklearn, I'm using Kernel PCA with the rbf kernel for projecting data into 3 dimensions for viewing alongside normal PCA and a stereographic projection class that I wrote myself. Both the PCA and SGP classes seem to be functioning correctly on this data set, but when I get to the .fit()

Re: [Scikit-learn-general] update liblinear

2015-03-23 Thread Andreas Mueller
I am not aware of anyone tracking liblinear. There is certainly no automatic update. On 03/23/2015 08:05 PM, Charles Martin wrote: On liblinear--can you clarify for me how you incorporate updates from the main site? Do you make an effort to stay up to date with latest changes directly by

Re: [Scikit-learn-general] Question regarding the list of topics for GSoC 2015

2015-03-23 Thread Vlad Niculae
Hi Vinayak, The wiki page just lists a subset of possible topics for which candidates already showed concrete interest. I think an application for low-rank matrix completion would be more than welcome. It’s very important to work on a topic that you are interested in directly, versus just

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-23 Thread Artem
Hi Aurélien, thanks for your comments! Can you say anything on kernelization as part of a model, not KPCA? I'm especially interested in a kernelized version of ITML. I think kernel metric learning methods don't scale well, since one has to work with a huge matrix of size n_samples x n_samples, which

Re: [Scikit-learn-general] Question regarding the list of topics for GSoC 2015

2015-03-23 Thread Artem
It's worth noting that there was a similar project https://github.com/scikit-learn/scikit-learn/pull/2387 2 years ago, but unfortunately it wasn't completed. I did some work on top of that, but I didn't get any feedback. On Tue, Mar 24, 2015 at 3:23 AM, Vlad Niculae zephy...@gmail.com wrote: Hi

Re: [Scikit-learn-general] Question regarding the list of topics for GSoC 2015

2015-03-23 Thread Vlad Niculae
Very good points, Artem! The PR you link to contains important discussion on API issues. I’m sorry I missed your PR. On 23 Mar 2015, at 20:33, Artem barmaley@gmail.com wrote: It's worth noting that there was a similar project 2 years ago, but unfortunately it wasn't completed. I made