Re: [Scikit-learn-general] GraphLasso

2012-02-20 Thread Olivier Grisel
2012/2/20 David Warde-Farley : > On 2012-02-20, at 3:46 AM, Olivier Grisel wrote: > >> How high dimensional is this? GraphLasso works on the empirical >> covariance matrix which is implemented as a 2D numpy array with shape >> (n_features, n_features). It won't fit in memory for n_features > >> 10

Re: [Scikit-learn-general] GraphLasso

2012-02-20 Thread David Warde-Farley
On 2012-02-20, at 3:46 AM, Olivier Grisel wrote: > How high dimensional is this? GraphLasso works on the empirical > covariance matrix which is implemented as a 2D numpy array with shape > (n_features, n_features). It won't fit in memory for n_features > > 1 and it might be intractably too lo
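As a rough illustration of the memory argument quoted above, the sketch below estimates the size of a dense (n_features, n_features) array. The helper name `dense_covariance_bytes` is made up for this example, and the 8-byte item size assumes float64 entries; actual usage during a fit would likely be higher because of intermediate copies.

```python
def dense_covariance_bytes(n_features, itemsize=8):
    """Bytes needed for one dense (n_features, n_features) array.

    itemsize=8 assumes float64 entries (an assumption of this sketch).
    """
    return n_features * n_features * itemsize


# Print the footprint for a few illustrative feature counts.
for n_features in (1_000, 10_000, 100_000):
    gib = dense_covariance_bytes(n_features) / 2**30
    print(f"n_features={n_features:>7}: {gib:10.3f} GiB")
```

The quadratic growth is the point: multiplying the number of features by 10 multiplies the storage by 100.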

Re: [Scikit-learn-general] GraphLasso

2012-02-20 Thread Mathias Verbeke
Thanks! And does it make sense to use L1 regularisation here (irrespective of the graph structure)? Best, Mathias On Mon, Feb 20, 2012 at 11:06 AM, Gael Varoquaux wrote: > On Mon, Feb 20, 2012 at 10:35:51AM +0100, Mathias Verbeke wrote: > > I would have around

Re: [Scikit-learn-general] GraphLasso

2012-02-20 Thread Gael Varoquaux
On Mon, Feb 20, 2012 at 10:35:51AM +0100, Mathias Verbeke wrote: > I would have around 1 features. I'm working on a sentence > classification problem, Graph lasso won't work on such a problem. > I would like to do feature selection, to reduce the number of > dimensions, and was thinking to ta

Re: [Scikit-learn-general] GraphLasso

2012-02-20 Thread Mathias Verbeke
Hi Olivier, Thanks for the fast reply! > I have a high-dimensional feature set, where the features originate from > > graphs. I was wondering if the use of GraphLasso applies and would be a > good > > idea in this case? And if it would be, can I then just apply it on the > > feature vectors or do

Re: [Scikit-learn-general] GraphLasso

2012-02-20 Thread Olivier Grisel
2012/2/20 Mathias Verbeke : > Hi all, > > I have a high-dimensional feature set, where the features originate from > graphs. I was wondering if the use of GraphLasso applies and would be a good > idea in this case? And if it would be, can I then just apply it on the > feature vectors or do I need t

[Scikit-learn-general] GraphLasso

2012-02-20 Thread Mathias Verbeke
Hi all, I have a high-dimensional feature set, where the features originate from graphs. I was wondering if the use of GraphLasso applies and would be a good idea in this case? And if it would be, can I then just apply it on the feature vectors or do I need to input the originating graph structure

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-11 Thread Mathieu Blondel
On Sat, Nov 12, 2011 at 12:28 AM, Gael Varoquaux wrote: > I don't like the results as much. Basically, we can either cluster on > conditional relations, or on marginal relations: basically, as an input > of affinity propagation, we can use the correlation matrix, which is a > standard affinity me
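To make the distinction between marginal and conditional relations concrete, here is a minimal NumPy sketch. It is not the code from the pull request: the helper `partial_correlations` is a name invented for this example, the data are synthetic, and the precision matrix comes from a plain matrix inverse rather than from GraphLasso, which would give a sparse estimate instead.

```python
import numpy as np


def partial_correlations(precision):
    """Turn a precision (inverse covariance) matrix into partial correlations.

    Uses the standard identity rho_ij = -P_ij / sqrt(P_ii * P_jj).
    """
    d = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(d, d)
    np.fill_diagonal(partial, 1.0)
    return partial


# Synthetic chain: feature 0 drives feature 1, which drives feature 2.
rng = np.random.RandomState(0)
X = rng.randn(200, 4)
X[:, 1] += X[:, 0]
X[:, 2] += X[:, 1]

emp_cov = np.cov(X, rowvar=False)
marginal = np.corrcoef(X, rowvar=False)          # marginal relations
conditional = partial_correlations(np.linalg.inv(emp_cov))  # conditional relations

print("marginal    corr(0, 2):", marginal[0, 2])
print("conditional corr(0, 2):", conditional[0, 2])
```

In this toy chain, features 0 and 2 are clearly correlated marginally, but the link largely disappears once feature 1 is conditioned on. Either matrix could in principle serve as an affinity; they answer different questions.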

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-11 Thread Mathieu Blondel
On Sat, Nov 12, 2011 at 12:33 AM, Olivier Grisel wrote: > Indeed I thought the same, but as the current result is already good / > interesting ... We can see that Yahoo is clustered with Amazon and Apple, although it would have been clustered with Cisco, Dell and IBM if the clustering had been m

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-11 Thread Olivier Grisel
2011/11/11 Mathieu Blondel : > GraphLasso seems really neat (and the associated CV object should > prove very useful). > > I had a look at the stock market example but I am a bit confused by > the fact that the clustering, graph structure and 2d-embedding seemed > to be learned independently althou

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-11 Thread Gael Varoquaux
On Sat, Nov 12, 2011 at 12:19:37AM +0900, Mathieu Blondel wrote: > GraphLasso seems really neat (and the associated CV object should > prove very useful). Thanks! > I see that the (dense) correlation matrix is used as input to affinity > propagation. Wouldn't it be better if we used the partial c

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-11 Thread Mathieu Blondel
GraphLasso seems really neat (and the associated CV object should prove very useful). I had a look at the stock market example but I am a bit confused by the fact that the clustering, graph structure and 2d-embedding seemed to be learned independently although they are clearly related problems. I

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Gael Varoquaux
On Thu, Nov 10, 2011 at 12:06:15AM +0100, Gael Varoquaux wrote: > On Wed, Nov 09, 2011 at 11:43:40PM +0100, bthirion wrote: > > > What do people think? Should I: > > > 1. change graph_lasso to take the empirical covariance as an input > > > 2. add an 'X_is_cov' parameter to the estimators > >

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Gael Varoquaux
On Wed, Nov 09, 2011 at 09:10:46PM -0500, Alexandre Gramfort wrote: > to complexify a bit the pb note that in the SVM/Lasso/... case the > precomputed gram > is np.dot(X, X.T) which means that the cross-val can be done just with it > [...] OK, how about that: I try to solve only the situation for

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Alexandre Gramfort
indeed this is more correct... > - data / design matrix, shape (n_samples, n_features) > - kernel / Gram / similarity / affinity / connectivity, shape > (n_samples, n_samples) > - distance, shape (n_samples, n_samples) (same shape as kernel but > opposite semantics) > - covariance, shape (n_featur

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Olivier Grisel
2011/11/9 Alexandre Gramfort : > to complexify a bit the pb note that in the SVM/Lasso/... case the > precomputed gram > is np.dot(X, X.T) which means that the cross-val can be done just with it > while for the covariance estimation, like GraphLassoCV, the empirical > covariance is np.dot(X.T, X) h

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Alexandre Gramfort
to complexify a bit the pb note that in the SVM/Lasso/... case the precomputed gram is np.dot(X, X.T) which means that the cross-val can be done just with it while for the covariance estimation, like GraphLassoCV, the empirical covariance is np.dot(X.T, X) hence the fit needs X as input. so it see
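The shape argument above can be sketched with NumPy as follows. The array sizes and the `train` index set are arbitrary choices for illustration, and the uncentered, unnormalized product `np.dot(X.T, X)` stands in for the empirical covariance (which would also involve centering and a 1/n_samples factor).

```python
import numpy as np

rng = np.random.RandomState(0)
n_samples, n_features = 6, 3
X = rng.randn(n_samples, n_features)

gram = np.dot(X, X.T)      # kernel / Gram matrix, shape (n_samples, n_samples)
scatter = np.dot(X.T, X)   # covariance-like matrix, shape (n_features, n_features)

# A hypothetical cross-validation training fold, given as sample indices.
train = np.array([0, 1, 2, 3])

# The Gram matrix of the fold is just a sub-block of the full Gram matrix,
# so cross-validation can work from the precomputed matrix alone.
gram_train = gram[np.ix_(train, train)]
assert np.allclose(gram_train, np.dot(X[train], X[train].T))

# The covariance-like matrix has no sample axis to index into:
# the fold's version has to be recomputed from the rows of X.
scatter_train = np.dot(X[train].T, X[train])

print(gram.shape, scatter.shape, scatter_train.shape)
```

This is why a precomputed Gram matrix is enough for cross-validating kernel methods, while a cross-validated covariance estimator needs the data matrix itself.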

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Gael Varoquaux
On Wed, Nov 09, 2011 at 11:43:40PM +0100, bthirion wrote: > > What do people think? Should I: > > 1. change graph_lasso to take the empirical covariance as an input > > 2. add an 'X_is_cov' parameter to the estimators > +1 for the second one. I actually was suggesting both, and 1 as a mean f

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread bthirion
> What do people think? Should I: > > 1. change graph_lasso to take the empirical covariance as an input > > 2. add an 'X_is_cov' parameter to the estimators +1 for the second one. If we want to introduce some kind of automated guess of the regularization parameter, we'll have to know the dim

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread josef . pktd
On Wed, Nov 9, 2011 at 12:20 PM, Virgile Fritsch wrote: > Reminds me of the PR by Robert about performing clustering from similarity > matrix or directly from the data. > So I would be in favour of having an X_is_cov keyword. > > Sorry for biasing the discussion with cov_init, I answered too quickly

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Virgile Fritsch
Reminds me of the PR by Robert about performing clustering from similarity matrix or directly from the data. So I would be in favour of having an X_is_cov keyword. Sorry for biasing the discussion with cov_init, I answered too quickly ;) On Wed, Nov 9, 2011 at 5:16 PM, Gael Varoquaux < gael.varoqu..

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Gael Varoquaux
On Wed, Nov 09, 2011 at 10:05:53AM -0500, [email protected] wrote: > graph_lasso(X,) takes the data array as an argument, but except > calculating the empirical_covariance at the beginning X is not used > anymore, as far as I could see. > The algorithm looks very interesting, but I would ha

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread josef . pktd
On Wed, Nov 9, 2011 at 10:59 AM, Lars Buitinck wrote: > 2011/11/9 Virgile Fritsch : >> Did you notice the `cov_init` parameter?, or maybe it was added after your >> comment? > > OOPS, sorry. In my reading of the code cov_init is just the starting matrix, the updating is still based on emp_cov. J

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Lars Buitinck
2011/11/9 Virgile Fritsch : > Did you notice the `cov_init` parameter?, or maybe it was added after your > comment? OOPS, sorry. -- Lars Buitinck Scientific programmer, ILPS University of Amsterdam

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Virgile Fritsch
Did you notice the `cov_init` parameter?, or maybe it was added after your comment? On Wed, Nov 9, 2011 at 4:13 PM, Lars Buitinck wrote: > 2011/11/9 : > > The graph looks very good, good advertising. > > Yep, pretty picture ;) > > > graph_lasso(X,) takes the data array as an argument, but

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Lars Buitinck
2011/11/9 : > The graph looks very good, good advertising. Yep, pretty picture ;) > graph_lasso(X,) takes the data array as an argument, but except > calculating the empirical_covariance at the beginning X is not used > anymore, as far as I could see. > > The algorithm looks very interesting

Re: [Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread josef . pktd
On Wed, Nov 9, 2011 at 6:21 AM, Gael Varoquaux wrote: > Hi list, > > I'd like to ask for comments on the GraphLasso pull request that I have > put in. I think that it is ready for merge, even though it has been in > development for a short amount of time, because I have been working on > similar a

[Scikit-learn-general] GraphLasso pull request and feature

2011-11-09 Thread Gael Varoquaux
Hi list, I'd like to ask for comments on the GraphLasso pull request that I have put in. I think that it is ready for merge, even though it has been in development for a short amount of time, because I have been working on similar algorithms for more than two years. To give you a run-through, and