Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread Vlad Niculae
On Tue, Dec 6, 2011 at 5:57 AM, Gael Varoquaux wrote: > On Mon, Dec 05, 2011 at 11:21:01PM +0100, Andreas Mueller wrote: >> What you do want is to "transform" the new data so that it >> is coded using the specified dictionary. >> I think this is exactly what the sparse encoding method that Olivier

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-05 Thread Gael Varoquaux
On Mon, Dec 05, 2011 at 01:41:53PM -0500, Alexandre Passos wrote: > On Mon, Dec 5, 2011 at 13:31, James Bergstra wrote: > > I should probably not have scared ppl off speaking of a 250-job > > budget.  My intuition would be that with 2-8 hyper-parameters, and 1-3 > > "significant" hyper-parameters,

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread Gael Varoquaux
On Mon, Dec 05, 2011 at 11:21:01PM +0100, Andreas Mueller wrote: > What you do want is to "transform" the new data so that it > is coded using the specified dictionary. > I think this is exactly what the sparse encoding method that Olivier > referenced is doing. We would need an intermediate objec

Re: [Scikit-learn-general] Memory consumption of LinearSVC.fit

2011-12-05 Thread Gael Varoquaux
On Mon, Dec 05, 2011 at 10:54:42PM +0100, Olivier Grisel wrote: > - libsvm uses SMO (a dual solver) and supports non-linear kernels and > has complexity ~ n_samples^3 hence cannot scale to large n_samples > (e.g. more than 50k). > - liblinear uses some kind of fancy coordinate descent (primal or du

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread David Warde-Farley
On 2011-12-05, at 5:50 PM, Ian Goodfellow wrote: > > I think I was mostly confused by the terminology-- I don't consider the code > to be part of a sparse coding model, nor to be estimated (I am aware that > sparse coding involves iterative optimization but I don't consider the > optimizer > to b

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread Ian Goodfellow
On Mon, Dec 5, 2011 at 5:31 PM, Olivier Grisel wrote: > 2011/12/5 Andreas Mueller : >> On 12/05/2011 11:14 PM, Alexandre Gramfort wrote: I do not understand. I have the dictionary already, so what is being estimated? >>> well I am not sure to follow now, but if you have the dictionary t

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread Olivier Grisel
2011/12/5 Andreas Mueller : > On 12/05/2011 11:14 PM, Alexandre Gramfort wrote: >>> I do not understand. I have the dictionary already, so what is being >>> estimated? >> well I am not sure to follow now, but if you have the dictionary the >> only missing part is the coefs of the decomposition. >>

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread Andreas Mueller
On 12/05/2011 11:14 PM, Alexandre Gramfort wrote: >> I do not understand. I have the dictionary already, so what is being >> estimated? > well I am not sure to follow now, but if you have the dictionary the > only missing part is the coefs of the decomposition. > > X = dico x coefs I think there i

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread Alexandre Gramfort
> I do not understand. I have the dictionary already, so what is being > estimated? well I am not sure to follow now, but if you have the dictionary the only missing part is the coefs of the decomposition. X = dico x coefs Alex

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread Olivier Grisel
2011/12/5 Ian Goodfellow : > On Mon, Dec 5, 2011 at 4:50 PM, Alexandre Gramfort > wrote: >>> One experiment I want to do involves plugging in dictionaries that >>> were learned with other methods. >> >> if you have the dictionaries then use a batch lasso or batch OMP with >> precomputed gram to ge

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread Ian Goodfellow
On Mon, Dec 5, 2011 at 4:50 PM, Alexandre Gramfort wrote: >> One experiment I want to do involves plugging in dictionaries that >> were learned with other methods. > > if you have the dictionaries then use a batch lasso or batch OMP with > precomputed gram to get the coefficients. That will give y

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-05 Thread Olivier Grisel
2011/12/5 Alexandre Passos : > On Mon, Dec 5, 2011 at 16:26, James Bergstra wrote: >> >> This is definitely a good idea. I think randomly sampling is still >> useful though. It is not hard to get into settings where the grid is >> in theory very large and the user has a budget that is a tiny fract

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread Olivier Grisel
2011/12/5 Alexandre Gramfort : >> One experiment I want to do involves plugging in dictionaries that >> were learned with other methods. > > if you have the dictionaries then use a batch lasso or batch OMP with > precomputed gram to get the coefficients. That will give you the full > estimated mode

Re: [Scikit-learn-general] Memory consumption of LinearSVC.fit

2011-12-05 Thread Olivier Grisel
2011/12/5 Alexandre Gramfort : > look at > > sklearn.multiclass Indeed, these tools allow the user to build a meta-learner with any multiclass logic on top of a binary classifier implementation (hence both LinearSVC and SVC can be used as the underlying binary classifier implementation). htt
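
The meta-learner idea in this message can be sketched as follows. This is only a minimal illustration, assuming a recent scikit-learn where `sklearn.multiclass.OneVsRestClassifier` is available; the iris data is a stand-in chosen for the example, not something from the thread.

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Small three-class dataset, used here purely for illustration.
X, y = load_iris(return_X_y=True)

# Wrap a binary classifier: one LinearSVC is fitted per class,
# each separating that class from all the others.
ovr = OneVsRestClassifier(LinearSVC(random_state=0)).fit(X, y)

# One underlying binary estimator per class.
print(len(ovr.estimators_))

predictions = ovr.predict(X)
```

Swapping `LinearSVC` for `SVC` in the wrapper should give one-against-rest behaviour on top of the kernel SVM as well, since the wrapper only relies on the underlying estimator's binary interface.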

Re: [Scikit-learn-general] Memory consumption of LinearSVC.fit

2011-12-05 Thread Olivier Grisel
2011/12/5 Ian Goodfellow : > > ok, I was using LinearSVC, so I guess I am still not using the dense > implementation. > > Is there a way to use one-against-rest rather than one-against-many > classification with the SVC class? What is one-against-many? SVC multiclass support comes directly from th

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread Alexandre Gramfort
> One experiment I want to do involves plugging in dictionaries that > were learned with other methods. if you have the dictionaries then use a batch lasso or batch OMP with precomputed gram to get the coefficients. That will give you the full estimated model. Alex
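
A rough sketch of the suggestion above (batch OMP against a fixed dictionary, with a precomputed Gram matrix). It assumes a recent scikit-learn that provides `sklearn.decomposition.sparse_encode`; the random dictionary and data, and the choice of 3 non-zero coefficients, are made-up placeholders rather than anything from the thread.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.RandomState(0)

# Stand-in for a dictionary learned with some other method:
# 15 atoms of dimension 8, each scaled to unit L2 norm.
dictionary = rng.randn(15, 8)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

# New data to encode: 20 samples of dimension 8.
X = rng.randn(20, 8)

# Gram matrix of the atoms, computed once and reusable across batches.
gram = dictionary @ dictionary.T

# Batch OMP: at most 3 non-zero coefficients per sample.
codes = sparse_encode(
    X, dictionary, gram=gram, algorithm="omp", n_nonzero_coefs=3
)

# In this row-wise convention the model reads X ~= codes @ dictionary.
reconstruction = codes @ dictionary
print(codes.shape)
```

Nothing is learned here: the dictionary stays fixed and only the coefficients are computed, which is the "missing part" discussed in the thread.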

Re: [Scikit-learn-general] Memory consumption of LinearSVC.fit

2011-12-05 Thread Alexandre Gramfort
look at sklearn.multiclass Alex On Mon, Dec 5, 2011 at 10:37 PM, Ian Goodfellow wrote: > On Mon, Dec 5, 2011 at 4:24 PM, Olivier Grisel > wrote: >> 2011/12/5 Ian Goodfellow : >>> On Fri, Dec 2, 2011 at 3:36 AM, Olivier Grisel >>> wrote: 2011/12/2 Ian Goodfellow : > On Fri, Oct 7, 2

[Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-05 Thread Ian Goodfellow
I'm interested in doing sparse coding with scikits.learn. It looks like the way to do this is with sklearn.decomposition.MiniBatchDictionaryLearning. Am I correct about that? If so: One experiment I want to do involves plugging in dictionaries that were learned with other methods. I thought I coul

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-05 Thread Alexandre Passos
On Mon, Dec 5, 2011 at 16:26, James Bergstra wrote: > > This is definitely a good idea. I think randomly sampling is still > useful though. It is not hard to get into settings where the grid is > in theory very large and the user has a budget that is a tiny fraction > of the full grid. I'd like t

Re: [Scikit-learn-general] Memory consumption of LinearSVC.fit

2011-12-05 Thread Ian Goodfellow
On Mon, Dec 5, 2011 at 4:24 PM, Olivier Grisel wrote: > 2011/12/5 Ian Goodfellow : >> On Fri, Dec 2, 2011 at 3:36 AM, Olivier Grisel >> wrote: >>> 2011/12/2 Ian Goodfellow : On Fri, Oct 7, 2011 at 5:14 AM, Olivier Grisel wrote: > 2011/10/7 Ian Goodfellow : >> Thanks. Yes it d

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-05 Thread James Bergstra
On Mon, Dec 5, 2011 at 1:41 PM, Alexandre Passos wrote: > On Mon, Dec 5, 2011 at 13:31, James Bergstra wrote: >> I should probably not have scared ppl off speaking of a 250-job >> budget.  My intuition would be that with 2-8 hyper-parameters, and 1-3 >> "significant" hyper-parameters, randomly sa

Re: [Scikit-learn-general] Memory consumption of LinearSVC.fit

2011-12-05 Thread Olivier Grisel
2011/12/5 Ian Goodfellow : > On Fri, Dec 2, 2011 at 3:36 AM, Olivier Grisel > wrote: >> 2011/12/2 Ian Goodfellow : >>> On Fri, Oct 7, 2011 at 5:14 AM, Olivier Grisel >>> wrote: 2011/10/7 Ian Goodfellow : > Thanks. Yes it does appear that liblinear uses only a 64 bit dense format, >

Re: [Scikit-learn-general] ensemble not in setup.py

2011-12-05 Thread Alexandre Gramfort
add it to master Alex On Mon, Dec 5, 2011 at 10:16 PM, Satrajit Ghosh wrote: > hi fabian, > > 'ensemble' not in sklearn/setup.py. > > config.add_subpackage("ensemble") > config.add_subpackage("ensemble/tests") > > for something like this should i just add it in master? or send a pull > request.

[Scikit-learn-general] ensemble not in setup.py

2011-12-05 Thread Satrajit Ghosh
hi fabian, 'ensemble' not in sklearn/setup.py. config.add_subpackage("ensemble") config.add_subpackage("ensemble/tests") for something like this should i just add it in master? or send a pull request. cheers, satra

Re: [Scikit-learn-general] Memory consumption of LinearSVC.fit

2011-12-05 Thread Alexandre Gramfort
hello ian, can you show a snippet of the code you use to train your svm? and give us the dimensions of your problem? Alex On Mon, Dec 5, 2011 at 9:51 PM, Ian Goodfellow wrote: > On Fri, Dec 2, 2011 at 3:36 AM, Olivier Grisel > wrote: >> 2011/12/2 Ian Goodfellow : >>> On Fri, Oct 7, 2011 at 5:

Re: [Scikit-learn-general] Memory consumption of LinearSVC.fit

2011-12-05 Thread Ian Goodfellow
On Fri, Dec 2, 2011 at 3:36 AM, Olivier Grisel wrote: > 2011/12/2 Ian Goodfellow : >> On Fri, Oct 7, 2011 at 5:14 AM, Olivier Grisel >> wrote: >>> 2011/10/7 Ian Goodfellow : Thanks. Yes it does appear that liblinear uses only a 64 bit dense format, so this memory usage is normal/caused

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-05 Thread Alexandre Passos
On Mon, Dec 5, 2011 at 13:44, Olivier Grisel wrote: > Yes. +1 for a pull request: one could just add a "budget" integer > argument (None by default) to the existing GridSearchCV class. Just did that, the pull request is at https://github.com/scikit-learn/scikit-learn/pull/455 So far no tests. Ho
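
The "budget" idea can be illustrated without any library support. The sketch below is not the code from the pull request; `sample_grid` is a hypothetical helper written for this example, and the parameter values are arbitrary. It draws at most `budget` distinct points at random from the full grid, and falls back to the whole grid when `budget` is `None`.

```python
import itertools
import random


def sample_grid(param_grid, budget=None, seed=0):
    """Return up to `budget` distinct grid points, drawn at random.

    Hypothetical helper for illustration only; not a scikit-learn function.
    """
    names = sorted(param_grid)
    # Enumerate the full Cartesian product of parameter values.
    full_grid = [
        dict(zip(names, values))
        for values in itertools.product(*(param_grid[name] for name in names))
    ]
    if budget is None or budget >= len(full_grid):
        return full_grid
    # Sample without replacement so no configuration is evaluated twice.
    return random.Random(seed).sample(full_grid, budget)


param_grid = {
    "C": [0.01, 0.1, 1, 10, 100],
    "gamma": [1e-4, 1e-3, 1e-2, 1e-1],
    "kernel": ["rbf", "poly"],
}

# 40 grid points in total, but only 10 are evaluated.
candidates = sample_grid(param_grid, budget=10)
for params in candidates:
    print(params)
```

Enumerating the full grid before sampling is only reasonable for small grids; for very large ones, drawing each parameter value independently would avoid materialising the product.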

[Scikit-learn-general] next release

2011-12-05 Thread Fabian Pedregosa
Dear scikit-learners, It's about time for a new release. This month of December is rather busy with the NIPS conference and the coding sprint happening [0] so I propose to make the release just after holidays, during the first weeks of January. That should give us enough time to test and stabilize

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-05 Thread Alexandre Passos
On Mon, Dec 5, 2011 at 14:19, Andreas Müller wrote: > on a related note: what about coarse to fine grid-searches? > For categorical variables, that doesn't make much sense but > I think it does for many of the numerical variables. Coarse-to-fine grid searches (where you expand search in regions ne

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-05 Thread Andreas Müller
On 12/05/2011 07:44 PM, Olivier Grisel wrote: > 2011/12/5 Alexandre Passos: >> On Mon, Dec 5, 2011 at 13:31, James Bergstra >> wrote: >>> I should probably not have scared ppl off speaking of a 250-job >>> budget. My intuition would be that with 2-8 hyper-parameters, and 1-3 >>> "significant" hy

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-05 Thread Olivier Grisel
2011/12/5 Alexandre Passos : > On Mon, Dec 5, 2011 at 13:31, James Bergstra wrote: >> I should probably not have scared ppl off speaking of a 250-job >> budget.  My intuition would be that with 2-8 hyper-parameters, and 1-3 >> "significant" hyper-parameters, randomly sampling around 10-30 points >

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-05 Thread Alexandre Passos
On Mon, Dec 5, 2011 at 13:31, James Bergstra wrote: > I should probably not have scared ppl off speaking of a 250-job > budget.  My intuition would be that with 2-8 hyper-parameters, and 1-3 > "significant" hyper-parameters, randomly sampling around 10-30 points > should be pretty reliable. So pe

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-05 Thread James Bergstra
I should probably not have scared ppl off speaking of a 250-job budget. My intuition would be that with 2-8 hyper-parameters, and 1-3 "significant" hyper-parameters, randomly sampling around 10-30 points should be pretty reliable. - James On Mon, Dec 5, 2011 at 1:28 PM, James Bergstra wrote: >

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-05 Thread James Bergstra
On Sat, Dec 3, 2011 at 6:32 AM, Olivier Grisel wrote: >> With regards to the random sampling, I am a bit worried that the results >> hold for a fair amount of points, and with a small amount of points >> (which is typically the situation in which many of us hide) it becomes >> very sensitive to th

Re: [Scikit-learn-general] Can linearSVM compute the confidence score of the class label ?

2011-12-05 Thread Peter Prettenhofer
If you don't need probabilities you could use `clf.decision_function(x)` to get the signed distance to the hyperplane which can also be used as a confidence score. best, Peter 2011/12/5 xinfan meng : > Cool, Thanks! > > > On Mon, Dec 5, 2011 at 8:12 PM, Mathieu Blondel > wrote: >> >> On Mon, De
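
A small sketch of this suggestion, assuming a recent scikit-learn; the synthetic dataset and the choice of keeping the 10 most confident samples are placeholders for the example. Note that `decision_function` returns the raw score w·x + b, which is proportional to the signed distance to the hyperplane rather than a calibrated probability.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic binary problem, for illustration only.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

clf = LinearSVC(random_state=0).fit(X, y)

# One score per sample in the binary case; the sign selects the class.
scores = clf.decision_function(X)
labels = clf.predict(X)

# Use the magnitude of the score as an (uncalibrated) confidence.
confidence = np.abs(scores)

# Indices of the 10 most confidently classified samples,
# e.g. as candidates to label in a co-training loop.
most_confident = np.argsort(-confidence)[:10]
print(labels[most_confident])
```

For more than two classes, `decision_function` returns one column per class, so a confidence would have to be derived differently, for instance from the margin between the top two scores.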

Re: [Scikit-learn-general] Can linearSVM compute the confidence score of the class label ?

2011-12-05 Thread xinfan meng
Cool, Thanks! On Mon, Dec 5, 2011 at 8:12 PM, Mathieu Blondel wrote: > On Mon, Dec 5, 2011 at 9:05 PM, xinfan meng wrote: > > I understand that the LogisticRegression would be similar to LinearSVC in > > terms of performance. However, I am repeating other person's experiment. > > Still, Thank y

Re: [Scikit-learn-general] Can linearSVM compute the confidence score of the class label ?

2011-12-05 Thread Mathieu Blondel
On Mon, Dec 5, 2011 at 9:05 PM, xinfan meng wrote: > I understand that the LogisticRegression would be similar to LinearSVC in > terms of performance. However, I am  repeating other person's experiment. > Still, Thank you. Paolo Losi has some code that implements Platt's method (internally used b

Re: [Scikit-learn-general] Can linearSVM compute the confidence score of the class label ?

2011-12-05 Thread xinfan meng
I understand that the LogisticRegression would be similar to LinearSVC in terms of performance. However, I am repeating other person's experiment. Still, Thank you. On Mon, Dec 5, 2011 at 8:01 PM, Mathieu Blondel wrote: > On Mon, Dec 5, 2011 at 8:54 PM, xinfan meng wrote: > > Hi: > > I want

Re: [Scikit-learn-general] Can linearSVM compute the confidence score of the class label ?

2011-12-05 Thread Mathieu Blondel
On Mon, Dec 5, 2011 at 8:54 PM, xinfan meng wrote: > Hi: >     I want the classifier to output the class label and its confidence, in > order to use it in co-training. The predict_proba() in SVC classifier can > output a confidence. However, this classifier is a bit slow. Can I simulate > such con

[Scikit-learn-general] Can linearSVM compute the confidence score of the class label ?

2011-12-05 Thread xinfan meng
Hi: I want the classifier to output the class label and its confidence, in order to use it in co-training. The predict_proba() in SVC classifier can output a confidence. However, this classifier is a bit slow. Can I simulate such confidence score (not necessarily a probability) with LinearSVM? Th