Re: [Scikit-learn-general] Naive Bayes Sparse Implementation

2012-01-19 Thread Lars Buitinck
2012/1/19 ert > I wanted to train a multinomialnb classifier on a training set containing > 18k features each containing about 1.8M examples .. Unfortunately I do not > have enough memory ( I have 4G) on my system to create such an array...Is > there a sparse implementation of Naive Bayes which c

[Scikit-learn-general] Naive Bayes Sparse Implementation

2012-01-19 Thread ert
Hi, I wanted to train a multinomialnb classifier on a training set containing 18k features each containing about 1.8M examples .. Unfortunately I do not have enough memory ( I have 4G) on my system to create such an array...Is there a sparse implementation of Naive Bayes which can be used? or i
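For scale: a dense float64 array of 1.8M rows by 18k columns would hold about 3.24e10 entries, roughly 259 GB, so a 4 GB machine cannot hold it. The following is a minimal sketch, assuming a recent scikit-learn release in which `MultinomialNB` accepts `scipy.sparse` input and offers `partial_fit`. The data is random and the sample count is scaled down to 1000; neither comes from the original post.

```python
import numpy as np
from scipy import sparse
from sklearn.naive_bayes import MultinomialNB

rng = np.random.RandomState(0)
n_samples, n_features = 1000, 18000  # sample count scaled down for illustration

# Synthetic sparse count matrix in CSR format (only non-zeros are stored).
X = sparse.random(n_samples, n_features, density=0.001,
                  format="csr", random_state=rng)
X.data = np.ceil(X.data * 5)  # turn uniform values into small positive counts
y = rng.randint(0, 2, size=n_samples)

# Fit directly on the sparse matrix; no dense copy is requested here.
clf = MultinomialNB()
clf.fit(X, y)
pred = clf.predict(X[:5])

# Alternative: feed the data in batches, so the full matrix never has to be
# in memory at once (the batch size of 250 is an arbitrary choice).
clf_inc = MultinomialNB()
for start in range(0, n_samples, 250):
    stop = start + 250
    clf_inc.partial_fit(X[start:stop], y[start:stop], classes=[0, 1])
```

Whether the batch route is needed depends on how sparse the real data is; if the non-zero entries alone fit in memory, a single `fit` call on a CSR matrix may be enough.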

Re: [Scikit-learn-general] Sparse Matrices and Classifiers

2012-01-19 Thread Lars Buitinck
2012/1/19 Kenneth C. Arnold : > As an aside to those who use scipy's sparse matrices: do you find it > troublesome that scipy's sparse things behave like matrices instead of > like ndarrays? If dense matrices are a thin wrapper around dense > ndarrays, shouldn't sparse matrices be a thin wrapper ar
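The matrix-versus-ndarray difference raised here is easiest to see with the `*` operator. A small sketch, assuming scipy's `csr_matrix` class (the 2x2 values are made up for illustration):

```python
import numpy as np
from scipy import sparse

dense = np.array([[1, 2], [3, 4]])
A = sparse.csr_matrix(dense)

# On a scipy sparse matrix, `*` means matrix multiplication ...
matrix_product = (A * A).toarray()

# ... so an element-wise product has to be requested explicitly.
elementwise = A.multiply(A).toarray()

# On a plain ndarray, `*` is element-wise.
ndarray_product = dense * dense
```

Code that is written for ndarrays and silently receives a sparse matrix can therefore compute a different quantity without raising any error.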

Re: [Scikit-learn-general] [off-topic] scipy sparse library alternatives

2012-01-19 Thread Lars Buitinck
2012/1/19 Kenneth C. Arnold : > Divisi[1] uses PySparse[2,3]. > > [1] https://github.com/commonsense/divisi2 > [2] http://pysparse.sourceforge.net/ > [3] https://github.com/rspeer/csc-pysparse No first hand experience here, but I believe the (deprecated) scipy.maxent supports that as well. -- La

Re: [Scikit-learn-general] [off-topic] scipy sparse library alternatives

2012-01-19 Thread Kenneth C. Arnold
On Thu, Jan 19, 2012 at 11:03 AM, Satrajit Ghosh wrote: > in one of my projects i use the scipy sparse library for turning a graph > into a sparse dependency matrix and then manipulating this matrix > (adding/subtracting columns/rows, setting elements to 0, ...). this is the > only reason i have s

[Scikit-learn-general] [off-topic] scipy sparse library alternatives

2012-01-19 Thread Satrajit Ghosh
hi all, in one of my projects i use the scipy sparse library for turning a graph into a sparse dependency matrix and then manipulating this matrix (adding/subtracting columns/rows, setting elements to 0, ...). this is the only reason i have scipy as a dependency and would like to avoid it. are the

Re: [Scikit-learn-general] Sparse Matrices and Classifiers

2012-01-19 Thread Alexandre Passos
On Thu, Jan 19, 2012 at 10:48, Kenneth C. Arnold wrote: > On Thu, Jan 19, 2012 at 3:05 AM, Olivier Grisel > wrote: >> Rather than improving the error message when passing sparse arrays to >> the dense impl of SVC we should refactor SVC to accept both dense and >> sparse representation and use the

Re: [Scikit-learn-general] Sparse Matrices and Classifiers

2012-01-19 Thread Kenneth C. Arnold
On Thu, Jan 19, 2012 at 3:05 AM, Olivier Grisel wrote: > Rather than improving the error message when passing sparse arrays to > the dense impl of SVC we should refactor SVC to accept both dense and > sparse representation and use the right wrapper as already done for > SGD, LinearSVC, LogisticReg

Re: [Scikit-learn-general] Sparse Matrices and Classifiers

2012-01-19 Thread Lars Buitinck
2012/1/19 Andreas : > I'll gladly review your pull request ;) +1 :) -- Lars Buitinck Scientific programmer, ILPS University of Amsterdam

Re: [Scikit-learn-general] Sparse Matrices and Classifiers

2012-01-19 Thread Andreas
On 01/19/2012 09:05 AM, Olivier Grisel wrote: > 2012/1/19 Gael Varoquaux: > >> On Thu, Jan 19, 2012 at 12:13:38PM +0900, Mathieu Blondel wrote: >> >>> Since your data is sparse, you need to use svm.sparse.SVC, not svm.SVC. >>> >> Those error messages are really not enlightening. Ma

Re: [Scikit-learn-general] warm-start pull-request

2012-01-19 Thread Mathieu Blondel
2012/1/19 Stéfan van der Walt : > Talking of which, I see in the current docs that coefficients can be > specified to initialise, e.g., Lasso [1], but in the development > version that is no longer possible.  What is the new suggested way of > doing warm starts? Warm start is implemented in the d
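As an illustration of what a warm start looks like in use, here is a minimal sketch that assumes the `warm_start` constructor parameter of `ElasticNet` as found in recent scikit-learn releases; it is not necessarily the exact interface of the pull request discussed in this thread. The data, the alpha grid, and the variable names are made up.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.RandomState(0)
X = rng.randn(100, 20)
true_coef = np.zeros(20)
true_coef[:3] = [2.0, -1.0, 0.5]
y = X @ true_coef + 0.01 * rng.randn(100)

# With warm_start=True, each call to fit() starts from the coefficients
# left by the previous call instead of starting from zeros.
model = ElasticNet(alpha=1.0, l1_ratio=0.5, warm_start=True)

coefs = []
for alpha in [1.0, 0.1, 0.01]:  # decreasing regularization path
    model.set_params(alpha=alpha)
    model.fit(X, y)
    coefs.append(model.coef_.copy())
```

Walking a regularization path from strong to weak penalties is the typical case where reusing the previous solution can save iterations.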

Re: [Scikit-learn-general] warm-start pull-request

2012-01-19 Thread Stéfan van der Walt
On Thu, Jan 19, 2012 at 12:44 AM, Mathieu Blondel wrote: > Here's a pull-request implementing more convenient warm-start in SGD > and ElasticNet: > > https://github.com/scikit-learn/scikit-learn/pull/568 Talking of which, I see in the current docs that coefficients can be specified to initialis

[Scikit-learn-general] warm-start pull-request

2012-01-19 Thread Mathieu Blondel
Here's a pull-request implementing more convenient warm-start in SGD and ElasticNet: https://github.com/scikit-learn/scikit-learn/pull/568 Comments welcome! Mathieu

Re: [Scikit-learn-general] Factorial analysis

2012-01-19 Thread Gael Varoquaux
To be pragmatic I would: go for MDP. Gael - Original message - > > > > I don't have the Bishop, and I must confess that I am still confused by > > the Wikipedia. That said, it doesn't really matter. As long as people > > feel confident that it is well defined and useful, it belongs to th

Re: [Scikit-learn-general] Factorial analysis

2012-01-19 Thread Olivier Grisel
2012/1/19 Joris A. : > > In the meantime, any good libs to recommend? MDP? Yes, have a look at MDP and maybe also scikits.statsmodels, which is focused more on classical statistics for finance and economics than scikit-learn. -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel

[Scikit-learn-general] Factorial analysis

2012-01-19 Thread Joris A.
> > I don't have the Bishop, and I must confess that I am still confused by > the Wikipedia. That said, it doesn't really matter. As long as people > feel confident that it is well defined and useful, it belongs to the > scikit, and I am all for it :). > > Gael > > > Thank you all! Let's hope it w

Re: [Scikit-learn-general] Factorial analysis

2012-01-19 Thread Olivier Grisel
2012/1/19 Alexandre Gramfort : > +1 for FA. It's standard and indeed very similar to ProbabilisticPCA > that we have. > > Now we need a volunteer :) Is there a way to implement it in a scalable way (w.r.t n_samples and n_features and n_factors / n_components)? Because if it fallbacks to the defau

Re: [Scikit-learn-general] GSoC 2012

2012-01-19 Thread Vincent Michel
Hi list, I'm more than +1 for online learning, it could be a killer feature of the scikit! I also like the first suggestion of Andreas, about Multinomial Logistic regression. I think there is interesting work to do in the junction with Bayesian statistics and priors. Vincent 2012/1/19 Alex

Re: [Scikit-learn-general] Factorial analysis

2012-01-19 Thread Alexandre Gramfort
+1 for FA. It's standard and indeed very similar to ProbabilisticPCA that we have. Now we need a volunteer :) Alex On Thu, Jan 19, 2012 at 7:31 AM, Gael Varoquaux wrote: > On Wed, Jan 18, 2012 at 11:53:01PM +0100, Andreas wrote: >> > Factor analysis is a decomposition with a particular >> > ass

Re: [Scikit-learn-general] Sparse Matrices and Classifiers

2012-01-19 Thread Olivier Grisel
2012/1/19 Gael Varoquaux : > On Thu, Jan 19, 2012 at 12:13:38PM +0900, Mathieu Blondel wrote: >> Since your data is sparse, you need to use svm.sparse.SVC, not svm.SVC. > > Those error messages are really not enlightening. Mathieu, you were saying > in the thread about GSOC that sparse functionality
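For readers on a recent scikit-learn release: the split between `svm.SVC` and `svm.sparse.SVC` described in this thread reflects the library as of early 2012. The sketch below assumes a current release, in which a single `SVC` class accepts both dense arrays and `scipy.sparse` matrices; the toy data is made up.

```python
import numpy as np
from scipy import sparse
from sklearn.svm import SVC

# Tiny linearly separable toy problem (illustrative values only).
X_dense = np.array([[0.0, 1.0], [1.0, 0.0], [0.0, 2.0], [2.0, 0.0]])
y = np.array([0, 1, 0, 1])
X_sparse = sparse.csr_matrix(X_dense)

# The same estimator class is fitted on either representation.
clf_dense = SVC(kernel="linear").fit(X_dense, y)
clf_sparse = SVC(kernel="linear").fit(X_sparse, y)

pred_dense = clf_dense.predict(X_dense)
pred_sparse = clf_sparse.predict(X_sparse)
```

On the release discussed in the thread, the advice quoted above (use the sparse variant for sparse data) still applies.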

Re: [Scikit-learn-general] GSoC 2012

2012-01-19 Thread Alexandre Gramfort
i've created the wiki page to organize what was suggested and so people can volunteer for mentoring. https://github.com/scikit-learn/scikit-learn/wiki/A-list-of-topics-for-a-google-summer-of-code-%28gsoc%29-2012 Alex On Thu, Jan 19, 2012 at 8:38 AM, Peter Prettenhofer wrote: [..] > - S