Re: [Scikit-learn-general] Shrunken Centroid Classifier

2012-03-12 Thread Olivier Grisel
On 12 March 2012 17:49, Robert Layton wrote: > > I'll work off that template, and when I work out the details of the > shrinking parameters (specifically which one is more widely used), I'll branch > and submit a PR. Great. I think the nearest centroid is a very nice baseline classifier for sanity c
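
For readers unfamiliar with the baseline mentioned above, here is a minimal sketch of a plain nearest-centroid classifier in pure Python. It is not the scikit-learn code discussed in the thread, and it leaves out shrinkage entirely, since the thread does not settle the shrinking parameters. The names `fit_centroids` and `predict_centroid` and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def fit_centroids(X, y):
    """Return {label: mean feature vector} for the training rows in X."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    # zip(*rows) iterates over feature columns; each centroid is the column mean.
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def predict_centroid(centroids, row):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Hypothetical toy data: two classes in two dimensions.
X = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
y = ["a", "a", "b", "b"]
centroids = fit_centroids(X, y)
```

Calling `predict_centroid(centroids, [1.0, 1.0])` picks the class whose mean vector lies closest to the query point. A shrunken variant would additionally pull each class centroid toward the overall centroid before this distance step.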

Re: [Scikit-learn-general] Shrunken Centroid Classifier

2012-03-12 Thread Robert Layton
On 13 March 2012 09:42, Olivier Grisel wrote: > On 11 March 2012 20:35, Robert Layton wrote: > > Hi All, > > > > On reading some research, it appears that the shrunken centroid > classifier > > is one of the better methods for authorship analysis. > > Therefore, I'm going to implement it and se

Re: [Scikit-learn-general] Shrunken Centroid Classifier

2012-03-12 Thread Olivier Grisel
On 11 March 2012 20:35, Robert Layton wrote: > Hi All, > > On reading some research, it appears that the shrunken centroid classifier > is one of the better methods for authorship analysis. > Therefore, I'm going to implement it and see if it really is, and I was > planning to add it to scikits.l

Re: [Scikit-learn-general] HMM Documentation and Development

2012-03-12 Thread Gael Varoquaux
On Wed, Mar 07, 2012 at 09:48:32AM +0100, Gael Varoquaux wrote: > There is a pull request that greatly improves the HMM implementation and > documentation: > https://github.com/scikit-learn/scikit-learn/pull/538 > It should be merged any time now. It is merged :). Speed of HMMs can probably be further

Re: [Scikit-learn-general] GSoc Idea

2012-03-12 Thread Lars Buitinck
2012/3/12 Vikram Kamath : > 1. Splits in CART are restricted to binary splits (a C4.5/C5.0 D-Tree is > m-ary) All our learners work on numeric data, meaning categorical data must be split into binary features according to a one-of-K representation prior to handing it to a learner. So unless you
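
The one-of-K representation mentioned above can be sketched as follows: each categorical value becomes a row of binary indicator features, one column per distinct category. This is a pure-Python illustration rather than any scikit-learn API, and the function name `one_of_k` and the colour data are hypothetical.

```python
def one_of_k(values):
    """Encode categorical values as one-of-K binary rows.

    Returns the sorted category list (the column order) and the encoded rows.
    """
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one indicator is set per row
        rows.append(row)
    return categories, rows


# Hypothetical categorical feature with three distinct values.
categories, rows = one_of_k(["red", "green", "red", "blue"])
```

After this encoding, a binary split on one indicator column corresponds to a "this category versus all others" test, which is how a binary-splitting tree would see a categorical feature.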

Re: [Scikit-learn-general] GSoc Idea

2012-03-12 Thread Vikram Kamath
Hi, This is in response to Peter's and Andreas's queries about the differences between CART and C4.5/C5.0. 1. Splits in CART are restricted to binary splits (a C4.5/C5.0 D-Tree is m-ary) 2. Differences between C4.5/C5.0 and CART include differences in: a. splitting criteria b. the pruning meth

Re: [Scikit-learn-general] Shrunken Centroid Classifier

2012-03-12 Thread Robert Layton
On 12 March 2012 19:30, Andreas wrote: > Hi Robert. > To me, this sounds somewhat like Linear Discriminant Analysis, or rather > Quadratic Discriminant Analysis (without the shrinking part). > > In these methods, a Gaussian is fitted to each class and classification > is done by finding

Re: [Scikit-learn-general] Shrunken Centroid Classifier

2012-03-12 Thread Andreas
Hi Robert. To me, this sounds somewhat like Linear Discriminant Analysis, or rather Quadratic Discriminant Analysis (without the shrinking part). In these methods, a Gaussian is fitted to each class and classification is done by finding the Gaussian that most likely created a data point. Thi
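
The idea described above, fitting one Gaussian per class and assigning a point to the Gaussian most likely to have generated it, can be sketched for a single feature as follows. This is a deliberately simplified one-dimensional illustration with equal class priors and a separate variance per class. It is not a full LDA or QDA implementation, and the names `fit_gaussians`, `log_likelihood`, and `predict_gaussian` are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussians(xs, ys):
    """Return {label: (mean, variance)} estimated from 1-D samples per class."""
    groups = defaultdict(list)
    for x, label in zip(xs, ys):
        groups[label].append(x)
    params = {}
    for label, vals in groups.items():
        mean = sum(vals) / len(vals)
        # Maximum-likelihood variance; assumes each class has some spread (var > 0).
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        params[label] = (mean, var)
    return params


def log_likelihood(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict_gaussian(params, x):
    """Return the label whose fitted Gaussian gives x the highest log-likelihood."""
    return max(params, key=lambda label: log_likelihood(x, *params[label]))


# Hypothetical toy data: one feature, two classes.
xs = [1.0, 1.2, 0.8, 3.0, 3.3, 2.7]
ys = ["a", "a", "a", "b", "b", "b"]
params = fit_gaussians(xs, ys)
```

If every class were forced to share one variance and priors were equal, the rule would reduce to picking the nearest class mean, which is the link to the centroid-based classifier discussed in this thread.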