On 13 March 2012 09:42, Olivier Grisel <[email protected]> wrote:

> On 11 March 2012 20:35, Robert Layton <[email protected]> wrote:
> > Hi All,
> >
> > From reading some research, it appears that the shrunken centroid
> > classifier is one of the better methods for authorship analysis.
> > Therefore, I'm going to implement it and see if it really is, and I was
> > planning to add it to scikits.learn.
> >
> > Before I start, I wanted to make sure it wasn't already in scikits.learn
> > under a different name (as I don't do much classification, I am not sure).
> > The method is basically like k-means clustering:
> > training: each class is represented by its centroid
> > testing: instances are assigned to the nearest centroid.
>
> I have it in a branch:
>
> https://github.com/ogrisel/scikit-learn/tree/nearest-centroid
>
> There are no tests and no docs. It works quite well on the Olivetti faces
> but very badly on the 20 newsgroups text data, which is unexpected since
> k-means is able to cluster the text data quite well. Investigating why it
> does badly on high-dimensional sparse data may help us better understand
> the nature of text data.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
>
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>


Thanks Olivier,

I'll work off that template, and once I work out the details of the
shrinking parameters (specifically which one is more commonly used), I'll
branch and submit a PR.
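
For reference, the basic method described in the quoted text (one centroid
per class at training time, nearest centroid at test time) can be sketched
roughly as below. This is only a minimal illustration, assuming dense NumPy
arrays, Euclidean distance, and no shrinkage at all; the function names
fit_centroids and predict_nearest_centroid are hypothetical and are not
taken from the nearest-centroid branch.

```python
import numpy as np


def fit_centroids(X, y):
    """Training: represent each class by the mean of its samples."""
    classes = np.unique(y)  # sorted unique class labels
    centroids = np.vstack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict_nearest_centroid(X, classes, centroids):
    """Testing: assign each sample to the class of its nearest centroid."""
    # Squared Euclidean distance from every sample to every centroid,
    # giving an array of shape (n_samples, n_classes).
    diffs = X[:, np.newaxis, :] - centroids[np.newaxis, :, :]
    dists = (diffs ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]


# Toy data (made up for illustration): two well-separated classes.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
y_train = np.array([0, 0, 1, 1])

classes, centroids = fit_centroids(X_train, y_train)
X_test = np.array([[0.5, 0.5], [5.5, 4.0]])
pred = predict_nearest_centroid(X_test, classes, centroids)
```

The shrunken variant would additionally pull each class centroid towards
the overall centroid before the distance step; that part is deliberately
left out here since the choice of shrinking parameters is still open.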

- Robert

-- 

Public key at: http://pgp.mit.edu/ Search for this email address and select
the key from "2011-08-19" (key id: 54BA8735)