Hi Olivier,

your reply is very informative (as always :-) ).
I've got a couple of questions for you. See below...

On Tue, Jan 24, 2012 at 1:57 PM, Olivier Grisel <[email protected]> wrote:
>
> If you can cheaply collect unsupervised data that looks similar to
> your training set (albeit without the labels and in much larger
> amount) it might be interesting to compute cluster centers using
> MinibatchKMeans and then project your data on the space using a non
> linear transform (e.g. a RBF kernel) and add this additional features
> to the original features (horizontal concatenation of the 2 datasets)
> and then fit the classifier with the labels on this.
>

So, once the unlabeled samples have been clustered,
I can add, as extra features on the labeled samples,
the similarity to each cluster center (e.g. computed
via an RBF kernel). Is that what you are suggesting?
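
To check that I read it correctly, here is a rough sketch of what I think you mean. The data is synthetic, and the number of clusters, the gamma value and the choice of LogisticRegression as the classifier are just my assumptions for illustration, not something you suggested:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.RandomState(0)

# Synthetic stand-ins (assumption): a large unlabeled pool and a
# small labeled set drawn from the same distribution.
X_unlabeled = rng.randn(2000, 5)
X_labeled = rng.randn(100, 5)
y_labeled = (X_labeled[:, 0] > 0).astype(int)

# 1. Learn cluster centers from the unlabeled data only.
#    n_clusters=10 is an arbitrary choice for this sketch.
kmeans = MiniBatchKMeans(n_clusters=10, n_init=3, random_state=0)
kmeans.fit(X_unlabeled)

# 2. Non-linear transform: RBF similarity between every labeled
#    sample and every cluster center (one new feature per center).
#    gamma=0.1 is an arbitrary choice as well.
rbf_features = rbf_kernel(X_labeled, kmeans.cluster_centers_, gamma=0.1)

# 3. Horizontal concatenation of original and new features.
X_augmented = np.hstack([X_labeled, rbf_features])

# 4. Fit the classifier with the labels on the augmented data.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_augmented, y_labeled)
```

At prediction time the same transform (steps 2 and 3, with the already fitted centers) would have to be applied to the new samples before calling clf.predict.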

Is that effective? Can you point to any paper discussing
the effectiveness of the approach?

I've never had a chance to master semi-supervised learning...
Any pointers on where to start would be much appreciated.

Thanks!

Paolo
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general