Re: [Scikit-learn-general] Merging in label propagation

Mathieu Blondel Thu, 02 Feb 2012 03:30:43 -0800

On Thu, Feb 2, 2012 at 8:15 PM, Olivier Grisel <[email protected]> wrote:


> I wonder which representation is the nicest for the end user? It might
> be the case that keeping the unlabeled data as a separate variable
> might be more natural but that will probably impact pipeline-ability
> and cross-validation since X_unlabeled.shape[0] won't be the same as
> X_labeled.shape[0] and y_labeled.shape[0].

Cross-validation will probably break anyway, as the unlabeled examples
cannot be used in the test set. This also shows that we should
probably have a library-wide default encoding for unlabeled data (this
way, we will be able to make sure that all the unlabeled data goes to
the training set).
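
A rough sketch of what that could look like, assuming unlabeled samples
are marked with -1 in y (a placeholder choice, not a settled convention).
The helper semi_supervised_folds is hypothetical and not an existing
scikit-learn function; it only illustrates keeping every unlabeled
sample on the training side of each split:

```python
UNLABELED = -1  # assumed marker for "no label"; not an agreed library convention


def semi_supervised_folds(y, n_folds=3):
    """Yield (train, test) index lists; unlabeled samples never reach the test set."""
    labeled = [i for i, label in enumerate(y) if label != UNLABELED]
    unlabeled = [i for i, label in enumerate(y) if label == UNLABELED]
    for k in range(n_folds):
        # Hold out every n_folds-th labeled sample, starting at offset k.
        test = labeled[k::n_folds]
        held_out = set(test)
        # The remaining labeled samples plus all unlabeled ones form the training set.
        train = [i for i in labeled if i not in held_out] + unlabeled
        yield train, test


y = [0, 1, -1, 0, -1, 1, 0, -1, 1]
for train, test in semi_supervised_folds(y, n_folds=3):
    print("train:", train, "test:", test)
```

With a shared marker like this, any cross-validation iterator could apply
the same filtering without each estimator inventing its own encoding.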

Keeping the label propagation and semi-supervised NB PRs on hold
forever doesn't help. We should merge them and keep in mind that their
API is a work-in-progress.

Mathieu

