2014-02-02 Andy <t3k...@gmail.com>:
> Now, with respect to sinning: there is really no additional information
> in the labels that could be used during learning.
Actually there is: the presence of classes outside the training set affects
the predicted probability distributions. Lidstone-smoothed multinomial and
Bernoulli naive Bayes, as well as all (other) variants of logistic/softmax
regression, never output zero probabilities, so they must assign some
fraction of the probability mass to the unseen classes (without affecting
the Bayes-optimal decision, so predict output is unchanged). For closed-form
and zero-initialized models, the distribution over the unseen classes will
be uniform, but I'm not sure how neural nets will fare, since those are
initialized randomly.

> The only case when that could be important is if the labels have some
> meaningful labeling and it is important to know the position of the
> labels with respect to the previous ones.
> But that is somewhat of a weird thing to encode here anyhow.

... because we don't support sequence/structured stuff.

Is the original problem related to evaluation?
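A minimal sketch of the naive Bayes part of the claim, written by hand in
NumPy rather than through scikit-learn (and using a smoothed class prior,
which is an assumption, not what MultinomialNB does by default): class 2
appears in the label set but never in y, yet Lidstone smoothing leaves it
with a strictly positive, feature-uniform likelihood, and the argmax
decision is still carried by the seen classes.

```python
import numpy as np

# Hypothetical toy data: label set {0, 1, 2}, but class 2 never occurs in y.
alpha = 1.0
classes = np.array([0, 1, 2])
X = np.array([[3, 0, 1],
              [2, 1, 0],
              [0, 4, 1],
              [1, 3, 0]])
y = np.array([0, 0, 1, 1])

n_classes, n_features = len(classes), X.shape[1]

# Per-class feature counts; the unseen class contributes all zeros.
feature_count = np.zeros((n_classes, n_features))
for c in classes:
    feature_count[c] = X[y == c].sum(axis=0)

# Lidstone smoothing: the unseen class falls back to the uniform
# distribution alpha / (alpha * n_features) over features.
smoothed = feature_count + alpha
feature_log_prob = np.log(smoothed) - np.log(smoothed.sum(axis=1, keepdims=True))

# Smoothed class prior (an assumption for this sketch), so the prior for
# the unseen class is nonzero as well.
class_count = np.array([(y == c).sum() for c in classes])
class_log_prior = np.log(class_count + alpha) - np.log(class_count.sum() + alpha * n_classes)

x_new = np.array([2, 1, 1])
joint = class_log_prior + x_new @ feature_log_prob.T
proba = np.exp(joint - joint.max())
proba /= proba.sum()

print(proba)  # class 2 gets a small but strictly positive probability
```

The argmax is still one of the seen classes, so predict is unaffected; only
predict_proba shows the mass leaked to class 2.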