On 02/03/2014 11:01 AM, Lars Buitinck wrote:
> 2014-02-02 Andy <t3k...@gmail.com>:
>> Now, with respect to sinning: there is really no additional information
>> in the labels that could be used during learning.
> Actually there is: the presence of classes outside the training set
> affects probability distributions. Lidstone-smoothed multinomial and
> Bernoulli naive Bayes, as well as all (other) variants of
> logistic/softmax regression, never output zero probabilities, so they
> must assign some fraction of probability mass to the unseen classes
> (without affecting the Bayes optimal decision, so predict output is
> unchanged). For closed-form and zero-initialized models, the
> distribution over the unseen classes will be uniform, but I'm not sure
> how neural nets will fare, since those are initialized randomly.
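For concreteness, the effect Lars describes can be seen with MultinomialNB via partial_fit. Here is a minimal sketch; the toy counts and the fit_prior=False setting are my own choices, since with the default fit_prior=True the unseen class would get a zero estimated prior, which would mask the smoothing effect:

import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy count data: only labels 0 and 1 occur, but class 2 is declared below.
X = np.array([[3, 0, 1],
              [2, 1, 0],
              [0, 4, 1],
              [1, 3, 0]])
y = np.array([0, 0, 1, 1])

# fit_prior=False keeps the class prior uniform; with the default
# fit_prior=True the never-seen class would get a zero estimated prior.
clf = MultinomialNB(alpha=1.0, fit_prior=False)
clf.partial_fit(X, y, classes=[0, 1, 2])

x_new = np.array([[2, 1, 1]])
# Lidstone smoothing gives class 2 a uniform feature distribution,
# so its predicted probability is strictly positive...
print(clf.predict_proba(x_new))
# ...while here the argmax is still one of the seen classes.
print(clf.predict(x_new))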
I agree that the probability predicted by the model for an unseen class would not be zero; that is not what I meant to claim. What I tried to say is that the fitted model will be essentially the same: there will be additional columns of zeros in the weight matrix, plus a bias term for the unseen class that depends on the regularization. [I think this will even be the case for the neural net.]
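If someone wants to poke at this claim empirically, a rough sketch along these lines would do. SGDClassifier with logistic loss stands in for the linear model here (note that scikit-learn fits it one-vs-rest rather than as a true softmax), and the toy data is invented:

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
X = rng.randn(200, 4)
y = (X[:, 0] > 0).astype(int)  # only labels 0 and 1 are ever observed

# Same data, fit once without and once with the extra class declared.
# (loss="log_loss" is spelled loss="log" in older scikit-learn releases.)
a = SGDClassifier(loss="log_loss", random_state=0).partial_fit(X, y, classes=[0, 1])
b = SGDClassifier(loss="log_loss", random_state=0).partial_fit(X, y, classes=[0, 1, 2])

print(a.coef_, a.intercept_)  # binary fit: class 1 vs class 0
# One row per class below; the class-1 row is fit on the same binary
# problem, so it should come out close to a.coef_, while the class-2 row
# only ever sees negative examples plus the regularization penalty.
print(b.coef_, b.intercept_)
print(b.predict_proba(X[:3]))  # class 2 still receives nonzero mass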