"Need" to be scaled sounds a little bit strong ;) -- feature scaling is really context-dependend. If you are using stochastic gradient descent of gradient descent you surely want to standardize your data or at least center it for technical reasons and convergence. However, in naive Bayes, you just estimate the parameters e.g., via MLE so that there is no technical advantage of feature scaling, however, the results will be different with and without scaling.
> On Jun 5, 2015, at 1:03 PM, Andreas Mueller <t3k...@gmail.com> wrote:
>
> The result of scaled and non-scaled data will be different because the
> regularization will have a different effect.
>
> On 06/05/2015 03:10 AM, Yury Zhauniarovich wrote:
>> Thank you all! However, what Sturla wrote is beyond my understanding.
>>
>> One more question. It also seems to me that Naive Bayes classifiers do
>> not need the data to be scaled. Am I correct?
>>
>> Best Regards,
>> Yury Zhauniarovich
>>
>> On 4 June 2015 at 20:55, Sturla Molden <sturla.mol...@gmail.com> wrote:
>> On 04/06/15 20:38, Sturla Molden wrote:
>>
>> > Component-wise EM (aka CEM2) is a better way of avoiding the singularity
>> > disease, though.
>>
>> The traditional EM for a GMM proceeds like this:
>>
>>     while True:
>>         global_estep(clusters)
>>         for c in clusters:
>>             mstep(c)
>>
>> This is inherently unstable. Several clusters can become
>> near-singular in the M-step before there is an E-step
>> to redistribute the weights. You can get a "cascade of
>> singularities" where the whole GMM basically dies. Even
>> if you bias the diagonal of the covariance you still
>> have the basic algorithmic problem.
>>
>> CEM2 proceeds like this:
>>
>>     while True:
>>         for c in clusters:
>>             estep(c)
>>             mstep(c)
>>
>> This improves stability enormously. When a cluster becomes
>> singular, the memberships are immediately redistributed.
>> Therefore you will not get a "cascade of singularities"
>> where the whole GMM basically dies.
>>
>> Sturla
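For anyone who wants to experiment with the stability difference Sturla describes, below is a minimal 1-D NumPy sketch of the CEM2 loop structure -- the toy data, the function name, and the variance floor are my own invention, not from any library; the only point is that the E-step runs again before each component's M-step:

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    # Toy 1-D data drawn from two Gaussians.
    x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 0.5, 200)])

    K = 2
    w = np.full(K, 1.0 / K)       # mixing weights
    mu = rng.choice(x, size=K)    # initial means picked from the data
    var = np.full(K, x.var())     # initial variances

    def responsibilities():
        # E-step: posterior probability of each component for each point.
        p = np.stack([w[k] * norm.pdf(x, mu[k], np.sqrt(var[k]))
                      for k in range(K)])
        return p / p.sum(axis=0)

    for _ in range(50):
        # CEM2: rerun the E-step before each component's M-step, so a
        # collapsing component sheds its weight before the next update.
        for k in range(K):
            r = responsibilities()[k]
            nk = r.sum()
            w[k] = nk / len(x)
            mu[k] = (r @ x) / nk
            var[k] = (r @ (x - mu[k]) ** 2) / nk + 1e-6  # variance floor

    print("weights:", w, "means:", mu, "variances:", var)

In the traditional scheme the E-step would sit outside the inner loop over components, so every component is updated against stale responsibilities; moving it inside the loop is the whole change.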