> Considering the final score, e.g., accuracy, does this mean that with scaling
> and without I will get different results for NB and KNN?
Yes. I think it would really help you to read a little bit about how those algorithms work -- to develop an intuition for how feature scaling affects the outcome, and why it doesn't matter for decision trees.

> With gradient descent algorithms it is clear why I need to scale the features
> (because, as you wrote, for convergence). The question is whether there are
> similar reasons to scale features for other algorithms (like I said, KNN, NB
> or SVM)?

About SVM & feature scaling: in the linear case it is, roughly speaking, the same as logistic regression but minimizing a different cost function (the hinge loss). I have an example here for Adaline (adaptive linear neurons) that illustrates the effect of standardization a little bit, if it helps:
http://sebastianraschka.com/Articles/2015_singlelayer_neurons.html#The-Gradient-Descent-Rule-in-Action

Lastly, there is no general rule that feature scaling *always* "improves" predictive performance. You really need to think about it in the context of the problem you want to solve and the model you are going to use.
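As a rough sketch of that last point: the snippet below compares KNN with and without standardization. The Wine data, the train/test split, and n_neighbors=5 are arbitrary choices for illustration, and a reasonably recent scikit-learn is assumed.

    from sklearn.datasets import load_wine
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_wine(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # KNN on the raw features: the Euclidean distances are dominated by the
    # features with the largest numeric ranges.
    raw = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

    # The same model after standardization: every feature contributes on a
    # comparable scale, which changes the neighborhoods and therefore the
    # accuracy you measure.
    scaled = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
    scaled.fit(X_train, y_train)

    print("raw accuracy:   ", raw.score(X_test, y_test))
    print("scaled accuracy:", scaled.score(X_test, y_test))

Swapping a decision tree in for the KNN classifier would give the same score in both cases, since axis-aligned splits are invariant to per-feature rescaling.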
You can get a "cascade of >>> singularities" where the whole GMM basically dies. Even >>> if you bias the diagonal of the covariance you still >>> have the basic algorithmic problem. >>> >>> CEM2 proceeds like this: >>> >>> while True: >>> for c in clusters: >>> estep(c) >>> mstep(c) >>> >>> This improves stability enormously. When a cluster becomes >>> singular, the memberships are immediately redistributed. >>> Therefore you will not get a "cascade of singularities" >>> where the whole GMM basically dies. >>> >>> >>> Sturla >>> >>> >>> ------------------------------------------------------------------------------ >>> _______________________________________________ >>> Scikit-learn-general mailing list >>> Scikit-learn-general@lists.sourceforge.net >>> <mailto:Scikit-learn-general@lists.sourceforge.net> >>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general >>> <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general> >>> >>> >>> >>> ------------------------------------------------------------------------------ >>> >>> >>> _______________________________________________ >>> Scikit-learn-general mailing list >>> Scikit-learn-general@lists.sourceforge.net >>> <mailto:Scikit-learn-general@lists.sourceforge.net> >>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general >>> <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general> >> >> ------------------------------------------------------------------------------ >> _______________________________________________ >> Scikit-learn-general mailing list >> Scikit-learn-general@lists.sourceforge.net >> <mailto:Scikit-learn-general@lists.sourceforge.net> >> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general >> <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general> > > > ------------------------------------------------------------------------------ > > _______________________________________________ > Scikit-learn-general mailing list > Scikit-learn-general@lists.sourceforge.net > <mailto:Scikit-learn-general@lists.sourceforge.net> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general > <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general> > > > ------------------------------------------------------------------------------ > _______________________________________________ > Scikit-learn-general mailing list > Scikit-learn-general@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general