On 05/06/15 19:03, Andreas Mueller wrote:
> The result of scaled and non-scaled data will be different because the
> regularization will have a different effect.
Oh, so you regularize with lambda*I instead of lambda*diag(Sigma)?
Sturla
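For readers following along, a minimal NumPy sketch (my illustration, not from the thread; the variable names and toy data are mine) of the two regularizers Sturla is asking about, shrinking an empirical covariance towards the identity versus towards its own diagonal:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3)) * np.array([1.0, 10.0, 100.0])  # mixed units
    Sigma = np.cov(X, rowvar=False)
    lam = 0.1

    # lambda * I: the same ridge on every feature, so its effect depends
    # on the (arbitrary) units each feature happens to be measured in.
    Sigma_ridge = Sigma + lam * np.eye(Sigma.shape[0])

    # lambda * diag(Sigma): shrinkage proportional to each feature's own
    # variance, which is invariant to per-feature rescaling.
    Sigma_diag = Sigma + lam * np.diag(np.diag(Sigma))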
---
> Considering the final score, e.g., accuracy, does this mean that with scaling
> and without it I will get different results for NB and KNN?
Yes. I think it would really help you to read a little bit about how those
algorithms work -- to develop an intuition for how feature scaling affects the
outcome.
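A quick sketch of why the answer is "yes" for KNN (my example, not from the thread): the Euclidean neighborhoods, and hence the predictions, change when features are rescaled.

    from sklearn.datasets import load_wine
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.preprocessing import StandardScaler

    X, y = load_wine(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    knn_raw = KNeighborsClassifier().fit(X_train, y_train)
    print("raw:   ", knn_raw.score(X_test, y_test))

    scaler = StandardScaler().fit(X_train)  # fit on training data only
    knn_scaled = KNeighborsClassifier().fit(scaler.transform(X_train), y_train)
    print("scaled:", knn_scaled.score(scaler.transform(X_test), y_test))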
Yes, Andreas. Thank you, just wanted to clarify.
Thank you all for your help and sorry for some silly questions!
Best Regards,
Yury Zhauniarovich
On 5 June 2015 at 20:24, Andreas Mueller wrote:
> Have you read my earlier email explaining just that?
>
> > Tree-based methods are the only ones that are invariant towards feature
> > scaling ...
Have you read my earlier email explaining just that?
> Tree-based methods are the only ones that are invariant towards
> feature scaling, so DecisionTree*, RandomForest*, ExtraTrees*, Bagging*
> (with trees), GradientBoosting* (with trees).
> For all other algorithms, the outcome will be different depending on
> whether you scale your data or not.
Thank you, Sebastian. This is what I want to understand. Considering the
final score, e.g., accuracy, does this mean that with scaling and without it I
will get different results for NB and KNN? Or will the results be the same, as
in the case of decision trees?
With gradient descent algorithms it is clear ...
"Need" to be scaled sounds a little bit strong ;) -- feature scaling is really
context-dependent. If you are using stochastic gradient descent or gradient
descent you surely want to standardize your data or at least center it for
technical reasons and convergence. However, in naive Bayes, you just ...
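A minimal sketch of the standardize-then-fit setup Sebastian describes (my example, not from the original message): scikit-learn's Pipeline keeps the scaler and the gradient-descent learner together, so scaling statistics come from training data only.

    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)
    # Scaler and classifier in one Pipeline, so each CV fold standardizes
    # using statistics from its own training split only.
    clf = make_pipeline(StandardScaler(), SGDClassifier(random_state=0))
    print(cross_val_score(clf, X, y).mean())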
The result of scaled and non-scaled data will be different because the
regularization will have a different effect.
On 06/05/2015 03:10 AM, Yury Zhauniarovich wrote:
> Thank you all! However, what Sturla wrote is now beyond my understanding.
> One more question. It also seems to me that Naive Bayes classifiers do not
> need data to be scaled. Am I correct?
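To see Andreas's point in code, a small sketch (my example, not from the thread) with an L2-penalized model: the penalty treats all coefficients alike, so rescaling the features changes how strongly each coefficient is effectively regularized, and the fitted models genuinely differ.

    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)
    Xs = StandardScaler().fit_transform(X)

    raw = LogisticRegression(max_iter=5000).fit(X, y)
    scaled = LogisticRegression(max_iter=5000).fit(Xs, y)
    # Not a reparametrization of the same model: the penalty weighs the
    # coefficients differently in the two feature spaces.
    print("raw:   ", raw.score(X, y))
    print("scaled:", scaled.score(Xs, y))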
Thank you all! However, what Sturla wrote is now beyond my understanding.
One more question. It also seems to me that Naive Bayes classifiers do not
need data to be scaled. Am I correct?
Best Regards,
Yury Zhauniarovich
On 4 June 2015 at 20:55, Sturla Molden wrote:
> On 04/06/15 20:38, Sturla Molden wrote: ...
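Yury's guess about naive Bayes is easy to test (my sketch, not from the thread): GaussianNB fits a per-feature mean and variance for each class, so a linear rescaling of the data moves both and leaves the predicted classes unchanged (exactly so for a uniform rescaling; per-feature rescalings agree up to the tiny var_smoothing term).

    import numpy as np
    from sklearn.datasets import load_wine
    from sklearn.naive_bayes import GaussianNB

    X, y = load_wine(return_X_y=True)
    Xs = X * 1000.0  # uniform linear rescaling of every feature

    pred_raw = GaussianNB().fit(X, y).predict(X)
    pred_scaled = GaussianNB().fit(Xs, y).predict(Xs)
    print((pred_raw == pred_scaled).mean())  # expect 1.0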
On 04/06/15 20:38, Sturla Molden wrote:
> Component-wise EM (aka CEM2) is a better way of avoiding the singularity
> disease, though.
The traditional EM for a GMM proceeds like this:

    while True:
        global_estep(clusters)   # E-step: recompute all responsibilities
        for c in clusters:
            mstep(c)             # M-step: update every component in one sweep

This is inherently unstable. ...
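For readers who want the contrast: in component-wise EM (CEM2, as introduced, I believe, by Celeux et al.), the E-step is interleaved with the per-component M-steps, so each component update works from fresh responsibilities. A sketch in the same pseudocode style (my rendering of the idea, not Sturla's code):

    while True:
        for c in clusters:
            global_estep(clusters)   # refresh responsibilities first,
            mstep(c)                 # then update only this one component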
Apropos Mahalanobis distance: To get back to the initial question about scale
invariant classifiers ...
As Andreas already pointed out: The tree-based methods are scale invariant.
But under certain circumstances, you can also add Nearest Neighbor classifiers
and kernel methods such as kernel SVMs ...
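To make the nearest-neighbor case concrete, a small sketch (my example, using scikit-learn's built-in Mahalanobis metric; the helper name is mine): because the covariance is estimated from the data itself, it absorbs any per-feature rescaling, and the neighbor sets come out the same.

    import numpy as np
    from sklearn.datasets import load_wine
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_wine(return_X_y=True)
    Xs = X * np.linspace(0.1, 10.0, X.shape[1])  # arbitrary per-feature units

    def mahalanobis_knn(data, labels):
        # Inverse covariance of the data itself defines the metric.
        VI = np.linalg.inv(np.cov(data, rowvar=False))
        return KNeighborsClassifier(metric="mahalanobis",
                                    metric_params={"VI": VI},
                                    algorithm="brute").fit(data, labels)

    pred = mahalanobis_knn(X, y).predict(X)
    pred_s = mahalanobis_knn(Xs, y).predict(Xs)
    print(np.array_equal(pred, pred_s))  # should be True, barring distance ties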
On 04/06/15 20:18, Andreas Mueller wrote:
> I'm not sure what you mean by that. The cluster-memberships? The means
> and covariances will certainly be different.
We were talking about classification. Yes, memberships. Not means and
covariances, obviously.
On 04/06/15 20:18, Andreas Mueller wrote:
> I'm not sure what you mean by that. The cluster-memberships? The means
> and covariances will certainly be different.
The Mahalanobis distance will undo any linear scaling operation.
> They are actually somewhat regularized in scikit-learn, by having a ...
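Sturla's claim is easy to check numerically; a short sketch (mine, using scipy.spatial.distance.mahalanobis): for any invertible linear map A, the Mahalanobis distance computed with the transformed covariance equals the original one.

    import numpy as np
    from scipy.spatial.distance import mahalanobis

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    A = rng.normal(size=(3, 3))  # an arbitrary (almost surely invertible) linear map
    Y = X @ A.T                  # linearly transformed copy of the data

    VI = np.linalg.inv(np.cov(X, rowvar=False))
    VI_t = np.linalg.inv(np.cov(Y, rowvar=False))

    d = mahalanobis(X[0], X[1], VI)
    d_t = mahalanobis(Y[0], Y[1], VI_t)
    print(np.isclose(d, d_t))  # True: the metric undoes the linear map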
On 06/04/2015 02:04 PM, Sturla Molden wrote:
> On 04/06/15 17:15, Andreas Mueller wrote:
>
>> Tree-based methods are the only ones that are invariant towards feature
>> scaling, so DecisionTree*, RandomForest*, ExtraTrees*, Bagging* (with
>> trees), GradientBoosting* (with trees).
>>
>> For all other algorithms ...
On 04/06/15 17:15, Andreas Mueller wrote:
> Tree-based methods are the only ones that are invariant towards feature
> scaling, so DecisionTree*, RandomForest*, ExtraTrees*, Bagging* (with
> trees), GradientBoosting* (with trees).
>
> For all other algorithms, the outcome will be different depending on
> whether you scale your data or not.
Tree-based methods are the only ones that are invariant towards feature
scaling, so DecisionTree*, RandomForest*, ExtraTrees*, Bagging* (with
trees), GradientBoosting* (with trees).
For all other algorithms, the outcome will be different depending on
whether you scale your data or not.
For algorithms like ...
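A small check of the invariance Andreas describes (my example, not from the thread): trees split on one feature at a time at data-driven thresholds, so a positive per-feature rescaling moves the thresholds but not the resulting partitions or predictions.

    import numpy as np
    from sklearn.datasets import load_wine
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_wine(return_X_y=True)
    Xs = X * np.linspace(0.01, 100.0, X.shape[1])  # arbitrary per-feature units

    tree_raw = DecisionTreeClassifier(random_state=0).fit(X, y)
    tree_scaled = DecisionTreeClassifier(random_state=0).fit(Xs, y)
    print(np.array_equal(tree_raw.predict(X), tree_scaled.predict(Xs)))  # should print True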