thanks lars.
this would mean that any tree-based model could generate differences based
on preprocessing differences right?
cheers,
satra
On Sun, Mar 16, 2014 at 3:37 PM, Olivier Grisel olivier.gri...@ensta.org wrote:
2014-03-16 0:23 GMT+01:00 Lars Buitinck larsm...@gmail.com:
2014-03-15
2014-03-19 21:40 GMT+01:00 Satrajit Ghosh sa...@mit.edu:
this would mean that any tree-based model could generate differences based
on preprocessing differences right?
Yes. I'm not sure why the threshold is there, but it's probably to
prevent generating too many splits in the face of noisy
2014-03-16 0:23 GMT+01:00 Lars Buitinck larsm...@gmail.com:
2014-03-15 21:53 GMT+01:00 Satrajit Ghosh sa...@mit.edu:
in many cases with fat data (small samples50 x many features10) i have
found that standardizing helps quite a bit in case of extra trees. i still
don't have a good
Thanks a lot for this detailed answer!
Kind regards,
Kevin
On 14/03/2014 16:37, Olivier Grisel wrote:
2014-03-14 15:34 GMT+01:00 Kevin Keraudren kevin.keraudre...@imperial.ac.uk:
Hi,
I have a question related to the range of my input data for SVM or
Random Forests for classification:
I
hi olivier,
just a question on this statement:
Random Forest (and decision tree-based models in general) are scale
independent.
in many cases with fat data (small samples50 x many features10) i
have found that standardizing helps quite a bit in case of extra trees. i
still don't have a
Hi Satra,
In case of Extra-Trees, changing the scale of features might change
the result when the transform you apply distorts the original feature
space. Drawing a threshold uniformly at random in the original
[min;max] interval won't be equivalent to drawing a threshold in
[f(min);f(max)] if f
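
A minimal numerical sketch of the point above, not scikit-learn's actual splitter: it mimics an Extra-Trees style split on a single feature by placing a threshold at a uniformly drawn fraction u of the [min;max] interval, then compares which share of the samples goes left on the raw feature, on a standardized copy, and on a log-transformed copy. The helper name left_fraction, the lognormal toy data, and the choice of 50 samples are assumptions made for illustration only.

```python
import numpy as np


def left_fraction(values, u):
    """Share of samples sent left by a threshold placed at a
    fraction u of the way between min(values) and max(values)."""
    lo, hi = values.min(), values.max()
    threshold = lo + u * (hi - lo)
    return np.mean(values <= threshold)


rng = np.random.RandomState(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=50)  # one skewed toy feature
u = rng.uniform(size=1000)                       # 1000 random threshold draws

# Same random draws applied to three versions of the same feature.
raw = np.array([left_fraction(x, ui) for ui in u])
scaled = np.array([left_fraction((x - x.mean()) / x.std(), ui) for ui in u])
logged = np.array([left_fraction(np.log(x), ui) for ui in u])

print("standardizing changes the partitions:", not np.allclose(raw, scaled))
print("log transform changes the partitions:", not np.allclose(raw, logged))
```

In this sketch an affine rescaling such as standardization maps the uniform draw onto the same partition of the samples, while a non-linear transform such as the log does not, so the random splits come from a different distribution over partitions.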
thanks gilles,
that makes sense. i haven't checked random forest classification on these
data. i'll check that as well.
cheers,
satra
On Sat, Mar 15, 2014 at 5:51 PM, Gilles Louppe g.lou...@gmail.com wrote:
Hi Satra,
In case of Extra-Trees, changing the scale of features might change
the
2014-03-14 15:34 GMT+01:00 Kevin Keraudren kevin.keraudre...@imperial.ac.uk:
Hi,
I have a question related to the range of my input data for SVM or
Random Forests for classification:
I normalise my input vectors so that their Euclidean norm is one, for
instance to limit the influence of the