Re: [Scikit-learn-general] normalising/scaling input for SVM or Random Forests

2014-03-19 Thread Lars Buitinck
2014-03-19 21:40 GMT+01:00 Satrajit Ghosh : > this would mean that any tree-based model could generate differences based > on preprocessing differences right? Yes. I'm not sure why the threshold is there, but it's probably to prevent generating too many splits in the face of noisy input. A cleaner

Re: [Scikit-learn-general] normalising/scaling input for SVM or Random Forests

2014-03-19 Thread Satrajit Ghosh
thanks lars. this would mean that any tree-based model could generate differences based on preprocessing differences right? cheers, satra On Sun, Mar 16, 2014 at 3:37 PM, Olivier Grisel wrote: > 2014-03-16 0:23 GMT+01:00 Lars Buitinck : > > 2014-03-15 21:53 GMT+01:00 Satrajit Ghosh : > >> in m

Re: [Scikit-learn-general] normalising/scaling input for SVM or Random Forests

2014-03-16 Thread Olivier Grisel
2014-03-16 0:23 GMT+01:00 Lars Buitinck : > 2014-03-15 21:53 GMT+01:00 Satrajit Ghosh : >> in many cases with fat data (small samples<50 x many features>10) i have >> found that standardizing helps quite a bit in case of extra trees. i still >> don't have a good understanding as to why this is

Re: [Scikit-learn-general] normalising/scaling input for SVM or Random Forests

2014-03-15 Thread Lars Buitinck
2014-03-15 21:53 GMT+01:00 Satrajit Ghosh : > in many cases with fat data (small samples<50 x many features>10) i have > found that standardizing helps quite a bit in case of extra trees. i still > don't have a good understanding as to why this is the case. it could simply > be small sample bia

Re: [Scikit-learn-general] normalising/scaling input for SVM or Random Forests

2014-03-15 Thread Satrajit Ghosh
thanks gilles, that makes sense. i haven't checked random forest classification on these data. i'll check that as well. cheers, satra On Sat, Mar 15, 2014 at 5:51 PM, Gilles Louppe wrote: > Hi Satra, > > In case of Extra-Trees, changing the scale of features might change > the result when th

Re: [Scikit-learn-general] normalising/scaling input for SVM or Random Forests

2014-03-15 Thread Gilles Louppe
Hi Satra, In case of Extra-Trees, changing the scale of features might change the result when the transform you apply distorts the original feature space. Drawing a threshold uniformly at random in the original [min;max] interval won't be equivalent to drawing a threshold in [f(min);f(max)] if f i
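As a rough illustration of the point about random thresholds, here is a minimal sketch and not the scikit-learn implementation. It reuses one uniform draw `u` to place a threshold in the [min;max] range of a feature, then counts how many samples fall to the left. The helper name `left_fraction`, the toy values, and the choice of `log` as an example of a distorting transform are all assumptions made for illustration.

```python
import math


def left_fraction(values, transform, u):
    """Fraction of samples at or below a threshold placed at relative
    position u (0 <= u <= 1) inside the [min, max] range of the
    transformed feature. Hypothetical helper, for illustration only."""
    t = [transform(v) for v in values]
    lo, hi = min(t), max(t)
    threshold = lo + u * (hi - lo)
    return sum(x <= threshold for x in t) / len(t)


# Toy feature values (assumed data, chosen to be strongly skewed).
values = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0]
u = 0.5  # the same "random" draw, reused for every transform

identity = left_fraction(values, lambda v: v, u)
affine = left_fraction(values, lambda v: 3.0 * v - 7.0, u)  # rescale + shift
logged = left_fraction(values, math.log, u)                 # distorting transform

print(identity, affine, logged)
```

With these toy values the identity and affine versions send the same samples to the left of the threshold, while the log-transformed feature is split differently for the same draw.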

Re: [Scikit-learn-general] normalising/scaling input for SVM or Random Forests

2014-03-15 Thread Satrajit Ghosh
hi olivier, just a question on this statement: "Random Forest (and decision tree-based models in general) are scale independent." in many cases with fat data (small samples < 50 x many features > 10) i have found that standardizing helps quite a bit in case of extra trees. i still don't have a

Re: [Scikit-learn-general] normalising/scaling input for SVM or Random Forests

2014-03-15 Thread Kevin Keraudren
Thanks a lot for this detailed answer! Kind regards, Kevin On 14/03/2014 16:37, Olivier Grisel wrote: > 2014-03-14 15:34 GMT+01:00 Kevin Keraudren : >> Hi, >> >> I have a question related to the range of my input data for SVM or >> Random Forests for classification: >> I normalise my input vectors so that their euclidean norm is one, for >> instance to limit the influence of the image size or intensity con

Re: [Scikit-learn-general] normalising/scaling input for SVM or Random Forests

2014-03-14 Thread Olivier Grisel
2014-03-14 15:34 GMT+01:00 Kevin Keraudren : > Hi, > > I have a question related to the range of my input data for SVM or > Random Forests for classification: > I normalise my input vectors so that their euclidean norm is one, for > instance to limit the influence of the image size or intensity con

[Scikit-learn-general] normalising/scaling input for SVM or Random Forests

2014-03-14 Thread Kevin Keraudren
Hi, I have a question related to the range of my input data for SVM or Random Forests for classification: I normalise my input vectors so that their euclidean norm is one, for instance to limit the influence of the image size or intensity contrast. I took the habit of then scaling them, multipl
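The unit-norm step described in this message could be sketched as follows. This is a minimal pure-Python version under stated assumptions: the function name `unit_norm` is hypothetical, and leaving an all-zero vector unchanged is a choice made here, not something stated in the thread.

```python
import math


def unit_norm(vector):
    """Return a copy of `vector` rescaled so its Euclidean norm is one.
    Hypothetical helper, for illustration only."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # Assumption: an all-zero vector has no direction, so keep it as-is.
        return list(vector)
    return [x / norm for x in vector]


v = unit_norm([3.0, 4.0])
print(v)
```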