Hi, I have a question related to the range of my input data for SVM or Random Forests for classification: I normalise my input vectors so that their Euclidean norm is one, for instance to limit the influence of the image size or intensity contrast. I have got into the habit of then scaling them, multiplying them by a factor of 1000 so that I have values between 0 and 1000 instead of between 0 and 1, and thus fewer values "close to zero". I guess it does not hurt to do so, but would you know if it is useful? Do the SVM and Random Forests already do some normalisation before starting to learn from the data?
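To make the question concrete, here is a minimal sketch of how one could check this empirically. Everything in it is an assumption for illustration: the toy data from `make_classification` stands in for my real feature vectors, the RBF `gamma` and `C` values are fixed by hand so that the kernel width does not adapt to the data, and the seeds are fixed so that the randomness of the forest is the same in both runs.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Toy stand-in for image features (hypothetical data, not my real vectors).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X = np.abs(X)  # non-negative values, like intensity-based features

# Step 1: give every sample unit Euclidean norm.
X_unit = X / np.linalg.norm(X, axis=1, keepdims=True)
# Step 2: the extra scaling I am asking about.
X_scaled = 1000.0 * X_unit

# Random Forest: same seed for both fits, so only the input scale differs.
rf_unit = RandomForestClassifier(n_estimators=20, random_state=0).fit(X_unit, y)
rf_scaled = RandomForestClassifier(n_estimators=20, random_state=0).fit(X_scaled, y)
rf_agreement = np.mean(rf_unit.predict(X_unit) == rf_scaled.predict(X_scaled))

# SVM: gamma and C are fixed explicitly (assumed values, chosen only for the test).
svm_unit = SVC(kernel="rbf", gamma=0.1, C=1.0).fit(X_unit, y)
svm_scaled = SVC(kernel="rbf", gamma=0.1, C=1.0).fit(X_scaled, y)
svm_agreement = np.mean(svm_unit.predict(X_unit) == svm_scaled.predict(X_scaled))

# Decision values show differences that identical labels could hide.
dec_unit = svm_unit.decision_function(X_unit)
dec_scaled = svm_scaled.decision_function(X_scaled)

print("Random Forest prediction agreement:", rf_agreement)
print("SVM prediction agreement:", svm_agreement)
print("SVM decision values identical:", np.allclose(dec_unit, dec_scaled))
```

If the two forests agree while the two SVMs do not, that would suggest the factor of 1000 only matters for the SVM, where it interacts with `gamma` and `C`, but this is only a check on toy data and not a proof.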
I have a similar question for the Random Forests for regression: how is the minimal MSE required for a split defined? Here again, if I scale my input by a factor of 1000, should I expect the resulting trees to be different (excluding the random aspect of Random Forests)?

Kind regards,
Kevin

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general