Hi all,

In my classification problem, some features are numerical (e.g. 10.1, 1), some are categorical but numerically coded as nonnegative integers (such as an id coded as 100 or 99), and some are ordered but also numerically coded as nonnegative integers (such as versions 12, 13, 4).
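To make the feature types concrete, here is a toy example of the kind of feature matrix I mean (the values below are made up purely for illustration, not my real data):

import numpy as np

# Made-up values just to illustrate the three kinds of columns:
#   column 0: numerical (e.g. a measurement)
#   column 1: categorical, coded as a nonnegative id
#   column 2: ordered, coded as a nonnegative version number
X = np.array([[10.1, 100, 12],
              [ 1.0,  99, 13],
              [ 3.5, 100,  4],
              [ 7.2,  99, 12],
              [ 2.8, 100, 13],
              [ 9.4,  99,  4]])
y = np.array([0, 1, 0, 1, 1, 0])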
My questions are:

1. Does the feature_importances_ attribute computed by RandomForestClassifier().fit() work with my feature types? Does it work with all of them except the categorical features coded as numbers?

2. Does the chi-squared test in sklearn.feature_selection.chi2(X, y) work with my feature types? Which types can it handle and which can it not? Does it only work with categorical and ordered features, not numerical ones?

3. Does the test in sklearn.feature_selection.f_classif(X, y) work with my feature types? Does it only work with numerical features, not categorical or ordered ones?

All three approaches return scores and a ranking of the features, but I wonder whether the results are reliable given the different feature types. What would you suggest for doing feature selection and feature ranking in my problem?

Thanks,
Tim
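P.S. For reference, this is roughly how I am computing the three rankings (a minimal sketch using the same kind of made-up placeholder data as above, not my real features):

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import chi2, f_classif

# Same made-up placeholder data as in the example above:
# numerical / categorical-coded / ordered-coded columns.
X = np.array([[10.1, 100, 12],
              [ 1.0,  99, 13],
              [ 3.5, 100,  4],
              [ 7.2,  99, 12],
              [ 2.8, 100, 13],
              [ 9.4,  99,  4]])
y = np.array([0, 1, 0, 1, 1, 0])

# 1) Impurity-based importances from a random forest.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print("feature_importances_:", rf.feature_importances_)

# 2) Chi-squared statistic (chi2 requires nonnegative feature values).
chi2_scores, chi2_pvals = chi2(X, y)
print("chi2 scores:", chi2_scores)

# 3) ANOVA F-test between the label and each feature.
f_scores, f_pvals = f_classif(X, y)
print("f_classif scores:", f_scores)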