Hello, I'm using scikit-learn for machine learning. I have 800 samples with 2048 features, so I want to reduce the number of features in the hope of getting better accuracy.
It is a multiclass problem (classes 0-5), and the features consist of 1's and 0's: [1,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0....,0]. I'm using the RandomForestClassifier. Should I do the feature selection on the training data only? And is it enough if I use this code:

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
    clf = RandomForestClassifier(n_estimators=200, warm_start=True,
                                 criterion='gini', max_depth=13)
    clf.fit(X_train, y_train).transform(X_train)  # the reduced array returned by transform() is not stored or used
    predicted = clf.predict(X_test)
    expected = y_test
    confusionMatrix = metrics.confusion_matrix(expected, predicted)

I ask because the accuracy didn't get any higher. Is everything OK in the code, or am I doing something wrong? I'll be very grateful for your help.
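P.S. To make it more concrete, here is roughly what I imagined the selection step should look like. This is only a rough sketch on my side: I'm assuming SelectFromModel is the right tool for this, and the threshold='median' value is just a placeholder I picked.

    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import SelectFromModel
    from sklearn import metrics

    # X, y: my 800 x 2048 binary feature matrix and the class labels (0-5)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

    # fit the selector on the training split only, so the test split stays unseen
    selector = SelectFromModel(
        RandomForestClassifier(n_estimators=200, criterion='gini', max_depth=13),
        threshold='median')  # placeholder: keep features above the median importance
    selector.fit(X_train, y_train)

    # apply the same reduction to both splits
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    # retrain on the reduced feature set and evaluate on the reduced test set
    clf = RandomForestClassifier(n_estimators=200, criterion='gini', max_depth=13)
    clf.fit(X_train_sel, y_train)
    predicted = clf.predict(X_test_sel)
    confusionMatrix = metrics.confusion_matrix(y_test, predicted)

Is that the right way to combine the selection with the train/test split, or would it be better to put the selector and the classifier into a Pipeline?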