Hello,
I'm using scikit-learn for machine learning.
I have 800 samples with 2048 features, so I would like to reduce the number
of features in the hope of getting better accuracy.
It is a multiclass problem (classes 0-5), and the features consist of 1's
and 0's: [1,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0....,0]
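To be concrete, my data looks roughly like this (just a toy stand-in with the
right shapes so the code below runs; the real X and y come from my own
preprocessing):

import numpy as np

# toy stand-in for my real data: 800 samples, 2048 binary features,
# integer labels from 0 to 5 (the actual values are loaded elsewhere)
X = np.random.randint(0, 2, size=(800, 2048))
y = np.random.randint(0, 6, size=800)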
I'm using the RandomForestClassifier.
Should I perform the feature selection on the training data only? And is the
following code enough:
from sklearn.cross_validation import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
clf = RandomForestClassifier(n_estimators=200, warm_start=True,
                             criterion='gini', max_depth=13)
# fit on the training data; transform() returns X_train reduced to the
# features the forest considers important
clf.fit(X_train, y_train).transform(X_train)
predicted = clf.predict(X_test)
expected = y_test
confusionMatrix = metrics.confusion_matrix(expected, predicted)
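In case it helps to show what I mean, here is roughly what I thought the
feature-selection step should look like, using SelectFromModel with its
default threshold and re-training a second forest on the reduced features.
The names (selector, X_train_sel, clf_sel) are just mine, and I'm not sure
this is the right approach, so please correct me if it's wrong:

from sklearn.feature_selection import SelectFromModel

# select features using the importances of the forest fitted above on the
# training data only, then apply the same selection to the test data
selector = SelectFromModel(clf, prefit=True)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# retrain a fresh forest on the reduced feature set and evaluate it
clf_sel = RandomForestClassifier(n_estimators=200, max_depth=13)
clf_sel.fit(X_train_sel, y_train)
print(metrics.confusion_matrix(y_test, clf_sel.predict(X_test_sel)))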
The accuracy didn't get any higher, though. Is everything OK in the code, or
am I doing something wrong?
I'll be very grateful for your help.