Hello,
I'm using scikit-learn for machine learning.
I have 800 samples with 2048 features, so I would like to reduce the number
of features in the hope of getting better accuracy.
It is a multiclass problem (classes 0-5), and the features consist of 1's
and 0's: [1,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0....,0]
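To be concrete, my data looks roughly like this (just a toy stand-in with the
right shapes so the code below runs; the real X and y come from my own
preprocessing):

import numpy as np

# toy stand-in for my real data: 800 samples, 2048 binary features,
# integer labels from 0 to 5 (the actual values are loaded elsewhere)
X = np.random.randint(0, 2, size=(800, 2048))
y = np.random.randint(0, 6, size=800)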
I'm using the RandomForestClassifier.
Should I perform the feature selection on the training data only? And is the
following code enough:
from sklearn.cross_validation import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
clf = RandomForestClassifier(n_estimators=200, warm_start=True,
                             criterion='gini', max_depth=13)
# fit on the training data; transform() returns X_train reduced to the
# features the forest considers important
clf.fit(X_train, y_train).transform(X_train)
predicted = clf.predict(X_test)
expected = y_test
confusionMatrix = metrics.confusion_matrix(expected, predicted)
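In case it helps to show what I mean, here is roughly what I thought the
feature-selection step should look like, using SelectFromModel with its
default threshold and re-training a second forest on the reduced features.
The names (selector, X_train_sel, clf_sel) are just mine, and I'm not sure
this is the right approach, so please correct me if it's wrong:

from sklearn.feature_selection import SelectFromModel

# select features using the importances of the forest fitted above on the
# training data only, then apply the same selection to the test data
selector = SelectFromModel(clf, prefit=True)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# retrain a fresh forest on the reduced feature set and evaluate it
clf_sel = RandomForestClassifier(n_estimators=200, max_depth=13)
clf_sel.fit(X_train_sel, y_train)
print(metrics.confusion_matrix(y_test, clf_sel.predict(X_test_sel)))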
The accuracy didn't get any higher, though. Is everything OK in the code, or
am I doing something wrong?
I'll be very grateful for your help.