Hi Herbert, I can't help you with the accuracy problem, since that can be due to many different things. However, there is now a way to combine different classifiers for majority-rule voting: sklearn.ensemble.VotingClassifier. It is not in the current stable release yet, but you can get it from the scikit-learn dev version on GitHub.
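For example, a minimal (untested) sketch combining an SVM and a decision tree -- the toy data, the base estimators, and the added logistic regression (a third voter avoids frequent ties) are all just placeholders, and the import path assumes the dev version:

    from sklearn.cross_validation import train_test_split
    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    # placeholder data; substitute your own X, y
    X, y = make_classification(n_samples=800, n_features=50, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3,
                                                        random_state=0)

    # voting='hard' means majority rule over the predicted class labels
    eclf = VotingClassifier(estimators=[('svm', SVC()),
                                        ('tree', DecisionTreeClassifier()),
                                        ('lr', LogisticRegression())],
                            voting='hard')
    eclf.fit(X_train, y_train)
    print(eclf.score(X_test, y_test))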
Alternatively, if you don't want to install the scikit-learn dev version, you could use the EnsembleClassifier from mlxtend until the next stable release of scikit-learn -- slightly different syntax but the same principle: http://rasbt.github.io/mlxtend/docs/sklearn/ensemble_classifier/ (this is basically the original implementation that was later ported to scikit-learn). Hope that helps.

Best,
Sebastian

> On Jun 2, 2015, at 11:25 AM, Herbert Schulz <hrbrt....@gmail.com> wrote:
>
> Thanks, that helped.
>
> But I just can't get an accuracy higher than 45%... don't know why. The same with logistic regression and so on.
>
> Is there a way to combine, for example, an SVM with a decision tree?
>
> Herb
>
> On 2 June 2015 at 11:19, Michael Eickenberg <michael.eickenb...@gmail.com> wrote:
> Some configurations are not implemented or are difficult to evaluate in the dual. Setting dual=True/False doesn't change the result, so please don't vary it as you would vary other parameters. It can, however, sometimes yield a speed-up. Here you should try setting dual=False as a first means of debugging.
>
> Michael
>
> On Tue, Jun 2, 2015 at 11:04 AM, Herbert Schulz <hrbrt....@gmail.com> wrote:
> Does anyone know why this failure occurs?
>
> ValueError: Unsupported set of arguments: loss='l1' and penalty='squared_hinge' are not supported when dual=True, Parameters: penalty='l1', loss='squared_hinge', dual=True
>
> I'm using a LinearSVC (in Andreas' example code).
>
> On 1 June 2015 at 13:38, Herbert Schulz <hrbrt....@gmail.com> wrote:
> Cool, thanks for that!
>
> Herb
>
> On 1 June 2015 at 12:16, JAGANADH G <jagana...@gmail.com> wrote:
> Hi,
>
> I have listed the sklearn feature selection methods with minimal examples here:
>
> http://nbviewer.ipython.org/github/jaganadhg/data_science_notebooks/blob/master/sklearn/scikit_learn_feature_selection.ipynb
>
> Jagan
>
> On Thu, May 28, 2015 at 10:14 PM, Herbert Schulz <hrbrt....@gmail.com> wrote:
> Thanks to both of you! I really appreciate it! I will try everything this weekend.
>
> Best regards,
>
> Herb
>
> On 28 May 2015 at 18:21, Sebastian Raschka <se.rasc...@gmail.com> wrote:
> I agree with Andreas; in my experience, a large number of features typically shouldn't be a big problem for random forests either. However, it of course depends on the number of trees and training samples.
>
> If you suspect that overfitting might be a problem with unregularized classifiers, also consider "dimensionality reduction"/"feature extraction" techniques to compress the feature space, e.g., linear or kernel PCA, or the other methods listed in the manifold learning section of the scikit-learn website.
>
> However, there are scenarios where you'd want to keep the "original" features (in contrast to, e.g., principal components), and there are scenarios where linear methods such as LinearSVC(penalty='l1') may not work so well (e.g., for non-linear problems). The optimal solution would be to exhaustively test all feature combinations to see which works best; however, this can be quite costly.
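> Just to illustrate the scale, a brute-force version of that exhaustive search might look roughly like this (untested sketch; the LinearSVC scorer and the subset-size cap are arbitrary placeholders):
>
>     from itertools import combinations
>     import numpy as np
>     from sklearn.cross_validation import cross_val_score
>     from sklearn.svm import LinearSVC
>
>     def exhaustive_search(X, y, max_subset_size=3):
>         """Cross-validate every feature subset up to max_subset_size."""
>         best_score, best_subset = -np.inf, None
>         for k in range(1, max_subset_size + 1):
>             for subset in combinations(range(X.shape[1]), k):
>                 score = cross_val_score(LinearSVC(), X[:, list(subset)],
>                                         y, cv=5).mean()
>                 if score > best_score:
>                     best_score, best_subset = score, subset
>         return best_subset, best_score
>
>     # With 2048 features, even subsets of size <= 3 already mean ~1.4
>     # billion candidates -- hence greedy alternatives like the one below.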
> For demonstration purposes, I implemented "sequential backward selection" (http://rasbt.github.io/mlxtend/docs/sklearn/sequential_backward_selection/) some time ago -- a simple greedy alternative to the exhaustive search; maybe you are lucky and it works well in your case. When I find time after my summer projects, I am planning to implement some genetic algorithms for feature selection...
>
> Best,
> Sebastian
>
>> On May 28, 2015, at 11:59 AM, Andreas Mueller <t3k...@gmail.com> wrote:
>>
>> Hi Herbert.
>> 1) Often, reducing the feature space does not help with accuracy, and using a regularized classifier leads to better results.
>> 2) To do feature selection, you need two methods: one to reduce the set of features, another that does the actual supervised task (classification here).
>>
>> Have you tried just using the standard classifiers? Clearly you tried the RF, but I'd also try a linear method like LinearSVC/LogisticRegression or a kernel SVC.
>>
>> If you want to do feature selection, you need to do something like this:
>>
>> feature_selector = LinearSVC(penalty='l1', dual=False)  # l1 requires dual=False; or maybe start with SelectKBest()
>> feature_selector.fit(X_train, y_train)
>>
>> X_train_reduced = feature_selector.transform(X_train)
>> X_test_reduced = feature_selector.transform(X_test)
>>
>> classifier = RandomForestClassifier().fit(X_train_reduced, y_train)
>>
>> prediction = classifier.predict(X_test_reduced)
>>
>> Or you use a pipeline, as here (see also the rough sketch at the very bottom of this email):
>> http://scikit-learn.org/dev/auto_examples/feature_selection/feature_selection_pipeline.html
>> Maybe we should add a version without the pipeline to the examples?
>>
>> Cheers,
>> Andy
>>
>> On 05/28/2015 08:32 AM, Herbert Schulz wrote:
>>> Hello,
>>> I'm using scikit-learn for machine learning. I have 800 samples with 2048 features, and therefore I want to reduce my features to hopefully get a better accuracy.
>>>
>>> It is a multiclass problem (classes 0-5), and the features consist of 1's and 0's: [1,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,...,0]
>>>
>>> I'm using the Random Forest Classifier.
>>>
>>> Should I just feature-select the training data? And is it enough if I'm using this code:
>>>
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3)
>>>
>>> clf = RandomForestClassifier(n_estimators=200, warm_start=True, criterion='gini', max_depth=13)
>>> clf.fit(X_train, y_train).transform(X_train)
>>>
>>> predicted = clf.predict(X_test)
>>> expected = y_test
>>> confusionMatrix = metrics.confusion_matrix(expected, predicted)
>>>
>>> I ask because the accuracy didn't get higher. Is everything OK in the code, or am I doing something wrong?
>>>
>>> I'll be very grateful for your help.
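P.S. Since Andy mentioned the pipeline route, here is a rough, untested sketch of what that could look like for your data. Everything in it is an assumption for illustration: SelectKBest with chi2 is just one possible selector (chi2 is applicable because your 0/1 features are nonnegative), k=200 is arbitrary, and the random data stands in for your real X and y:

    import numpy as np
    from sklearn.cross_validation import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import SelectKBest, chi2
    from sklearn.pipeline import Pipeline

    rng = np.random.RandomState(0)
    X = rng.randint(0, 2, size=(800, 2048))  # placeholder binary features
    y = rng.randint(0, 6, size=800)          # placeholder labels, classes 0-5

    pipe = Pipeline([
        ('select', SelectKBest(chi2, k=200)),  # keep the 200 best features
        ('clf', RandomForestClassifier(n_estimators=200)),
    ])

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3)
    pipe.fit(X_train, y_train)  # the selector is fit on the training split only
    print(pipe.score(X_test, y_test))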