Some configurations are not implemented, or are difficult to solve, in the dual formulation. Setting dual=True or dual=False does not change the result, so please don't vary it the way you would vary other hyperparameters; it can, however, sometimes yield a speed-up. Here, try setting dual=False as a first debugging step.
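Concretely, the argument combination from the traceback below should go through once dual=False is passed. A minimal sketch on made-up data, just to show the working combination:

    from sklearn.datasets import make_classification
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=200, n_features=50, random_state=0)

    # penalty='l1' with the (default) squared hinge loss is only implemented
    # for the primal problem, hence dual=False.
    clf = LinearSVC(penalty='l1', loss='squared_hinge', dual=False)
    clf.fit(X, y)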
Michael

On Tue, Jun 2, 2015 at 11:04 AM, Herbert Schulz <hrbrt....@gmail.com> wrote:

Does anyone know why this failure occurs?

ValueError: Unsupported set of arguments: loss='l1' and penalty='squared_hinge' are not supported when dual=True, Parameters: penalty='l1', loss='squared_hinge', dual=True

I'm using a LinearSVC (in Andreas' example code).

On 1 June 2015 at 13:38, Herbert Schulz <hrbrt....@gmail.com> wrote:

Cool, thanks for that!

Herb

On 1 June 2015 at 12:16, JAGANADH G <jagana...@gmail.com> wrote:

Hi,

I have listed sklearn feature selection with minimal examples here:
http://nbviewer.ipython.org/github/jaganadhg/data_science_notebooks/blob/master/sklearn/scikit_learn_feature_selection.ipynb

Jagan

On Thu, May 28, 2015 at 10:14 PM, Herbert Schulz <hrbrt....@gmail.com> wrote:

Thanks to both of you! I really appreciate it! I will try everything this weekend.

Best regards,

Herb

On 28 May 2015 at 18:21, Sebastian Raschka <se.rasc...@gmail.com> wrote:

I agree with Andreas; typically, a large number of features shouldn't be a big problem for random forests in my experience, though it of course depends on the number of trees and training samples.

If you suspect that overfitting might be a problem with unregularized classifiers, also consider "dimensionality reduction"/"feature extraction" techniques to compress the feature space, e.g., linear or kernel PCA, or other methods listed in the manifold learning section of the scikit-learn website.

However, there are scenarios where you'd want to keep the "original" features (in contrast to, e.g., principal components), and there are scenarios where linear methods such as LinearSVC(penalty='l1') may not work so well (e.g., for non-linear problems). The optimal solution would be to exhaustively test all feature combinations to see which works best, but this can be quite costly. For demonstration purposes, I implemented "sequential backward selection" (http://rasbt.github.io/mlxtend/docs/sklearn/sequential_backward_selection/) some time ago; it is a simple greedy alternative to the exhaustive search, so maybe you are lucky and it works well in your case. When I find time after my summer projects, I am planning to implement some genetic algorithms for feature selection...

Best,
Sebastian
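[As a rough sketch of the feature-extraction route Sebastian describes: the data below is made up as a stand-in for the 800 x 2048 matrix discussed further down, and the number of components and the RBF kernel are arbitrary illustrative choices, not recommendations from the thread.]

    from sklearn.model_selection import train_test_split  # sklearn.cross_validation in the release current at the time of this thread
    from sklearn.datasets import make_classification
    from sklearn.decomposition import KernelPCA
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.pipeline import Pipeline

    X, y = make_classification(n_samples=800, n_features=2048, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3)

    # Compress the feature space first, then classify on the components.
    model = Pipeline([
        ('reduce', KernelPCA(n_components=50, kernel='rbf')),
        ('clf', RandomForestClassifier(n_estimators=200)),
    ])
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))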
On May 28, 2015, at 11:59 AM, Andreas Mueller <t3k...@gmail.com> wrote:

Hi Herbert.

1) Often, reducing the feature space does not help with accuracy, and using a regularized classifier leads to better results.
2) To do feature selection, you need two methods: one to reduce the set of features, and another that does the actual supervised task (classification here).

Have you tried just using the standard classifiers? Clearly you tried the RF, but I'd also try a linear method like LinearSVC/LogisticRegression or a kernel SVC.

If you want to do feature selection, what you need to do is something like this:

    feature_selector = LinearSVC(penalty='l1')  # or maybe start with SelectKBest()
    feature_selector.fit(X_train, y_train)

    X_train_reduced = feature_selector.transform(X_train)
    X_test_reduced = feature_selector.transform(X_test)

    classifier = RandomForestClassifier().fit(X_train_reduced, y_train)

    prediction = classifier.predict(X_test_reduced)

Or you use a pipeline, as here:
http://scikit-learn.org/dev/auto_examples/feature_selection/feature_selection_pipeline.html
(a short sketch of that setup is included at the end of this thread). Maybe we should add a version without the pipeline to the examples?

Cheers,
Andy

On 05/28/2015 08:32 AM, Herbert Schulz wrote:

Hello,

I'm using scikit-learn for machine learning. I have 800 samples with 2048 features, so I want to reduce the number of features in the hope of getting better accuracy.

It is a multiclass problem (classes 0-5), and the features consist of 1s and 0s: [1,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0....,0]

I'm using the Random Forest Classifier.

Should I just feature-select the training data? And is it enough if I'm using this code:

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3)

    clf = RandomForestClassifier(n_estimators=200, warm_start=True, criterion='gini', max_depth=13)
    clf.fit(X_train, y_train).transform(X_train)

    predicted = clf.predict(X_test)
    expected = y_test
    confusionMatrix = metrics.confusion_matrix(expected, predicted)

The accuracy didn't get higher. Is everything OK in the code, or am I doing something wrong?

I'll be very grateful for your help.
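[For reference, a minimal sketch of the pipeline variant Andy links to above, on made-up data. SelectFromModel is the wrapper used in scikit-learn releases newer than the one in this thread, where the l1-penalized LinearSVC itself was used as the first pipeline step; note dual=False, which is required with penalty='l1' and is what the ValueError at the top of this thread is about.]

    from sklearn.model_selection import train_test_split  # sklearn.cross_validation in the release current at the time of this thread
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import SelectFromModel
    from sklearn.pipeline import Pipeline
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=800, n_features=2048, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3)

    clf = Pipeline([
        # The l1-penalized linear SVM picks the features...
        ('feature_selection', SelectFromModel(LinearSVC(penalty='l1', dual=False))),
        # ...and the random forest is trained on the reduced feature set.
        ('classification', RandomForestClassifier(n_estimators=200)),
    ])
    clf.fit(X_train, y_train)
    print(clf.score(X_test, y_test))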