You will not get results close to ensembles with pruning (unless your dataset is very specific).
You can probably do your node filtering on ensembles, too.
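
A minimal sketch of what node filtering on an ensemble might look like, assuming a recent scikit-learn release. The iris data, the thresholds (`TARGET_CLASS`, `MIN_PURITY`, `MIN_SAMPLES`) and the helper `leaf_rules` are illustrative assumptions, not something taken from this thread; only `estimators_` and the `tree_` arrays are scikit-learn's own.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
forest = RandomForestClassifier(n_estimators=20, min_samples_leaf=5, random_state=0)
forest.fit(iris.data, iris.target)


def leaf_rules(estimator, feature_names):
    """Yield (conditions, class_index, purity, n_samples) for every leaf."""
    t = estimator.tree_
    stack = [(0, [])]  # (node id, conditions on the path from the root)
    while stack:
        node, conds = stack.pop()
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:  # leaves have no children
            dist = t.value[node][0]  # per-class counts or fractions
            purity = float(dist.max() / dist.sum())
            yield conds, int(np.argmax(dist)), purity, int(t.n_node_samples[node])
            continue
        name = feature_names[t.feature[node]]
        thr = t.threshold[node]
        stack.append((left, conds + [f"{name} <= {thr:.2f}"]))
        stack.append((right, conds + [f"{name} > {thr:.2f}"]))


# Hypothetical filter: keep confident, well-populated leaves for one class.
TARGET_CLASS, MIN_PURITY, MIN_SAMPLES = 2, 0.9, 10

selected = []
for i, est in enumerate(forest.estimators_):
    for conds, cls, purity, n in leaf_rules(est, iris.feature_names):
        if cls == TARGET_CLASS and purity >= MIN_PURITY and n >= MIN_SAMPLES:
            selected.append((i, " and ".join(conds), purity, n))

for tree_index, rule, purity, n in selected[:5]:
    print(f"tree {tree_index}: {rule} (purity={purity:.2f}, n={n})")
```

Each selected entry is a plain conjunction of threshold tests, so it can be ported to another system without shipping the whole forest. Rules from different trees overlap, so some deduplication and a check on held-out data would likely still be needed.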


On 08/30/2015 03:44 PM, Rex X wrote:
Jacob, I agree with both of your points about the ensemble methods. They can give quite good prediction results.

But the question is how to interpret these models. We want to extract specific decision rules, for example rules for declining fraudulent transactions. The motivation is to port these rules to other systems.

Currently I am searching each node of a single tree and filtering for the nodes that satisfy the conditions I want. I did obtain some interesting results. I wish I could obtain results close to those of an ensemble method.

Any further tips?


Best,
Rex



On Sun, Aug 30, 2015 at 11:45 AM, Jacob Schreiber <jmschreibe...@gmail.com> wrote:

    Usually one would use an ensemble of trees to prevent overfitting.
    Two common techniques are Random Forests and Gradient Boosted
    Trees. Gradient Boosting in particular has done well in
    competitions recently.

    While this may give you better generalization, these models
    become difficult to interpret. You can try to constrain a single
    tree by requiring a higher number of examples (min_samples_leaf),
    or a higher total weight of examples (min_weight_fraction_leaf),
    at each leaf. This prevents the tree from splitting to
    accommodate a single point, which may cause overfitting.
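
For what it's worth, the two constraints described above correspond to the `min_samples_leaf` and `min_weight_fraction_leaf` parameters of `DecisionTreeClassifier`. A rough sketch on the iris data follows; the values 10 and 0.05 are arbitrary assumptions chosen for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

models = {
    "unconstrained": DecisionTreeClassifier(random_state=0),
    # Every leaf must hold at least 10 training samples.
    "min_samples_leaf=10": DecisionTreeClassifier(min_samples_leaf=10, random_state=0),
    # Every leaf must hold at least 5% of the total sample weight.
    "min_weight_fraction_leaf=0.05": DecisionTreeClassifier(
        min_weight_fraction_leaf=0.05, random_state=0
    ),
}

smallest_leaf = {}
for name, clf in models.items():
    clf.fit(X, y)
    t = clf.tree_
    is_leaf = t.children_left == -1  # leaves have no children
    smallest_leaf[name] = int(t.n_node_samples[is_leaf].min())
    print(name, "leaves:", int(is_leaf.sum()), "smallest leaf:", smallest_leaf[name])
```

Larger leaves give each extracted rule more supporting examples, usually at the cost of a somewhat worse fit on the training data.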

    On Sun, Aug 30, 2015 at 10:37 AM, Rex X <dnsr...@gmail.com> wrote:

        Hi Jacob,

        Is there anything we can do to get decision rules that
        generalize better?

        For example, after fitting one tree, select the top (N-1)
        features by feature_importances_ and then fit again.

        Would this help?
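
One way to try the drop-one-feature idea, sketched on the iris data and assuming a scikit-learn release that ships `sklearn.model_selection`. Dropping exactly one feature and scoring with 5-fold cross-validation are assumptions made for illustration. Note that the feature is selected on the full data before cross-validating, so the second score is slightly optimistic.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Fit once and rank features by impurity-based importance.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
order = np.argsort(clf.feature_importances_)  # least important first
keep = np.sort(order[1:])                     # drop the single weakest feature
X_reduced = X[:, keep]

# Compare cross-validated accuracy with and without the dropped feature.
full_score = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5).mean()
reduced_score = cross_val_score(
    DecisionTreeClassifier(random_state=0), X_reduced, y, cv=5
).mean()

print("dropped:", iris.feature_names[order[0]])
print("all features:", round(full_score, 3), "reduced:", round(reduced_score, 3))
```

Whether this helps depends on the data; comparing the two scores is the quickest way to find out.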


        Best,
        Rex




        On Sun, Aug 30, 2015 at 8:07 AM, Jacob Schreiber <jmschreibe...@gmail.com> wrote:

            Tree pruning is currently not supported in sklearn.
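
That was the situation when this was written. Later scikit-learn releases (0.22 and newer) added minimal cost-complexity pruning through the `ccp_alpha` parameter and the `cost_complexity_pruning_path` method. A minimal sketch, assuming such a release; picking alpha by 5-fold cross-validation on iris is an illustrative choice, not a prescribed procedure.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Effective alphas at which the fully grown tree loses its weakest subtrees.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)

best_alpha, best_score = 0.0, -1.0
for alpha in path.ccp_alphas:
    alpha = max(float(alpha), 0.0)  # guard against tiny negative round-off
    score = cross_val_score(
        DecisionTreeClassifier(random_state=0, ccp_alpha=alpha), X, y, cv=5
    ).mean()
    if score >= best_score:  # ties go to the larger alpha, i.e. the smaller tree
        best_alpha, best_score = alpha, score

full = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=best_alpha).fit(X, y)
print("alpha:", best_alpha, "nodes:", full.tree_.node_count, "->", pruned.tree_.node_count)
```

A larger `ccp_alpha` removes more of the tree, which tends to leave fewer and more general rules.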

            On Sun, Aug 30, 2015 at 6:44 AM, Rex X <dnsr...@gmail.com> wrote:

                Pruning is very important for getting a better
                decision tree.

                One idea is to recursively remove the leaf node whose
                removal hurts the decision tree the least.

                Any idea how to do this for the following sample case?


                    from sklearn.datasets import load_iris
                    from sklearn.tree import DecisionTreeClassifier

                    iris = load_iris()
                    # Per-class weights. "sqrt" matches the old "auto" setting,
                    # which recent scikit-learn releases no longer accept.
                    clf = DecisionTreeClassifier(
                        class_weight={0: 0.3, 1: 0.3, 2: 0.4},
                        max_features="sqrt",
                    )
                    clf.fit(iris.data, iris.target)


                
------------------------------------------------------------------------------

                _______________________________________________
                Scikit-learn-general mailing list
                Scikit-learn-general@lists.sourceforge.net
                <mailto:Scikit-learn-general@lists.sourceforge.net>
                
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general



            