Usually one would use an ensemble of trees to reduce overfitting. Two
common techniques are random forests and gradient boosted trees. Gradient
boosting in particular has done well in competitions recently.
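
As a rough illustration, here is a minimal sketch comparing a single tree
with the two ensembles on the iris data from your example. It assumes a
recent scikit-learn release (the sklearn.model_selection module); the
n_estimators, cv and random_state values are arbitrary choices of mine, not
recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Hyperparameters below are illustrative assumptions, not tuned values.
models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient boosting": GradientBoostingClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy gives a rough view of generalization.
scores = {}
for name, model in models.items():
    scores[name] = cross_val_score(model, iris.data, iris.target, cv=5).mean()
    print(f"{name}: {scores[name]:.3f}")
```

On a dataset as small and easy as iris the differences may be negligible;
the gap tends to matter more on larger, noisier data.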

While an ensemble may generalize better, it is harder to interpret than a
single tree. If you want to keep a single tree, you can constrain it by
requiring a minimum number of examples, or a minimum total weight of
examples, at each leaf. This prevents the tree from splitting to
accommodate a single point, which is a common cause of overfitting.
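
A minimal sketch of that constraint, using the min_samples_leaf and
min_weight_fraction_leaf parameters of DecisionTreeClassifier. The
thresholds (5 samples, 5% of the total weight) are arbitrary values picked
for illustration, and count_leaves is a small helper I made up here, not
part of sklearn.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()


def count_leaves(clf):
    # Hypothetical helper: in the fitted tree arrays, a leaf is a node
    # whose left-child index is -1.
    return int((clf.tree_.children_left == -1).sum())


# No constraint: the tree may split until every leaf is pure.
unconstrained = DecisionTreeClassifier(random_state=0)
unconstrained.fit(iris.data, iris.target)

# Require at least 5 training examples in every leaf (assumed threshold).
by_count = DecisionTreeClassifier(min_samples_leaf=5, random_state=0)
by_count.fit(iris.data, iris.target)

# Require every leaf to hold at least 5% of the total sample weight
# (assumed threshold; all weights are 1 here, so that is 7.5 examples).
by_weight = DecisionTreeClassifier(min_weight_fraction_leaf=0.05, random_state=0)
by_weight.fit(iris.data, iris.target)

for name, clf in [("unconstrained", unconstrained),
                  ("min_samples_leaf=5", by_count),
                  ("min_weight_fraction_leaf=0.05", by_weight)]:
    print(f"{name}: {count_leaves(clf)} leaves")
```

The weight-based variant is the one to reach for if you pass sample_weight
or class_weight, since it counts weighted examples rather than raw rows.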

On Sun, Aug 30, 2015 at 10:37 AM, Rex X <dnsr...@gmail.com> wrote:

> Hi Jacob,
>
> Is there anything we can do to get better generalized decision rules?
>
> For example, after fitting one tree, select the top (N-1) features by
> feature_importances_, and then fit again.
>
> Can this be helpful?
>
>
> Best,
> Rex
>
>
>
>
> On Sun, Aug 30, 2015 at 8:07 AM, Jacob Schreiber <jmschreibe...@gmail.com>
> wrote:
>
>> Tree pruning is currently not supported in sklearn.
>>
>> On Sun, Aug 30, 2015 at 6:44 AM, Rex X <dnsr...@gmail.com> wrote:
>>
>>> Tree pruning is very important for getting a better decision tree.
>>>
>>> One idea is to recursively remove the leaf node that causes the least
>>> harm to the decision tree.
>>>
>>> Any idea how to do this for the following sample case?
>>>
>>>
>>>> import pandas as pd
>>>> from sklearn.datasets import load_iris
>>>> from sklearn import tree
>>>> import sklearn
>>>>
>>>> iris = sklearn.datasets.load_iris()
>>>> clf = tree.DecisionTreeClassifier(
>>>>     class_weight={0: 0.30, 1: 0.3, 2: 0.4}, max_features="auto")
>>>> clf.fit(iris.data, iris.target)
>>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
