Andreas,

Can we do ensembles with *pruning* in scikit-learn?


Rex

On Mon, Aug 31, 2015 at 9:15 AM, Andreas Mueller <t3k...@gmail.com> wrote:

> You will not get results close to ensembles with pruning (unless your
> dataset is very specific).
> You can probably do your node filtering on ensembles, too.
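A minimal sketch of what node filtering on an ensemble could look like, assuming a random forest and a made-up filter (leaves with at least 10 training samples and at least 90% purity). The thresholds and variable names are illustrative only and do not come from the thread.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
forest = RandomForestClassifier(n_estimators=20, random_state=0)
forest.fit(iris.data, iris.target)

# Hypothetical filter thresholds, chosen only for illustration.
MIN_SAMPLES = 10
MIN_PURITY = 0.9

selected = []  # (tree index, node id, class label, n_samples, purity)
for tree_idx, est in enumerate(forest.estimators_):
    t = est.tree_
    is_leaf = t.children_left == -1      # leaves have no children
    class_dist = t.value[:, 0, :]        # per-node class distribution
    purity = class_dist.max(axis=1) / class_dist.sum(axis=1)
    keep = is_leaf & (t.n_node_samples >= MIN_SAMPLES) & (purity >= MIN_PURITY)
    for node in np.flatnonzero(keep):
        label = forest.classes_[class_dist[node].argmax()]
        selected.append((tree_idx, int(node), int(label),
                         int(t.n_node_samples[node]), float(purity[node])))

print(len(selected), "leaves passed the filter across", len(forest.estimators_), "trees")
```

Each tree in the forest is fit on a bootstrap sample, so the same region of feature space may show up as slightly different leaves in different trees. Some deduplication or voting across trees would probably be needed before porting anything as a rule.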
>
>
>
> On 08/30/2015 03:44 PM, Rex X wrote:
>
> Jacob, I agree with both of your points about the ensemble methods. They
> can give quite good prediction results.
>
> But the question is how to interpret these models. We want to extract
> specific decision rules, for example rules for declining fraudulent
> transactions. The motivation is to port these rules to other systems.
>
> Currently I am searching each node of one tree to filter the nodes that
> satisfy the conditions I want. I did obtain some interesting results. I
> wish I could obtain results close to those of an ensemble method.
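One way the node search described above might be written, as a sketch only: walk the fitted tree's `tree_` arrays, record the split conditions on the path to each leaf, and keep the leaves that pass a filter. The helper name `extract_leaf_rules` and the thresholds are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def extract_leaf_rules(clf, feature_names, min_samples=10, min_purity=0.9):
    """Return (conditions, class label, n_samples, purity) for qualifying leaves."""
    t = clf.tree_
    rules = []

    def walk(node, conditions):
        if t.children_left[node] == -1:  # -1 marks a leaf
            dist = t.value[node][0]      # class distribution at this leaf
            purity = dist.max() / dist.sum()
            n = t.n_node_samples[node]
            if n >= min_samples and purity >= min_purity:
                label = clf.classes_[dist.argmax()]
                rules.append((conditions, int(label), int(n), float(purity)))
            return
        name = feature_names[t.feature[node]]
        thr = t.threshold[node]
        # Samples with feature <= threshold go left, the rest go right.
        walk(t.children_left[node], conditions + ["%s <= %.3f" % (name, thr)])
        walk(t.children_right[node], conditions + ["%s > %.3f" % (name, thr)])

    walk(0, [])
    return rules


iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)
rules = extract_leaf_rules(clf, iris.feature_names)
for conditions, label, n, purity in rules:
    print(" AND ".join(conditions), "=> class", label, "(n=%d, purity=%.2f)" % (n, purity))
```

Each returned rule is a plain conjunction of threshold tests, which should be straightforward to translate into another system.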
>
> Any further tips?
>
>
> Best,
> Rex
>
>
>
> On Sun, Aug 30, 2015 at 11:45 AM, Jacob Schreiber
> <jmschreibe...@gmail.com> wrote:
>
>> Usually one would use an ensemble of trees to prevent overfitting. Two
>> common techniques are a Random Forest or Gradient Boosting Trees. Gradient
>> Boosting in particular has done well in competitions recently.
>>
>> While this may give you better generalization, it becomes difficult to
>> interpret these models. You can try to constrain your model by requiring a
>> higher number of examples, or higher weight of examples, be present at each
>> leaf. This will prevent the tree from splitting to accommodate a single
>> point, which may cause overfitting.
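As a rough illustration of the leaf constraints mentioned above: in scikit-learn these presumably correspond to the `min_samples_leaf` and `min_weight_fraction_leaf` parameters of `DecisionTreeClassifier`. The values 10 and 0.05 below are arbitrary, and `leaf_sizes` is a hypothetical helper.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

unconstrained = DecisionTreeClassifier(random_state=0).fit(X, y)
# Require at least 10 training samples in every leaf.
by_count = DecisionTreeClassifier(min_samples_leaf=10, random_state=0).fit(X, y)
# Require every leaf to hold at least 5% of the total sample weight.
by_weight = DecisionTreeClassifier(min_weight_fraction_leaf=0.05,
                                   random_state=0).fit(X, y)


def leaf_sizes(clf):
    """Number of training samples in each leaf of a fitted tree."""
    t = clf.tree_
    return t.n_node_samples[t.children_left == -1]


for name, model in [("unconstrained", unconstrained),
                    ("min_samples_leaf=10", by_count),
                    ("min_weight_fraction_leaf=0.05", by_weight)]:
    print(name, "nodes:", model.tree_.node_count,
          "smallest leaf:", leaf_sizes(model).min())
```

Larger leaves tend to give fewer and broader rules, which may also make them easier to port.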
>>
>> On Sun, Aug 30, 2015 at 10:37 AM, Rex X <dnsr...@gmail.com> wrote:
>>
>>> Hi Jacob,
>>>
>>> Is there anything we can do to get decision rules that generalize better?
>>>
>>> For example, after fitting one tree, select the top (N-1) features by
>>> feature_importances_, and then fit again.
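A small sketch of the refit idea above, assuming iris as the data and 5-fold cross-validation as the check; neither choice comes from the thread. It drops the least important feature according to one fitted tree and compares cross-validated accuracy before and after.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score  # sklearn.cross_validation in 2015-era releases
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Fit once and rank features from most to least important.
full = DecisionTreeClassifier(random_state=0).fit(X, y)
order = np.argsort(full.feature_importances_)[::-1]

# Keep the top N-1 features (drop the least important one).
keep = np.sort(order[:-1])
X_reduced = X[:, keep]

score_full = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5).mean()
score_reduced = cross_val_score(DecisionTreeClassifier(random_state=0),
                                X_reduced, y, cv=5).mean()

print("kept features:", [iris.feature_names[i] for i in keep])
print("CV accuracy, all features: %.3f" % score_full)
print("CV accuracy, top N-1 features: %.3f" % score_reduced)
```

Importances from a single tree can change noticeably between fits, so whether this helps is best judged on held-out data rather than assumed.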
>>>
>>> Can this be helpful?
>>>
>>>
>>> Best,
>>> Rex
>>>
>>>
>>>
>>>
>>> On Sun, Aug 30, 2015 at 8:07 AM, Jacob Schreiber
>>> <jmschreibe...@gmail.com> wrote:
>>>
>>>> Tree pruning is currently not supported in sklearn.
>>>>
>>>> On Sun, Aug 30, 2015 at 6:44 AM, Rex X <dnsr...@gmail.com> wrote:
>>>>
>>>>> Tree pruning is very important for getting a better decision tree.
>>>>>
>>>>> One idea is to recursively remove the leaf node whose removal hurts
>>>>> the decision tree the least.
>>>>>
>>>>> Any idea how to do this for the following sample case?
>>>>>
>>>>>
>>>>>> from sklearn.datasets import load_iris
>>>>>> from sklearn import tree
>>>>>>
>>>>>> # Load the iris data and fit a single class-weighted decision tree.
>>>>>> iris = load_iris()
>>>>>> # "sqrt" is what max_features="auto" meant for classifiers.
>>>>>> clf = tree.DecisionTreeClassifier(
>>>>>>     class_weight={0: 0.3, 1: 0.3, 2: 0.4}, max_features="sqrt")
>>>>>> clf.fit(iris.data, iris.target)
>>>>>>
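Pruning was not available in scikit-learn when this thread was written, as the reply above says. Releases from 0.22 onward added minimal cost-complexity pruning through the `ccp_alpha` parameter, which is close in spirit to repeatedly removing the part of the tree that hurts least. A hedged sketch on the same sample case, assuming such a release is installed (`max_features` is left out and `random_state` is fixed so that every fit grows the same tree):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target
weights = {0: 0.3, 1: 0.3, 2: 0.4}

# Compute the sequence of effective alphas at which subtrees get pruned.
base = DecisionTreeClassifier(class_weight=weights, random_state=0)
path = base.cost_complexity_pruning_path(X, y)

node_counts = []
for alpha in path.ccp_alphas:
    alpha = max(float(alpha), 0.0)  # guard against tiny negative rounding errors
    pruned = DecisionTreeClassifier(class_weight=weights, random_state=0,
                                    ccp_alpha=alpha).fit(X, y)
    node_counts.append(pruned.tree_.node_count)

for alpha, n_nodes in zip(path.ccp_alphas, node_counts):
    print("ccp_alpha=%.5f -> %d nodes" % (alpha, n_nodes))
```

Larger values of `ccp_alpha` prune more aggressively; the value to use would normally be picked by cross-validation.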
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>>
>>>>> _______________________________________________
>>>>> Scikit-learn-general mailing list
>>>>> Scikit-learn-general@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general