Maybe some of the tree huggers can say something about that ;) Below is
my best guess.
I am surprised to see that the docs say no regularization is usually best.
I would not use such large upper bounds as you did, and I would never
search the full range. Rather, take coarse steps to get only a few
candidates, and possibly refine later.
You can do max_depth=None and see how large fully grown trees are and
start from there.
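
A rough sketch of that idea, in case it helps. The synthetic data, the
number of trees, and the choice of five candidates are my assumptions for
illustration, not recommendations. It grows an unrestricted forest, reads
the depth each tree reached, and builds a short max_depth grid from that:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data stands in for the real problem (assumption for illustration).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# max_depth=None lets every tree grow until its leaves are pure.
forest = RandomForestClassifier(n_estimators=50, max_depth=None, random_state=0)
forest.fit(X, y)

# Depth actually reached by each fully grown tree.
depths = np.array([est.tree_.max_depth for est in forest.estimators_])
full_depth = int(depths.max())

# A few coarse candidates up to the observed depth, plus None (no limit).
candidates = sorted(set(np.linspace(1, full_depth, num=5, dtype=int).tolist()))
max_depth_grid = candidates + [None]
print(int(depths.min()), full_depth, max_depth_grid)
```

The resulting list can go straight into a grid search as the max_depth
entry, and can be refined around the best value afterwards.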
I think the CV method should not really impact the parameters much, as
it changes n_samples by less than a factor of 2.
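
On coupling the range to the size of the internal training set (the
question quoted below): one possible way, assuming a scikit-learn version
in which min_samples_leaf accepts a float, is to give it as a fraction.
The estimator then converts the fraction to a count using whatever
training fold it is fitted on, so the grid does not need to know the CV
scheme. The data and the candidate values here are made up:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data, for illustration only.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Floats are read as a fraction of the samples seen at fit time,
# the integer 1 as an absolute count (i.e. no leaf-size regularization).
param_grid = {"min_samples_leaf": [1, 0.01, 0.05, 0.1]}

search = GridSearchCV(
    RandomForestClassifier(n_estimators=30, random_state=0),
    param_grid,
    cv=3,
)
search.fit(X, y)
print(search.best_params_)
```

As far as I know max_depth has no fractional form, so this only covers
the leaf-size (and similarly min_samples_split) side of the question.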
On 09/27/2014 02:56 PM, Satrajit Ghosh wrote:
thanks andy.
are there any general heuristics for these parameters - given that
their ranges are over the samples?
max_depth = range(1, nsamples)
or
min_samples_leaf = range(1, nsamples)
also related question: given that nsamples would actually depend on
the cv method of the GridSearchCV, is there a way to specify possible
ranges without trying to calculate what the CV method would do?
i.e is there a way to couple the parameter specification to when the
grid search runs the internal CV such that it will limit the parameter
based on the size of the internal training set?
cheers,
satra
On Sat, Sep 27, 2014 at 2:25 AM, Andy <t3k...@gmail.com> wrote:
Hi Satra.
You should set "n_estimators" as high as you can afford time- and
memory-wise, and then cross-validate over (at least) one of the
regularization parameters,
for example over max_depth or min_samples_leaf. You can also
search over max_features.
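
A minimal sketch of that setup, with assumed values throughout (the
dataset, n_estimators=50, and the grid entries are placeholders, and the
import path is that of a recent scikit-learn):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data, for illustration only.
X, y = make_classification(n_samples=300, n_features=16, random_state=0)

# n_estimators is fixed up front (as large as the budget allows)
# and deliberately left out of the grid.
forest = RandomForestClassifier(n_estimators=50, random_state=0)

param_grid = {
    "max_depth": [3, 6, 12, None],        # regularization parameter
    "max_features": ["sqrt", 0.5, None],  # features tried per split
}

search = GridSearchCV(forest, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Swapping RandomForestClassifier for ExtraTreesClassifier should work with
the same grid, since both accept these parameters.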
Cheers,
Andy
On 09/26/2014 10:24 PM, Satrajit Ghosh wrote:
hi folks,
what are some useful ranges of parameters to throw into a grid
search? and are there specific differences between random forests
and extra trees? i understand one could try different impurity
measures for classification, but any suggestions on sensitivity
of other parameters would be nice.
cheers,
satra
On Thu, Sep 25, 2014 at 8:48 AM, Andy <t3k...@gmail.com> wrote:
On 09/23/2014 11:50 PM, Pagliari, Roberto wrote:
I’m a bit confused as to why GridSearchCV is not needed with
random forests. I understand that with RF, each tree will
only get to see a partial representation of the data.
Why do you say GridSearchCV is not needed?
I think it should always be used, only not for setting
n_estimators.
You can use the oob estimates, but actually I don't think we
have an automated way to use these to adjust parameters.
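
Doing it by hand is short, though. A sketch under assumed settings (the
data, the candidate max_features values, and n_estimators=100 are
placeholders): fit one forest per candidate with oob_score=True and
compare the out-of-bag accuracy instead of running cross-validation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, for illustration only.
X, y = make_classification(n_samples=400, n_features=20, random_state=0)

oob_scores = {}
for max_features in ["sqrt", 0.5, None]:
    forest = RandomForestClassifier(
        n_estimators=100,
        max_features=max_features,
        bootstrap=True,   # OOB estimates require bootstrap sampling
        oob_score=True,
        random_state=0,
    )
    forest.fit(X, y)
    # Accuracy on the samples each tree did not see during training.
    oob_scores[max_features] = forest.oob_score_

best = max(oob_scores, key=oob_scores.get)
print(oob_scores, best)
```

This costs one fit per candidate rather than one per candidate and fold,
but the selected value is only as trustworthy as the OOB estimate itself.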
------------------------------------------------------------------------------
Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI
DSS Reports
Are you Audit-Ready for PCI DSS 3.0 Compliance? Download
White paper
Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog
Analyzer
http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
<mailto:Scikit-learn-general@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general