Re: [Scikit-learn-general] Hyperparameter tuning for Random Forest and Gradient boosting trees

Sebastian Raschka Fri, 29 Jan 2016 13:34:36 -0800

> Thanks for your reply. So this means I should start with e.g. "max_depth": 
> [1,4,10,15], "min_samples_leaf": [1,10,20,30], and if the best result is at 
> max_depth=10 and min_samples_leaf=10, then I should explore values close to 
> these. Am I right?



Yes, this would work. However, keep in mind that you may miss a "good" 
combination this way. And if n_estimators is large, tuning a random forest 
can be "relatively" expensive. Plus, you typically don't want or need to 
prune the trees here; that's basically the whole idea behind RF.
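
The coarse-then-fine search discussed here could be sketched roughly as 
follows. This is only an illustration, not a recommendation: the synthetic 
data from make_regression stands in for the real data set, the helper name 
run_grid_search and the step sizes of the finer grid are made up for this 
example, and the import path assumes a recent scikit-learn where 
GridSearchCV lives in sklearn.model_selection.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the real data (assumption): 6 inputs, 1 output.
X, y = make_regression(n_samples=400, n_features=6, noise=10.0, random_state=0)


def run_grid_search(param_grid):
    """Hypothetical helper: exhaustive search over param_grid with 3-fold CV."""
    forest = RandomForestRegressor(n_estimators=30, random_state=0)
    search = GridSearchCV(forest, param_grid, cv=3,
                          scoring="neg_mean_squared_error")
    search.fit(X, y)
    return search


# Step 1: coarse grid with widely spaced values (the grid from the question).
coarse = run_grid_search({"max_depth": [1, 4, 10, 15],
                          "min_samples_leaf": [1, 10, 20, 30]})
best_depth = coarse.best_params_["max_depth"]
best_leaf = coarse.best_params_["min_samples_leaf"]

# Step 2: finer grid around the coarse optimum (step sizes chosen arbitrarily).
fine_grid = {
    "max_depth": sorted({max(1, best_depth - 2), best_depth, best_depth + 2}),
    "min_samples_leaf": sorted({max(1, best_leaf - 3), best_leaf, best_leaf + 3}),
}
fine = run_grid_search(fine_grid)

print("coarse best:", coarse.best_params_)
print("fine best:  ", fine.best_params_)
```

Because the finer grid still contains the coarse optimum, the second search 
cannot score worse than the first under the same CV splits and random seed, 
but it can still miss a good region that the coarse grid skipped over.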

> Shall I use a small number of estimators while conducting this parametric 
> study, and then a higher value when fitting my model?

Here, too, the parameters that you tuned may only be good for a model with 
that specific number of estimators. In general, I would maybe advise against 
tuning the hyperparameters at all and instead use the computational time to 
increase n_estimators.
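
One way to see what a larger n_estimators buys, without any grid search, is 
to grow the forest step by step and watch the out-of-bag (OOB) score. The 
sketch below is only an illustration under stated assumptions: the data is 
synthetic, the list of forest sizes is arbitrary, and the trees are left at 
their default (unpruned) settings. It relies on the warm_start and oob_score 
options of RandomForestRegressor.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the real data (assumption): 6 inputs, 1 output.
X, y = make_regression(n_samples=500, n_features=6, noise=10.0, random_state=0)

# warm_start=True keeps the trees already grown and only adds new ones on
# each fit; oob_score=True scores every sample using only the trees that
# did not see it during bootstrapping.
forest = RandomForestRegressor(warm_start=True, oob_score=True, random_state=0)

oob_r2 = {}
for n_trees in [25, 50, 100, 200]:  # arbitrary sizes for illustration
    forest.set_params(n_estimators=n_trees)
    forest.fit(X, y)
    oob_r2[n_trees] = forest.oob_score_  # R^2 estimated on out-of-bag samples
    print(n_trees, "trees: OOB R^2 =", round(forest.oob_score_, 4))
```

If the OOB score has flattened out, adding more trees mostly costs time; if 
it is still rising, more trees may be a better use of the compute budget 
than a search over max_depth and min_samples_leaf.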

> On Jan 29, 2016, at 4:18 PM, muhammad waseem <m.waseem.ah...@gmail.com> wrote:
> 
> Hi Sebastian,
> Thanks for your reply. So this means I should start with e.g. "max_depth": 
> [1,4,10,15], "min_samples_leaf": [1,10,20,30], and if the best result is at 
> max_depth=10 and min_samples_leaf=10, then I should explore values close to 
> these. Am I right?
> 
> Shall I use a small number of estimators while conducting this parametric 
> study, and then a higher value when fitting my model? Will this change the 
> other parameters, meaning does n_estimators depend on the other 
> parameters? 
> 
> Also, should I use early stopping while doing GridSearchCV?
> 
> Thanks again.
> Regards
> Waseem
> 
> On Fri, Jan 29, 2016 at 6:57 PM, Sebastian Raschka <se.rasc...@gmail.com 
> <mailto:se.rasc...@gmail.com>> wrote:
> Hi, Waseem,
> with a fine-enough grid, the GridSearchCV would be more "thorough" than the 
> randomized search. However, the problem is essentially some sort of 
> combinatorial explosion. Typically, I start with a "rougher" grid (the 
> different parameters are more "spaced out" relative to each other). After 
> that, I use a "finer" grid around the parameters that came up in the previous 
> search.
> However, it all comes down to computational time vs. being thorough. Or in 
> other words, grid search is an exhaustive search whereas randomized search is 
> a computationally "more efficient" approach.
> 
> 
> > On Jan 29, 2016, at 11:45 AM, muhammad waseem <m.waseem.ah...@gmail.com 
> > <mailto:m.waseem.ah...@gmail.com>> wrote:
> >
> > Hello All,
> > I am new to scikit-learn and ML, and I am trying to train my model using 
> > random forest and gradient boosting tree regressors. I was wondering what 
> > the best way to do hyperparameter tuning is: shall I use GridSearchCV or 
> > RandomizedSearchCV? I have read that the performance of RandomizedSearchCV 
> > is almost the same as that of GridSearchCV (most of the time). If I go 
> > with RandomizedSearchCV, what should the range of values be for the 
> > different parameters? How will I know that the range I am selecting is 
> > the correct one?
> >
> > Also, what about the number of estimators? In GridSearchCV or 
> > RandomizedSearchCV, shall I start with a low value and then, after 
> > selecting the other parameters, choose a large number of estimators for 
> > fitting? Am I right?
> >
> > Shall I always use early stopping, no matter whether I use grid search or 
> > randomized search?
> >
> > P.S. Training data: number of inputs = 6
> >                     number of outputs = 1
> >                     number of samples (rows) = 8526
> >      Testing data:  number of samples (rows) = 1416
> >
> > Thanks
> > Kindest Regards
> > Waseem
> > ------------------------------------------------------------------------------
> > Site24x7 APM Insight: Get Deep Visibility into Application Performance
> > APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
> > Monitor end-to-end web transactions and take corrective actions now
> > Troubleshoot faster and improve end-user experience. Signup Now!
> > http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________
> >  
> > <http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________>
> > Scikit-learn-general mailing list
> > Scikit-learn-general@lists.sourceforge.net 
> > <mailto:Scikit-learn-general@lists.sourceforge.net>
> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general 
> > <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general>

