Re: [Scikit-learn-general] Hyperparameter tuning for Random Forest and Gradient boosting trees

muhammad waseem Fri, 29 Jan 2016 13:46:24 -0800

I meant: how do I make sure that I don't miss the "good" combination that you
mentioned?
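
For illustration, one way to lower the risk of skipping over a good
combination is to sample from ranges instead of a fixed grid, so that values
between the grid points can also be tried. This is only a minimal sketch, not
a recommendation from the thread. It assumes a recent scikit-learn (where
RandomizedSearchCV lives in sklearn.model_selection) and uses synthetic data
in place of the real data set; the ranges, n_iter, cv and n_estimators values
are arbitrary choices.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the real data (6 inputs, 1 output); sizes are arbitrary.
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)

# Sample integers from ranges rather than a fixed grid, so that values
# between the grid points (e.g. max_depth=7) can be drawn as well.
param_distributions = {
    "max_depth": randint(1, 16),         # integers 1..15 (upper bound is exclusive)
    "min_samples_leaf": randint(1, 31),  # integers 1..30
}

search = RandomizedSearchCV(
    RandomForestRegressor(n_estimators=30, random_state=0),
    param_distributions=param_distributions,
    n_iter=10,        # number of sampled combinations (the search budget)
    cv=3,             # 3-fold cross-validation
    random_state=0,   # makes the sampled combinations reproducible
)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)  # mean cross-validated R^2 of the best combination
```

A larger n_iter covers the ranges more densely at a proportionally higher
cost, which is the same time-versus-thoroughness trade-off discussed below.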


Also, for the second point: maybe I should take the computational time into
account and then make sure that I use a large enough number of estimators in
the parametric study?
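
As a rough way to judge what "enough" estimators means, the forest can be
grown in steps while watching the out-of-bag (OOB) score level off. The
sketch below is an illustration under assumptions: it relies on the
warm_start and oob_score options of RandomForestRegressor, uses synthetic
data in place of the real data set, and the step sizes are arbitrary.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the real data (6 inputs, 1 output); sizes are arbitrary.
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)

# warm_start=True keeps the already fitted trees and only adds new ones on
# each call to fit; oob_score=True computes R^2 on the out-of-bag samples.
forest = RandomForestRegressor(warm_start=True, oob_score=True, random_state=0)

oob_scores = {}
for n in (25, 50, 100, 200):
    forest.set_params(n_estimators=n)  # total number of trees after this fit
    forest.fit(X, y)
    oob_scores[n] = forest.oob_score_

for n, score in oob_scores.items():
    print(n, round(score, 3))
```

Once the OOB score stops changing noticeably between steps, adding more
trees mostly costs time.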

On Fri, Jan 29, 2016 at 9:38 PM, muhammad waseem <m.waseem.ah...@gmail.com>
wrote:

>
>> Thanks for your reply. So this means I should start with e.g. "max_depth":
>> [1,4,10,15], "min_samples_leaf": [1,10,20,30], and if the best result is at
>> max_depth=10 and min_samples_leaf=10, then I should explore values close to
>> these. Am I right?
>>
>>
>> Yes, this would work. However, keep in mind that you may be missing a
>> "good" combination this way. And if you have a large n_estimators,
>> tuning a random forest can be "relatively" expensive. Plus, you typically
>> don't want or need to prune the trees here; that's basically the whole
>> idea behind RF.
>>
>
> So how do I make sure that I don't miss the "good" combination?
>
>>
>> Shall I use a small number of estimators while conducting this parametric
>> study, and then use a higher value when fitting my model?
>>
>>
>> Also here, the parameters that you tuned may only be good for the model
>> based on the specific number of estimators. In general, I would maybe
>> advise against tuning the hyperparameters at all and instead use the
>> computational time to increase n_estimators.
>>
>
> Maybe I should take the computational time into account and then make sure
> that I use a large enough number of estimators in the parametric study?
>
>>
>> On Jan 29, 2016, at 4:18 PM, muhammad waseem <m.waseem.ah...@gmail.com>
>> wrote:
>>
>> Hi Sebastian,
>> Thanks for your reply. So this means I should start with e.g. "max_depth":
>> [1,4,10,15], "min_samples_leaf": [1,10,20,30], and if the best result is at
>> max_depth=10 and min_samples_leaf=10, then I should explore values close to
>> these. Am I right?
>>
>> Shall I use a small number of estimators while conducting this parametric
>> study, and then use a higher value when fitting my model? Would this change
>> the other parameters, i.e. does n_estimators depend on the other
>> parameters?
>>
>> Also, should I use early stopping while doing GridSearchCV?
>>
>> Thanks again.
>> Regards
>> Waseem
>>
>> On Fri, Jan 29, 2016 at 6:57 PM, Sebastian Raschka <se.rasc...@gmail.com>
>> wrote:
>>
>>> Hi, Waseem,
>>> with a fine enough grid, GridSearchCV would be more "thorough" than
>>> the randomized search. However, the problem is essentially a
>>> combinatorial explosion: the number of candidates is the product of the
>>> number of values tried per parameter. Typically, I start with a "rougher"
>>> grid (the parameter values are more "spaced out" relative to each other).
>>> After that, I use a "finer" grid around the parameters that came up in the
>>> previous search.
>>> However, it all comes down to computational time vs. being thorough. Or
>>> in other words, grid search is an exhaustive search whereas randomized
>>> search is a computationally "more efficient" approach.
>>>
>>>
>>> > On Jan 29, 2016, at 11:45 AM, muhammad waseem <m.waseem.ah...@gmail.com> wrote:
>>> >
>>> > Hello All,
>>> > I am new to scikit-learn and ML, and I am trying to train my model using
>>> > random forest and gradient boosting tree regressors. I was wondering what
>>> > the best way to do hyperparameter tuning is: shall I use GridSearchCV or
>>> > RandomizedSearchCV? I have read that the performance of RandomizedSearchCV
>>> > is almost the same as that of GridSearchCV (most of the time). If I go with
>>> > RandomizedSearchCV, what should the range of values be for the different
>>> > parameters? How will I know that the range I am selecting is the correct
>>> > one?
>>> >
>>> > Also, what about the number of estimators? In GridSearchCV or
>>> > RandomizedSearchCV, shall I start with a low value and then, after selecting
>>> > the other parameters, choose a large number of estimators for the final
>>> > fit? Am I right?
>>> >
>>> > Shall I always use early stopping, no matter whether I use grid search or
>>> > randomized search?
>>> >
>>> > P.S: Training data: Number of inputs = 6
>>> >                     Number of outputs = 1
>>> >                     Number of samples (rows) = 8526
>>> >      Testing data:  Number of samples (rows) = 1416
>>> >
>>> > Thanks
>>> > Kindest Regards
>>> > Waseem
>>> >
>>> > ------------------------------------------------------------------------------
>>> > Site24x7 APM Insight: Get Deep Visibility into Application Performance
>>> > APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
>>> > Monitor end-to-end web transactions and take corrective actions now
>>> > Troubleshoot faster and improve end-user experience. Signup Now!
>>> > http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140
>>> > _______________________________________________
>>> > Scikit-learn-general mailing list
>>> > Scikit-learn-general@lists.sourceforge.net
>>> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>
>>
>
>
> --
> Dr Muhammad Waseem Ahmad
> Research Associate,
> BRE Center for Sustainable Construction,
> School of Engineering,
> Cardiff University,
> Cardiff, UK.
>

