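Question 1: No, it does not do an internal cross-validation to prevent
overfitting. Each tree is simply fit on a bootstrap sample of the training
data, so there is nowhere to pass your groupings to the forest itself. Keep
your custom cross-validation loop for the cluster-aware evaluation.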
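
In case it helps, here is a minimal sketch of a cluster-aware evaluation done outside the forest. It assumes a scikit-learn version that provides GroupKFold in sklearn.model_selection; the data is synthetic and the `groups` array of cluster ids is a hypothetical stand-in for your own clusters.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.RandomState(0)

# Synthetic stand-in data: 200 samples, 5 features, binary labels.
X = rng.normal(size=(200, 5))
y = rng.randint(0, 2, size=200)

# Hypothetical cluster ids: 20 clusters of 10 samples each.
groups = np.repeat(np.arange(20), 10)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
cv = GroupKFold(n_splits=5)

# Check that no cluster is split between training and test folds.
for train_idx, test_idx in cv.split(X, y, groups):
    assert set(groups[train_idx]).isdisjoint(groups[test_idx])

# One accuracy score per fold; each fold holds out whole clusters.
scores = cross_val_score(clf, X, y, groups=groups, cv=cv)
print(scores.mean())
```

The forest never sees the groups; only the splitter does, which is why the grouping lives in the cross-validation loop rather than in fit.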
Question 2: Yes, you can put a higher weight on your positive class. Look
at the class_weight parameter in the documentation here:
http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
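
A minimal sketch of what that could look like, on synthetic data. The weight of 5 for the positive class is an arbitrary illustration, not a recommendation; you would tune it for your problem. The weights scale each sample's contribution to the impurity computed at every split.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.RandomState(0)

# Synthetic stand-in data with roughly 20% positives.
X = rng.normal(size=(300, 4))
y = (rng.uniform(size=300) < 0.2).astype(int)

# Explicit weights: the positive class (label 1) counts 5 times as much
# as the negative class (label 0). The value 5 is an arbitrary choice.
clf = RandomForestClassifier(
    n_estimators=50, class_weight={0: 1, 1: 5}, random_state=0
)
clf.fit(X, y)

# Alternative: let scikit-learn weight classes inversely to their frequency.
clf_balanced = RandomForestClassifier(
    n_estimators=50, class_weight="balanced", random_state=0
)
clf_balanced.fit(X, y)

pred = clf.predict(X)
```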

On Tue, Mar 1, 2016 at 3:11 PM, Mamun Rashid <mamunbabu2...@gmail.com>
wrote:

> Hi All,
>
> This is my understanding of the Random Forest algorithm:
> Random Forest creates a number of trees, each using a randomly selected
> subset of samples and features. At each node of a tree it uses the Gini
> criterion to find the best feature-threshold pair (various thresholds are
> tested for each feature), i.e. the pair that gives the best separation
> between the positive and the negative class.
>
> Question 1:
> I have a two-class classification problem where the positive labels reside
> in clusters. A traditional cross-validation approach is not aware of this
> and splits data points from the same cluster into the training and test
> sets, giving rise to overly strong classification performance. I wrote a
> custom cross-validation loop to address this issue. However, the
> bootstrapping method inside the Random Forest algorithm randomly selects
> samples and features and controls for overfitting.
>
> When it applies the fit method to the randomly selected samples, does it do
> an internal cross-validation to prevent overfitting? I did not find this
> in the GitHub code.
> If yes, can I specify my groupings to Random Forest?
>
> Question 2:
> Gini impurity at each node tries to find the best separation between the
> two classes. I care more about obtaining a cleaner separation for my
> positive class. Is there any way to give more importance to one class
> during the partitioning?
>
> Thanks in advance.
>
> Mamun
>
> ------------------------------------------------------------------------------
> Site24x7 APM Insight: Get Deep Visibility into Application Performance
> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
> Monitor end-to-end web transactions and take corrective actions now
> Troubleshoot faster and improve end-user experience. Signup Now!
> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>