[ 
https://issues.apache.org/jira/browse/SPARK-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237029#comment-14237029
 ] 

Kai Sasaki edited comment on SPARK-4607 at 12/7/14 2:34 AM:
------------------------------------------------------------

[~josephkb] I think each tree in the iterations of GradientBoostedTrees is 
always trained on all of the training data. Is there any case where we need 
subsampling when building the RandomForest? The current GradientBoostedTrees 
code uses a RandomForest without subsampling. 


was (Author: lewuathe):
[~josephkb] I think each trees in iterations of GradientBoostedTrees is always 
trained all training data. Is there any case when we have to do subsampling 
with making RandomForest? Current GradientBoostedTrees code uses non 
subsampling RandomForest. 

> Add random seed to GradientBoostedTrees
> ---------------------------------------
>
>                 Key: SPARK-4607
>                 URL: https://issues.apache.org/jira/browse/SPARK-4607
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.2.0
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>
> Gradient Boosted Trees does not take a random seed, but it uses randomness if 
> the subsampling rate is < 1.  It should take a random seed parameter.
> This update will also help to make unit tests more stable by allowing 
> determinism (using a small set of fixed random seeds).
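
The behaviour the issue describes can be illustrated with a minimal sketch in plain Python. This is not MLlib code: the function names `subsample` and `boosting_subsamples`, the Bernoulli sampling scheme, and the `seed + i` per-iteration seeding are all assumptions made for illustration. It shows that a fixed seed makes the subsamples reproducible, and that with a subsampling rate of 1.0 no randomness is involved, so the seed has no effect.

```python
import random


def subsample(rows, rate, seed):
    """Keep each row independently with probability `rate`.

    Hypothetical helper: with rate >= 1.0 every row is kept and
    the seed is never consulted.
    """
    if rate >= 1.0:
        return list(rows)
    rng = random.Random(seed)  # private generator, so runs are repeatable
    return [row for row in rows if rng.random() < rate]


def boosting_subsamples(rows, num_iterations, rate, seed):
    """Return the subsample drawn for each boosting iteration.

    Assumption: each iteration derives its own seed as `seed + i`,
    so iterations differ from one another but the whole run is
    determined by the single base seed.
    """
    return [subsample(rows, rate, seed + i) for i in range(num_iterations)]


rows = list(range(20))

# Same base seed and rate < 1: both runs draw identical subsamples.
run_a = boosting_subsamples(rows, num_iterations=3, rate=0.5, seed=42)
run_b = boosting_subsamples(rows, num_iterations=3, rate=0.5, seed=42)
print(run_a == run_b)

# Rate 1.0: every iteration sees all rows, whatever the seed.
full = boosting_subsamples(rows, num_iterations=3, rate=1.0, seed=7)
print(all(sample == rows for sample in full))
```

A unit test written against a seeded sampler like this can compare results across runs exactly, which is the stability benefit the issue mentions.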



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
