[ https://issues.apache.org/jira/browse/SPARK-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344037#comment-14344037 ]
Joseph K. Bradley commented on SPARK-1548:
------------------------------------------

I think the idea is to train a different tree on each worker, rather than using the distributed tree learning algorithm. That could be generalized to any algorithm, but it's a bit different from the bootstrapping JIRA, which seems to be about a wrapper for algorithms.

> Add Partial Random Forest algorithm to MLlib
> --------------------------------------------
>
>                 Key: SPARK-1548
>                 URL: https://issues.apache.org/jira/browse/SPARK-1548
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>    Affects Versions: 1.0.0
>            Reporter: Manish Amde
>            Assignee: Frank Dai
>
> This task involves creating an alternate approximate random forest
> implementation where each tree is constructed per partition.
> The task involves:
> - Justifying with theory and experimental results why this algorithm is a
> good choice.
> - Comparing the various tradeoffs and finalizing the algorithm before
> implementation
> - Code implementation
> - Unit tests
> - Functional tests
> - Performance tests
> - Documentation
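
To make the "one tree per partition" idea concrete, here is a minimal sketch in plain Python. It is not MLlib code and not the design proposed in this issue. The names `train_stump`, `partial_forest`, and `predict` are hypothetical, a one-split decision stump stands in for a full tree, and the strided slicing `rows[i::num_partitions]` stands in for however Spark happens to have partitioned the data. Each "worker" sees only its own partition, and the trees are combined by majority vote.

```python
from collections import Counter


def train_stump(rows):
    """Fit a one-split decision stump on (features, label) rows from ONE partition."""
    best = None  # (errors, feature_index, threshold, left_label, right_label)
    n_features = len(rows[0][0])
    for f in range(n_features):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue  # split does not separate anything
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:
        # No usable split: always predict the partition's majority label.
        majority = Counter(y for _, y in rows).most_common(1)[0][0]
        return lambda x: majority
    _, f, threshold, left_label, right_label = best
    return lambda x: left_label if x[f] <= threshold else right_label


def partial_forest(rows, num_partitions):
    """Train one model per partition; no partition sees another's data."""
    # Assumption: strided slices play the role of Spark partitions.
    partitions = [rows[i::num_partitions] for i in range(num_partitions)]
    return [train_stump(p) for p in partitions if p]


def predict(forest, x):
    """Combine the per-partition models by majority vote."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Toy data: a single feature in [0, 1), label 1 when the feature is >= 0.5.
data = [((i / 100.0,), int(i >= 50)) for i in range(100)]
forest = partial_forest(data, num_partitions=4)
print(len(forest), predict(forest, (0.1,)), predict(forest, (0.9,)))
```

Nothing in `train_stump` is specific to trees from the caller's point of view, so the same per-partition pattern would work with any learner. A bootstrapping wrapper would instead resample the full dataset for each model, which is the difference the comment points at.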